Executive function measurement in urban schools: Exploring links between performance‐based metrics and teacher ratings

INTRODUCTION

There is a growing body of literature that investigates executive functions of children in ethnic minority families from high‐poverty communities (e.g., Choi et al., 2016; Lawson et al., 2013, 2018; Nesbitt et al., 2013). However, this population is often excluded from psychometric and cognitive development research. While some attempts have been made in recent years to norm standardized executive function tasks using nationally representative samples (e.g., Flores et al., 2017; Weintraub et al., 2013; Zelazo et al., 2013), much of the existing psychometric research overrepresents children from White, middle‐income, or affluent families. Far less is known about whether these findings can be generalized to children from ethnic minority families in low‐income communities. A critical part of understanding current views about the universality of executive function development must involve this important group of children (Miller‐Cotto et al., 2021). For example, recent data indicate that approximately 14% of children in the United States live in poverty, nearly 50% are from ethnic minority families, and approximately 71% of children living in poverty are from ethnic minority families (Children's Defense Fund, 2021). However, reaching ethnic minority families from low‐income communities for research is challenging. One way to promote their participation in developmental research is to run studies in collaboration with their local schools, providing excellent opportunities to include teacher ratings of executive function‐related behaviors in the classroom.
However, relatively little is known about how well teacher rating scales capture executive functions in low‐income, ethnic minority communities compared to those of their White, more affluent peers.

Executive functioning is often thought to involve higher‐order mental processes needed to facilitate goal‐oriented behavior through conscious control of thoughts and actions (Diamond, 2001; Miyake et al., 2000). However, authors often conceptualize these skills differently based on their own theoretical backgrounds, and there is not yet a full consensus on a single definition (Jacob & Parkinson, 2015; Nilsen et al., 2017). Despite the widespread challenges in defining executive functions, commonly cited components include working memory, inhibition, and shifting (Miyake et al., 2000; Riggs et al., 2006). Each component contributes to purposeful, self‐directed behavior, making each important for cognitive development.

Performance‐based tasks

An increasing number of studies examine the development of executive functions using performance‐based tasks. Generally, these studies find that skills increase from early childhood into young adulthood (e.g., Lee et al., 2013). Individuals with better performance‐based scores (compared to their age‐matched peers) tend to do better on various academic metrics. For example, they have more successful transitions into school (e.g., Müller et al., 2017), better academic achievement in elementary (e.g., Monette et al., 2011) and secondary school (e.g., Samuels et al., 2016), better mental health (e.g., Diamond, 2013), and better long‐term employment potential (e.g., Bailey, 2007). Children entering school with better cognitive skills seem to be in a better position to grow academically and progress, while children with poor cognitive skill development seem to fall further behind over time (e.g., Abenavoli et al., 2015).
Given the key role of executive functions in child development, it is important to understand the nature and course of their development across a wide range of children.

Variations across different groups of children help to clarify the course of executive function development. Children facing great adversity, such as developmental disabilities (e.g., Hosenbocus & Chahal, 2012), extreme levels of stress (e.g., Blair & Raver, 2016), or high levels of poverty (e.g., Lawson et al., 2013), tend to have lower skills. More recent large‐scale longitudinal studies afford a better look at the development of these skills across a wider range of children (e.g., the NICHD Study of Early Child Care and Youth Development in the United States and the Millennium Cohort Study in the United Kingdom), but performance‐based task data collection remains time‐ and staff‐intensive.

Behavior rating scales

Given the importance of better understanding the development of executive functions and the resource‐heavy constraints of performance‐based tasks, there have been attempts to develop alternatives. Commonly called "everyday measures," these ecologically valid metrics have been developed to assess executive function abilities through day‐to‐day behaviors in real‐world contexts (Kouklari et al., 2018). One useful approach involves metrics that can be completed by parents, caregivers, or teachers rather than trained researchers or clinical/educational psychologists. However, developing such metrics made it clear that although performance‐based tasks have strong links to everyday outcomes, they are not necessarily sensitive to day‐to‐day executive functioning, meaning that they tend to have low ecological validity (Burgess et al., 2006). Behavior rating scales became an effective way to gauge everyday executive functioning for younger populations.
They can provide a unique perspective on elements that performance‐based tasks may not measure (e.g., Isquith et al., 2004; Soto et al., 2020). In addition, they can be completed by parents or teachers, who have more extensive experience with the child.

There are several behavior rating scales that tap directly into everyday executive functioning (e.g., Behavior Rating Inventory of Executive Functions, Gioia et al., 2000; Child Executive Functioning Inventory, Thorell & Nyberg, 2008; Behavior Assessment System for Children, Reynolds & Kamphaus, 1992, 2004, 2015). The present study focuses on the Behavior Assessment System for Children (BASC; Reynolds & Kamphaus, 1992, 2004, 2015). The BASC has been used in a variety of developmental and clinical settings for a few decades, but only recently has a subset of its items been used to measure everyday executive functioning (Garcia‐Barrera et al., 2011; Karr & Garcia‐Barrera, 2017). It has been validated with clinical samples in the United States (e.g., Soto et al., 2020). There are parent and teacher versions for participants ages 2–25 years (teacher version only to 18 years).

Comparing performance‐based and rating scales

There are now a good number of studies comparing performance‐based tasks with rating scales, but the results are not consistent. Some studies have small sample sizes (n < 200; e.g., Choi et al., 2016; McAuley et al., 2010), often because they draw on clinical samples, which could lead to Type II errors. Large, sufficiently powered, nationally representative samples from a variety of contexts (e.g., Finders et al., 2021; Litkowski et al., 2020) are important for a better understanding of how well these different metrics measure skill development. Additionally, some studies indicate that behavior rating scales have better reliability and ecological validity than performance‐based measures (Barkley & Murphy, 2010).
Others suggest that performance‐based tasks and rating scales assess different aspects of executive functions and could be complementary (Karr & Garcia‐Barrera, 2017). Toplak et al. (2013) ran a meta‐analysis of 68 studies and found a median correlation of r = 0.18 between performance‐based tasks and rating scales. However, they did not split their sample by type of rating scale or by whether the study used parent or teacher versions. Studies that do split these two versions suggest that the teacher version is more reliably correlated with performance‐based tasks than the parent version (e.g., Gutierrez et al., 2021). Teachers may be able to assess a student more objectively and may have a better grasp of the student's attitude towards learning and of disruptive behavior that may signal executive functioning difficulties (van Tetering & Jolles, 2017).

While there is a body of work that examines the relationships between performance‐based tasks and parent/teacher rating scales (e.g., McAuley et al., 2010; McCoy, 2019; Sherman & Brooks, 2010; Sulik et al., 2010), most of these studies of typically developing children include samples that are predominantly White and affluent (e.g., Magimairaj, 2018; Miranda et al., 2015). We could find only one study with a high proportion of children from ethnic minority or low‐income backgrounds: Camerota et al. (2018) found good links between the CHEXI rating scale and a battery of computerized performance‐based tasks in a large sample (N = 844) of children ages 3–5 years, with 46% from families at or below the United States federal poverty threshold and a good mix of racial and ethnic backgrounds (60% White, 31% African American, 20% Hispanic, 7% Asian American, 1% Native American, 1% Pacific Islander). However, although statistically significant, the correlations were small in effect size (r = 0.10).
Establishing a universal theory about everyday executive functioning requires samples from a wide range of typically developing backgrounds. Furthermore, attention must be paid to the metrics used to assess cognitive skills, as it may not be appropriate to use the same assessments with individuals of various racial/ethnic and socioeconomic backgrounds.

The current study

The current study is designed to address specific gaps in the evaluation of everyday executive functioning in children, with the aim of understanding whether behavior rating scales capture distinctions between performance‐based measures.

First, we include two large samples of older children largely from ethnic minority families living in high‐poverty communities, because they are an important part of understanding the development of these essential cognitive skills.

Second, we compare two versions of a behavior rating scale, the BASC‐2 and BASC‐3, completed by teachers, with computerized performance‐based tasks. Despite the existence of a newer version of the BASC (BASC‐3), the BASC‐2 behavior rating scale is included in the present study for two reasons: (1) the BASC‐3 was not available when Sample 1 data collection began; and (2) many archived and longitudinal datasets use the BASC‐2. The behavior rating scales tap into different behaviors exhibited by students and fall into four subscales: problem solving, behavioral control, attentional control, and emotional control. We selected computerized tasks because of their relevance to this age group and because they can be administered to larger groups of students, making the large sample size possible.

Third, we compare models where the performance‐based tasks make individual predictions with models where they are mapped onto one latent construct, to gain a sense of how well behavior rating scales capture the distinctions between performance‐based tasks.
Studies using pairwise correlations between each rating scale and performance‐based task tend to show nonsignificant correlations (e.g., Davidson et al., 2016; Gross et al., 2015), whereas those using composite scores for performance‐based tasks sometimes show stronger correlations (e.g., r = 0.30 in Soto et al., 2020). In line with existing research, we hypothesize that behavior rating scales rely on behaviors that draw on multiple executive function skills and might not fully capture the distinctions between performance‐based tasks (Isquith et al., 2013; Soto et al., 2020; Toplak et al., 2013).

METHOD

Participants

Sample 1

Sample 1 included 243 older children (N = 243; mean age = 9.28 years, SD = 0.80; 125 female), predominantly African American (n = 216), with students from Latin American (n = 14), Asian American (n = 6), Pacific Islander (n = 5), and White (n = 3) backgrounds. Most qualified for free and reduced lunch (n = 241). Data were collected during after‐school hours from elementary schools in high‐poverty urban areas in the eastern United States. Participants were recruited from 12 public schools where administrators agreed to participate and host the program. The data reported here are from the baseline testing of a larger sample of third‐ to fifth‐grade students participating in an after‐school program to learn how to play chess (see https://osf.io/yac8e/ for a summary of the funded project). The overall project received ethical review from multiple institutions: University of Cambridge's Psychology Research Ethics Committee (IRB 2011.39), Virginia State University (IRB 1011–37), and Virginia Commonwealth University (IRB HM20000017).

Sample 2

The second sample included 229 older children (N = 229; mean age = 10.02 years, SD = 1.01; 120 female), predominantly African American (n = 132), with students from Latin American (n = 92), White (n = 3), and Pacific Islander (n = 1) backgrounds.
All students qualified for free and reduced lunch (n = 229). Data were collected during school hours from elementary schools in a high‐poverty urban area in the midwestern United States. Participants were recruited from three public schools where administrators and teachers agreed to participate during school hours. Data collection for Sample 2 received ethical review from the University of Cambridge's Faculty of Education Ethics Committee.

Sample recruitment

Recruitment procedures were similar for both samples; study invitations included flyers sent home, teacher announcements, and advertising during after‐school professional development sessions. Parental consent and child assent were collected from each participant. In Sample 1, children received a $10 gift card and a small prize for completing the computerized tasks. In Sample 2, children received two small prizes for completing the computerized tasks. BASC rating forms were presented to teachers in a secure envelope; teachers completed one rating scale for each student and returned them to the researcher to maintain confidentiality. Teachers were asked to complete ratings based on how students behaved in school over the previous several months. As an incentive, teachers in Sample 1 were compensated $10 for each completed student rating form. BASC2EF and performance‐based data were collected within 4 weeks of each other for Sample 1, and within 1 week for Sample 2.

Materials and procedures

Teacher behavior rating scales

Sample 1

BASC2EF (Karr & Garcia‐Barrera, 2017) is a 33‐item subset of an ecologically valid, psychometrically sound teacher rating scale (BASC‐2; Reynolds & Kamphaus, 2004) covering executive functioning in a classroom setting, verified in cross‐cultural and clinical samples (Garcia‐Barrera et al., 2011, 2013, 2015). Classroom teachers completed a full BASC‐2 for each child, but only BASC2EF items are used here.
Karr and Garcia‐Barrera (2017) evaluated the psychometric properties of the BASC2EF teacher version and found strong internal consistency (Cronbach's α range 0.81 to 0.89) for four subscales: (1) behavioral control includes items on distractibility and disruptive/harmful behavior towards peers; (2) attentional control includes items on following directions and keeping focused; (3) problem solving includes items on a child's ability to practice good study routines, cooperate with classmates, stay organized, make decisions, solve problems, and work in stressful settings; and (4) emotional control includes items on the propensity for anger or getting upset, as well as adjustment to changing/stressful circumstances. The subscales contained 12, 7, 9, and 5 items, respectively. Responses to each item are given on a four‐point Likert scale: never, sometimes, often, and always.

Sample 2

Like BASC2EF, BASC3EF is a psychometrically sound (BASC‐3; Garcia‐Barrera et al., 2011, 2013, 2015; Goldstein & Naglieri, 2014) teacher rating scale that includes 31 items about executive functioning in a classroom setting and includes the same subscales as BASC2EF (Reynolds & Kamphaus, 2015). The items in each subscale changed slightly from the BASC2EF to the BASC3EF, with several items updated or removed.
There is strong internal consistency (Cronbach's α range 0.90 to 0.94) for the four subscales: (1) behavioral control did not include new items, but continued to include items on self‐control and distractibility, and had fewer items about harmful and disruptive behavior; (2) attentional control added more items about following directions and keeping focused; (3) problem solving removed items on good study routines, cooperating with classmates, staying organized, making decisions, and working in stressful settings, and instead targeted planning and analyzing problems; and (4) emotional control removed items about adapting to changing circumstances while centering on adjustment to stressful situations and propensity for anger. The subscales contained 7, 8, 9, and 7 items, respectively.

Computerized performance‐based tasks

Children from both samples completed the same six performance‐based tasks using the secured Thinking Games website (http://instructlab.educ.cam.ac.uk/TGsummary/) to allow for in‐school administration of tasks to large groups and streamlined data management (materials openly available from https://osf.io/whzrg/). Each task took about 5–7 min to complete, with task order determined by the logistics of the wider projects. Sample 1 participants completed tasks in varied orders over multiple days as part of a large battery of tasks administered during the after‐school club, whereas Sample 2 participants completed tasks in the same order and in one session during the regular school day.

Participants were encouraged to respond as quickly as possible while still being accurate. Accuracy and response time were collected for each trial on every task.
Overall accuracy and response times to correct items were used to compute an efficiency score for each task (Ellefson et al., 2017; see Equation 1).

\begin{equation}
\mathit{Efficiency} = \frac{\mathit{Accuracy}\ (\text{number of correct responses})}{\mathit{Time}\ (\text{in seconds, to make a correct response})}
\end{equation}

Efficiency is a way to incorporate both accuracy and speed while avoiding the extensively non‐normal distributions usually seen in these tasks for accuracy (ceiling effects) and response time (long tails). The expected response times and number of trials make the efficiency scores vary substantially across tasks; as such, z‐scores are computed for each task and used in the analyses.

These computerized tasks have been used extensively in cognitive psychology, cognitive development, and cognitive neuroscience. They have not been standardized in the same way that clinical neuropsychology tasks might be, as they were developed for slightly different purposes. There is a large literature base that supports the use of performance‐based tasks derived from a cognitive psychology perspective for measuring executive function skills (Bignardi et al., 2021; Diamond et al., 2007; Espy et al., 2001; Zelazo, 2006). The tasks described below are commonly used to measure the specific executive function skills linked to them (Luciana & Nelson, 2002; Parsey & Schmitter‐Edgecombe, 2013; Patel et al., 2021).

Inhibition – Stop signal task

This child‐friendly version of the original stop signal task (Logan, 1994) presents an image of a soccer field with a ball positioned either on the left‐ or right‐hand side of the computer screen. The task includes 108 trials (presented in three blocks). Participants are instructed to press the left arrow key when the ball is on the left‐hand side of the screen and the right arrow key when the ball is on the right‐hand side of the screen.
Left‐hand and right‐hand trials are divided equally (54 trials each). Participants are instructed to refrain from pressing either arrow when they hear a referee's whistle. The referee's whistle is played randomly on 20% of the total trials. In line with standard stop signal procedures, the time gap between the presentation of the image and the presentation of the whistle sound is either increased or decreased depending on participant accuracy.

Sustained attention – Continuous performance task

This child‐friendly version of the children's sustained attention task (Servera & Cardo, 2006) is a computerized task based on the continuous performance test paradigm (Rosvold et al., 1956). The current version included 300 trials. During each trial, a random numeral between one and nine is presented in the middle of the screen. Participants press the spacebar for each numeral, except when they see the number four. Each numeral appears with equal probability.

Working memory – Spatial span task

This child‐friendly adaptation of the Corsi blocks task (Corsi, 1972) is split into two parts: forward patterns (presented first) and backwards patterns. The computer screen displays an array of nine boxes for each trial. Boxes briefly light up in a pre‐selected order. Participants are asked to click on the boxes either in the same order (forwards) or the reverse order (backwards) in which they appeared. Only the backwards version is included here. After two practice items (each with two boxes lighting up), participants receive trials of increasing length, completing two trials each with lengths of 3‐ to 7‐box sequences. The task stops automatically after five consecutively incorrect trials.

Shifting – Figure matching task

This task is a slightly modified, child‐suitable presentation of Ellefson et al. (2006). The task includes 128 trials, each with four simultaneous events.
A target figure is presented in the center of the screen and varies by shape (triangle or circle) and/or color (blue or red). In each lower corner of the screen there is a small figure; one matches the shape of the target and the other matches the color of the target. Participants follow the cue to sort by shape or color by pressing one of two keys on the computer keyboard. The 128 trials are presented within four 32‐trial sets, counterbalanced between participants. There are two pure sets (either all color trials or all shape trials) and two mixed sets with color and shape trials presented using an alternating‐runs sequence (Rogers & Monsell, 1995) that changes tasks every two trials (e.g., color‐color‐shape‐shape‐color‐color‐shape‐shape).

Planning – Tower of Hanoi task

This is a computerized version of the task used by Welsh (1991). The Tower of Hanoi task is often referred to as a "disk transfer" task: participants are asked to transform a start state of disks into a goal configuration in as few moves as possible, with a set of rules that restrict how disks may be moved (i.e., larger disks cannot be placed on smaller disks). The minimum number of moves needed to transform the bottom set to match the top set increases with each successful trial, increasing the overall task difficulty. Participants are given a practice problem and offered feedback for illegal moves. Once the practice trial has been completed, participants are given six 3‐disk problems, including 2‐ to 7‐move problems. This is followed by three 4‐disk items, including 7‐, 11‐, and 15‐move problems. To progress to the more difficult problems, participants must make two consecutive minimum‐move solutions. Participants have a maximum of 20 moves to match the goal arrangement before being offered a new problem (with a maximum of six attempts to achieve two consecutive minimum‐move solutions).
The task ends when participants have either successfully solved all problems or reached a problem that they could not solve twice in a row within six attempts.

Executive decision making – Hungry donkey task

This child‐friendly version of the Iowa Gambling Task measures participants' ability to make decisions under uncertain circumstances using logic‐based, cost‐benefit analyses (Crone & van der Molen, 2004). Participants help a hungry donkey collect as many apples as possible by choosing one of four doors on the screen for each trial. Doors open by pressing the 'a', 's', 'd', or 'f' keys. Participants win or lose apples when they open a door. Two doors are advantageous (win more apples than they lose) and two are disadvantageous (lose more apples than they win). For each set of advantageous/disadvantageous doors, one door produces small gains and small losses, while the other produces large gains and large losses. The assignment of the doors was counterbalanced across participants. Accuracy is measured here as the total number of advantageous doors selected.

DATA ANALYSIS

We used structural equation modeling (SEM) to test links between BASC2EF and the computerized performance‐based tasks (Figure 1). We ran analyses using the psych (Revelle, 2021), lavaan (Rosseel, 2021), and semTools (Jorgensen et al., 2021) packages for R (R Core Team, 2019). The analyses were pre‐registered; the raw data and R scripts are openly available (https://osf.io/whzrg/). To identify the factor structure for the computerized performance‐based tasks and BASC2EF rating subscales, we ran reliability tests using Cronbach's alpha and measurement models using exploratory factor analysis (EFA) and confirmatory factor analysis (CFA). We ran additional models that had poor fits (including ones preregistered for Sample 1); they are not reported here.
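As a point of reference, the Cronbach's alpha reliability test mentioned above reduces to a short computation over a respondents‐by‐items score matrix. The sketch below is illustrative only (the actual analyses used the R psych package, and the ratings shown are made up):

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Cronbach's alpha for an (n_respondents, n_items) matrix of item scores.

    alpha = k / (k - 1) * (1 - sum of item variances / variance of total score)
    """
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)      # per-item sample variances
    total_var = items.sum(axis=1).var(ddof=1)  # variance of each respondent's total
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Hypothetical 4-point Likert responses from four raters on two items
ratings = np.array([[1.0, 2.0],
                    [2.0, 1.0],
                    [3.0, 4.0],
                    [4.0, 3.0]])
print(round(cronbach_alpha(ratings), 2))  # 0.75
```

When the items move together perfectly, the statistic reaches 1.0; noisier item sets pull it toward zero, which is why the subscale alphas reported below are read as reliability evidence.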
Due to space limitations, descriptive statistics, correlations amongst individual items on the BASC2EF and BASC3EF, measurement models for the performance‐based tasks and teacher ratings (Samples 1 and 2), goodness‐of‐fit statistics for the measurement models, and the results of any other models tested that had poor fits are not reported here but are openly available (https://osf.io/whzrg/).

FIGURE 1 SEM models testing the links between performance‐based computerized tasks and behavior rating scales

Initial models run with all BASC2EF items were poor fits. We decided to look at the specific items in BASC3EF and noticed that many of the problematic items from BASC2EF were not in BASC3EF. BASC2EF model fits improved after removing the items not in BASC3EF. BASC2EF models from which items not appearing in the BASC3EF have been removed are labeled 'adapted' models; these models are reported here. BASC3EF was not available when we started the Sample 1 study. Based on the post hoc BASC2EF results, we collected Sample 2 data using BASC3EF so that we could test how well the BASC2EF models applied to BASC3EF. We report BASC3EF models with all items (note: BASC3EF models with only the items in both versions had similar fits).

Before running the analyses, we took a few preliminary steps. We converted Likert scale responses from the rating scales to numerical values: 1 for never, 2 for sometimes, 3 for often, and 4 for always. Reverse coding was applied so that all items were positively valenced, with higher scores representing more frequent positive behaviors. Efficiency scores were computed for each of the computerized performance‐based tasks. We converted efficiency scores for each performance‐based task to standardized z‐scores because the variability in scores across tasks, due to differences in the number of trials and the time needed to complete the items, is problematic for SEM.
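The preliminary steps just described (Likert recoding with reverse scoring, efficiency scores, and standardization) can be sketched as follows. The function and variable names are ours, and the sketch assumes the Equation 1 denominator is the mean correct response time, which should be checked against Ellefson et al. (2017):

```python
import numpy as np

LIKERT = {"never": 1, "sometimes": 2, "often": 3, "always": 4}

def recode(response: str, reverse: bool = False) -> int:
    """Convert a Likert label to 1-4; flip negatively worded items so
    higher scores always reflect more frequent positive behavior."""
    score = LIKERT[response]
    return 5 - score if reverse else score

def efficiency(n_correct: int, mean_correct_rt_s: float) -> float:
    """Equation 1: accuracy (number correct) over time (s) to make a correct response."""
    return n_correct / mean_correct_rt_s

def standardize(scores: np.ndarray) -> np.ndarray:
    """z-score one task's efficiency scores across participants,
    putting tasks with different trial counts on a common scale."""
    return (scores - scores.mean()) / scores.std(ddof=1)

print(recode("often"))                # 3
print(recode("often", reverse=True))  # 2
print(efficiency(90, 0.75))           # 120.0
```

Standardizing within each task is what allows the six efficiency scores to be entered into a single model despite their very different raw scales.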
Following procedures adopted by Lawson and Farah (2017), we created age‐adjusted data by taking residuals from a linear regression with age as the predictor and each individual performance‐based efficiency score and BASC item as an outcome. Missing data were classified as missing at random and were handled using full information maximum likelihood estimation (FIML; Allison, 2012; Enders, 2013). We adjusted for non‐normal distributions in some of the measured variables using the Yuan‐Bentler MLR estimator and report only the scaled values for the fit indices (Bentler & Yuan, 1999; Brown, 2015; Tong et al., 2014; Yuan & Bentler, 1998). The goodness‐of‐fit of each model was assessed using well‐accepted criteria for SEM: a non‐significant chi‐square value, CFI > 0.90, TLI > 0.90, RMSEA ≤ 0.07, and SRMR < 0.08 (e.g., Cheung & Rensvold, 2002; Hooper et al., 2008; Schumacker & Lomax, 2016). The Akaike Information Criterion (AIC) and the Bayesian Information Criterion (BIC) were used to compare models with different structures (lower values = better fit).

RESULTS

Pairwise Pearson correlations on the age‐adjusted data for each performance‐based task and mean response scores for each item of the BASC2EF/BASC3EF subscales show many statistically significant correlations, although most are small effect sizes (Table 1; r < 0.30).

TABLE 1 Correlations amongst the performance‐based tasks and teacher rating subscales (using age‐adjusted data)

                         SS       CP       Span      FM        ToH      HD        BC        AC        PS        EC
Stop signal (SS)         –       −0.05     0.18*     0.47***   0.10    −0.04     −0.05      0.06      0.09     −0.10
Continuous perf. (CP)    0.21*    –        0.13*     0.03      0.06    −0.05      0.03     −0.02     −0.04      0.02
Spatial span (Span)      0.17*    0.11     –         0.34***   0.38*** −0.14*     0.17**    0.22***   0.23***   0.17*
Figure matching (FM)     0.12     0.08     0.13      –         0.23**  −0.07      0.13      0.31***   0.32***   0.02
Tower of Hanoi (ToH)     0.05     0.04     0.17*     0.12      –       −0.12      0.05      0.12      0.14*     0.05
Hungry donkey (HD)       0.09     0.17*    0.20***   0.00      0.20**   –        −0.17*    −0.20*    −0.21**   −0.18**
Behavioral control (BC)  0.21**   0.09     0.29***   0.19**    0.09     0.00      –         0.80***   0.56***   0.83***
Attentional control (AC) 0.19**   0.12     0.35***   0.23**    0.10    −0.02      0.76***   –         0.81***   0.62***
Problem solving (PS)     0.20**   0.18**   0.33***   0.26***   0.15*    0.00      0.59***   0.81***   –         0.49***
Emotional control (EC)   0.17*    0.01     0.23***   0.17*     0.12    −0.08      0.68***   0.63***   0.65**    –

Notes. BASC2EF (n = 243) below the diagonal; BASC3EF (n = 229) above the diagonal, with * p < 0.05, ** p < 0.01, *** p < 0.001. Correlations are large effects for values > 0.50, medium effects for values > 0.30, and small effects for values > 0.10.

Cronbach's alpha tests indicated high reliability amongst the age‐adjusted items for the BASC2EF (behavioral control = 0.90, attentional control = 0.93, problem solving = 0.81, emotional control = 0.83) and BASC3EF factors (behavioral control = 0.96, attentional control = 0.96, problem solving = 0.96, emotional control = 0.95).

The age‐adjusted models with the full set of BASC2EF items (see Table 2) are a poor fit compared to the adapted BASC2EF models (Models 1a and 1b, Table 2 and Figure 2).

TABLE 2 Structural equation models goodness‐of‐fit statistics

Model                                              χ²                  df    RMSEA   CFI    TLI    SRMR   AIC     BIC
Sample 1
Individual tasks with original BASC2EF             1400.24 (p < 0.001) 663   0.07    0.86   0.85   0.07   19811   20348
Single latent with original BASC2EF                1440.90 (p < 0.001) 692   0.07    0.86   0.85   0.07   19790   20227
Model 1a: Individual tasks with adapted BASC2EF     565.33 (p < 0.001) 285   0.07    0.91   0.89   0.05   14069   14482
Model 1b: Single latent with adapted BASC2EF        602.99 (p < 0.001) 314   0.06    0.91   0.90   0.06   14045   14358
Sample 2
Model 2a: Individual tasks with BASC3EF            1167.09 (p < 0.001) 590   0.07    0.92   0.91   0.05   15697   16210
Model 2b: Single latent with BASC3EF               1220.80 (p < 0.001) 619   0.07    0.92   0.91   0.07   15692   16106

Notes.
Scaled values are reported for all fit indices because the data were non‐normally distributed. p‐values for χ² are shown next to each value; df = degrees of freedom; RMSEA = scaled root mean square error of approximation; CFI = scaled comparative fit index; TLI = scaled Tucker‐Lewis index; SRMR = standardized root mean square residual using Bentler's formula; AIC = Akaike information criterion; BIC = Bayesian information criterion (sample‐size adjusted). Data presented here have been adjusted for age. Full results are openly available from https://osf.io/whzrg/.

FIGURE 2 Links between performance‐based and behavior ratings of executive functions

BASC2EF and BASC3EF capture different components of performance‐based tasks

Sample 1

Model 1a (Figure 2) indicates that many of the links between individual computerized performance‐based tasks and BASC2EF factors were not significant. Span task (working memory) efficiency scores were significantly linked with behavioral control (β = 0.26, p < 0.001), attentional control (β = 0.35, p < 0.001), problem solving (β = 0.33, p < 0.001), and emotional control (β = 0.25, p < 0.001). Figure matching (shifting) scores were significantly linked with behavioral control (β = 0.14, p = 0.01), attentional control (β = 0.19, p = 0.003), and problem solving (β = 0.20, p = 0.001), but not emotional control (β = 0.12, p = 0.07). Hungry donkey (executive decision making) efficiency scores had significant inverse links with problem solving (β = −0.14, p = 0.03) and emotional control (β = −0.17, p = 0.02), but not attentional control (β = −0.12, p = 0.06) or behavioral control (β = −0.08, p = 0.19). Stop signal (inhibition) efficiency scores were significantly linked with emotional control (β = 0.16, p = 0.04), but not behavioral control (β = 0.13, p = 0.05), attentional control (β = 0.12, p = 0.09), or problem solving (β = 0.13, p = 0.10).
Continuous performance (sustained attention) efficiency scores were not significantly linked to any BASC2EF factors (β = [0.03, 0.07, 0.12, 0.02], p = [0.58, 0.32, 0.12, 0.78]). Tower of Hanoi task (planning) efficiency scores were also not significantly linked to any BASC2EF factors (β = [0.04, 0.03, 0.08, 0.09], p = [0.19, 0.74, 0.29, 0.31]). In contrast, Model 1b (Figure 2) indicates strong links between the overall performance‐based latent and each BASC2EF factor: behavioral control (β = 0.48, p = 0.002), attentional control (β = 0.59, p = 0.001), problem solving (β = 0.64, p = 0.002), and emotional control (β = 0.44, p = 0.01).

Sample 2

Model 2a (Figure 2) indicates that many of the links between individual computerized performance‐based tasks and BASC3EF factors were not significant. Span task (working memory) efficiency scores were significantly linked with behavioral control (β = 0.15, p = 0.03) and emotional control (β = 0.15, p = 0.03), but not with problem solving (β = 0.12, p = 0.09) or attentional control (β = 0.13, p = 0.06). Hungry donkey (executive decision making) efficiency scores had significant inverse links with behavioral control (β = −0.16, p = 0.03), attentional control (β = −0.17, p = 0.01), problem solving (β = −0.18, p = 0.01), and emotional control (β = −0.15, p = 0.04). Figure matching (shifting) scores were significantly linked with behavioral control (β = 0.15, p = 0.04), attentional control (β = 0.32, p < 0.001), and problem solving (β = 0.30, p < 0.001), but not emotional control (β = 0.04, p = 0.55). Stop signal (inhibition) efficiency scores had significant inverse links with behavioral control (β = −0.15, p = 0.04) and emotional control (β = −0.16, p = 0.03), but not attentional control (β = −0.12, p = 0.11) or problem solving (β = −0.07, p = 0.30). Continuous performance (sustained attention) efficiency scores were not significantly linked to any BASC3EF factors (β = [0.01, −0.05, −0.07, −0.002], p = [0.87, 0.42, 0.18, 0.97]).
The same was true for Tower of Hanoi task (planning) efficiency scores (β = [−0.04, −0.005, 0.02, −0.03], p = [0.54, 0.93, 0.77, 0.69]). In contrast, Model 2b (Figure 2) indicates significant links between the overall performance‐based latent and attentional control (β = 0.37, p < 0.001) and problem solving (β = 0.39, p < 0.001), but not emotional control (β = 0.03, p = 0.73) or behavioral control (β = 0.17, p = 0.08).

DISCUSSION

There is a dearth of executive function research with samples from high‐poverty, ethnic minority communities. The results of this two‐sample study are consistent with other studies using computerized performance‐based tasks in more affluent schools (e.g., Ellefson et al., 2020; Xu et al., 2020) and contribute two key findings. First, BASC2EF in its original form is an adequate, but not excellent, measure of everyday executive function behaviors for children from schools in high‐poverty communities; restricting analyses to only the items included in BASC3EF, or using BASC3EF itself, is best practice. Second, BASC3EF seems better able to capture the distinct components of individual performance‐based tasks, whereas BASC2EF is better linked to overall executive functioning than to individual tasks.

BASC3EF is more appropriate for high‐poverty, ethnic minority samples

The post‐hoc approach used with the adapted BASC2EF indicated that the better fitting models are more aligned with the structure of BASC3EF, rather than BASC2EF. In addition to the removal of several items that appear in BASC2EF, BASC3EF has some new items that were not included in earlier versions. The inclusion of new items and the removal of specific items from BASC2EF improved the fit for data from these two age‐adjusted, high‐poverty, ethnic minority samples. This finding suggests that BASC2EF in its original form might be a good fit for some populations, but it is not the best option for high‐poverty samples. The original BASC2EF might not have an excellent fit for a variety of reasons.
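The "adequate, but not excellent" characterization can be made concrete with conventional rule‐of‐thumb cutoffs for the indices in Table 2. The thresholds below are widely cited heuristics (in the spirit of Hu and Bentler's recommendations), not the authors' formal decision rule:

```python
# Heuristic fit classification, assuming commonly cited cutoffs:
# "adequate": CFI/TLI >= 0.90, RMSEA <= 0.08, SRMR <= 0.08;
# "good": additionally CFI/TLI >= 0.95 and RMSEA <= 0.06.
# These are conventions, not the paper's own criterion.

def classify_fit(rmsea, cfi, tli, srmr):
    if cfi >= 0.95 and tli >= 0.95 and rmsea <= 0.06 and srmr <= 0.08:
        return "good"
    if cfi >= 0.90 and tli >= 0.90 and rmsea <= 0.08 and srmr <= 0.08:
        return "adequate"
    return "poor"

# Sample 1 values from Table 2:
original = classify_fit(rmsea=0.07, cfi=0.86, tli=0.85, srmr=0.07)  # original BASC2EF
adapted = classify_fit(rmsea=0.06, cfi=0.91, tli=0.90, srmr=0.06)   # Model 1b
print(original, adapted)  # → poor adequate
```

By these heuristics the original BASC2EF models fall short on CFI/TLI, while the adapted models clear the adequate (but not the stricter good) thresholds, matching the pattern described above.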
One option could be that the items do not apply equivalently across all groups of children. Another is that the BASC3EF items are more aligned with current understanding of behaviors related to everyday executive functioning skills. Even though the BASC‐3 is the most up‐to‐date version of the BASC itself, the BASC2EF is the more recent measure, established after BASC3EF. Karr and Garcia‐Barrera (2017) acknowledge that the existence of BASC3EF could mean that the derived BASC2EF is used less frequently than BASC3EF. They derived the BASC2EF because of the popularity of the BASC‐2 and the likely existence of archived datasets that necessitate an embedded, psychometrically supported instrument for the assessment of executive functions.

Even though BASC2EF is a reliable and valid measurement of executive function, those psychometric properties are based on a predominantly white, middle‐income sample. The current study suggests these findings do not generalize to ethnic minority children in high‐poverty contexts. Even though the BASC2EF might be an outdated measure, it will likely continue to show up in archived datasets and longitudinal work, making this line of research on BASC2EF important despite the existence of BASC3EF. Based on our findings, we recommend that analyses of the BASC2EF with archived or longitudinal datasets from high‐poverty settings should use similar procedures to remove BASC2EF items not found in BASC3EF. Taken together, these findings have implications for how executive function skills are conceptualized and measured.

BASC3EF is better able to capture distinct elements of performance‐based tasks

Everyday executive functioning skills measured by BASC2EF are best linked to an overall latent of computerized performance‐based tasks. This finding corroborates Duckworth and Yeager (2015), suggesting that behavior rating scales are likely capturing behaviors that require a combination of different executive function skills, so they are best represented by a global ability.
It is likely that classroom behaviors are mediated by multiple executive functions, so some behavior rating scales may not capture the distinct skills as well as performance‐based tasks do. For example, the attentional control subscale could involve working memory, inhibition, and switching skills. However, everyday settings do not make it obvious whether those skills are working in harmony or in tandem.

BASC2EF seems best linked to an overall performance‐based task latent, whereas BASC3EF can capture distinctions across performance‐based tasks. The changing emphasis of the items in the BASC3EF subscales could account for its ability to capture individual elements of everyday executive functioning better than BASC2EF. The new BASC3EF behavioral control questions could be capturing more individual executive function elements, allowing that subscale to target self‐control and distractibility. The BASC3EF attentional control subscale added more items about following directions and keeping focused. The BASC2EF problem solving subscale contained items about good study routines, cooperating with classmates, staying organized, making decisions, and working in stressful settings that were excluded from BASC3EF, allowing the subscale to target planning and analyzing problems. The BASC2EF emotional control subscale included items about adapting to changing circumstances that were eliminated in BASC3EF, shifting the subscale to target adjustment to stressful situations and propensity for anger.

Results for models depicting individual executive function tasks (Models 1a and 2a) remain consistent between the BASC2EF and BASC3EF for the cognitive flexibility, sustained attention, and planning tasks.
However, there are differences in the working memory, executive decision‐making, and inhibition tasks that could be explained by the changes in the subscales.There are significant positive links between working memory (i.e., spatial span task) and all BASC2EF subscales, but significant links only for the behavioral control and emotional control BASC3EF subscales. The broader scope of BASC2EF subscales could create stronger links with working memory, as each may be capturing multiple executive functions. Looking closely at how the items change between the BASC2EF and BASC3EF, it is possible that the items in the attentional control and problem solving BASC3EF subscales do not capture the behaviors linked to working memory in the same way as BASC2EF. In particular, the problem‐solving subscale excluded items that were not related to planning and problem solving, and the BASC3EF attentional control subscale included more specific items about following directions and maintaining focus. In both instances, the BASC3EF subscales are more targeted, potentially weakening the links between problem solving and attentional control with working memory.There are significant negative links between executive decision‐making (i.e., hungry donkey task) and the BASC2EF problem solving and emotional control subscales and non‐significant links for the BASC2EF behavioral control and attentional control subscales. In contrast, there were significant negative links between executive decision making and all BASC3EF subscales. The removal of BASC2EF items about disruptive and harmful behavior towards other children in the BASC3EF behavioral control subscale and new items about following directions and keeping focused in the BASC3EF attentional control subscale could have elicited stronger negative links to executive decision making.The links between inhibition (i.e., stop signal task) and BASC2EF subscales are positive and significant. 
In contrast, they are negative (or non‐significant) for BASC3EF. Looking at the specific items, some new BASC3EF items could capture post‐error slowing behaviors. Briefly, post‐error slowing is the inclination for participants to slow down on the current trial after committing an error on the previous trial (Rabbitt, 1966); this speed‐accuracy trade‐off reflects an intent to increase accuracy (e.g., Botvinick et al., 2001). The design of the stop signal task allows children to be more aware of their errors, and its efficiency metric is more sensitive to post‐error slowing than those of the other performance‐based tasks in this study. Students might use a similar strategy in classrooms by reflecting on their mistakes, thinking critically, and controlling their negative emotions when making errors (de Mooij et al., 2022). Several new items in the BASC3EF target these types of behaviors. Specifically, attentional control (following directions and keeping focused) taps into reflecting on mistakes. Problem solving (planning and analyzing problems) taps into thinking critically. Emotional control (adjusting to stressful situations and propensity for anger) taps into controlling negative emotions when making errors. The inclusion of new items that emphasize these classroom behaviors may explain both the negative links between the stop signal task and these BASC3EF subscales and the smaller link between the overall performance‐based latent and these scales relative to BASC2EF.

Variation also exists between the overall executive function models and BASC subscales (Models 1b and 2b). All paths between overall executive function and the adapted BASC2EF subscales show significant links. Paths between overall performance‐based executive function and the BASC3EF attentional control and problem solving subscales show significant links, but paths between overall executive function and behavioral control and emotional control are non‐significant.
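The post‐error slowing construct invoked above is typically quantified as the mean response time after error trials minus the mean response time after correct trials; a positive difference indicates slowing. A minimal sketch with hypothetical trial data (not the study's own scoring pipeline):

```python
# Post-error slowing (after Rabbitt, 1966): mean RT on trials that follow
# an error minus mean RT on trials that follow a correct response.
# Trial data below are hypothetical.

def post_error_slowing(rts_ms, correct):
    """Return mean post-error RT minus mean post-correct RT, in ms."""
    after_error = [rt for rt, prev_ok in zip(rts_ms[1:], correct[:-1]) if not prev_ok]
    after_correct = [rt for rt, prev_ok in zip(rts_ms[1:], correct[:-1]) if prev_ok]
    if not after_error or not after_correct:
        return 0.0  # undefined without both trial types; neutral value
    return sum(after_error) / len(after_error) - sum(after_correct) / len(after_correct)

# Hypothetical child who slows down markedly after each of two errors:
rts = [420, 410, 700, 430, 415, 690, 425]
correct = [1, 0, 1, 1, 0, 1, 1]
print(post_error_slowing(rts, correct))  # → 275.0 (positive = slowing)
```

An efficiency metric that penalizes long response times will be pulled down by exactly this kind of slowing, which is why a cautious, reflective child can score lower on stop signal efficiency while being rated as well‐regulated by teachers.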
The changed BASC3EF subscale items may have weakened those links, as the target of each subscale became narrower and encompassed fewer classroom behaviors.Our findings suggest that researchers with archived or longitudinal BASC2EF datasets from high‐poverty settings wanting to compare BASC2EF responses to performance‐based tasks should use a composite score (or single latent variable) of that task performance rather than looking at links between individual tasks and BASC2EF subscales to reduce Type II errors. Researchers planning new projects with BASC3EF should be able to explore direct performance‐based task links or use composite scores. However, they should consider the specific performance‐based tasks they are measuring and how each task maps onto the BASC3EF items.Finally, these findings indicate that teachers are appropriately identifying where children are in terms of executive functioning based on their observations of everyday behaviors. As the field gets better at working with teachers to develop effective ways of incorporating executive function research into classroom practice, it is worth remembering that teachers already have knowledge surrounding everyday executive function behaviors, even if they don't have the same vocabulary that is used in cognitive psychology contexts (Perry et al., 2015). It is important for researchers to work with teachers collaboratively, rather than from the top down to build upon their existing expertise (Ellefson et al., 2019). 
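The composite‐score recommendation above for archived BASC2EF datasets can be sketched as a simple average of standardized task scores. Note that a mean of z‐scores only approximates the single latent variable used in Models 1b and 2b, and all numbers here are hypothetical:

```python
# Sketch of the composite-score recommendation: z-score each task across
# children, then average each child's z-scores across tasks, rather than
# testing individual task-subscale links. Hypothetical data; a mean of
# z-scores is an approximation of a single latent variable, not its equal.

def zscores(values):
    """Standardize a list of scores (population SD)."""
    n = len(values)
    mean = sum(values) / n
    sd = (sum((v - mean) ** 2 for v in values) / n) ** 0.5
    return [(v - mean) / sd for v in values]

def composite(task_scores_by_task):
    """task_scores_by_task: one list of scores per task, children aligned."""
    z_by_task = [zscores(task) for task in task_scores_by_task]
    n_children = len(task_scores_by_task[0])
    return [sum(z[i] for z in z_by_task) / len(z_by_task) for i in range(n_children)]

# Three hypothetical tasks on different scales, four children:
tasks = [[1.2, 0.8, 1.0, 1.4], [30, 22, 26, 34], [0.9, 0.5, 0.7, 1.1]]
print([round(c, 2) for c in composite(tasks)])  # → [0.45, -1.34, -0.45, 1.34]
```

Because the composite pools information across tasks, a single test against each subscale replaces many underpowered task‐level tests, which is the sense in which it reduces Type II errors.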
Teachers are already in a good position to implement well‐designed curricula centered on building executive function skills, especially when researchers recognize that teachers bring a sophisticated understanding of everyday executive functioning to the table (Faith et al., 2022).

Limitations and future directions

Although these findings make a unique contribution to the field by comparing teacher ratings of everyday behaviors with computerized performance‐based tasks, there are some limitations and areas where further research is needed. Future research should test additional theoretical models that are suited to the complexity of the executive function constructs, investigating whether the current findings generalize to other computerized performance‐based executive function tasks and to observer reports of executive function in children completed by teachers and parents alike. Each task in the current battery of computerized performance‐based tasks is only one measure of a particular aspect of executive function (e.g., inhibition, working memory). Research designs using additional tasks could be used to test how robust these findings are for different types of performance‐based tasks measuring the same executive function skills. If three different performance‐based tasks were used for each of the three key aspects of executive function (i.e., inhibition, shifting, working memory) commonly presented in the literature (Miyake et al., 2000; Riggs et al., 2006), then a theoretical evaluation of the unity and diversity of executive function could be tested across performance‐based and behavior rating scales. Furthermore, it is important for computerized performance‐based tasks to be standardized and for full psychometric evaluations to be conducted, especially with older children.
Such work would enable better comparison with other executive function metrics developed in clinical, developmental, or neuropsychology settings.It is possible that some differences in the findings between the two samples could be driven by participant demographics. Future work could specifically sample across ethnic minority groups to investigate whether there are any systematic biases in behavior rating scales that might distinctively affect children from diverse minoritized backgrounds.Additionally, nested models were not included in the current study as the teacher was not an appropriate predictor and the sample lacked sufficient statistical power. Though much of the existing literature does not include nested models (e.g., McAuley et al., 2010; Miranda et al., 2015; Van Tetering & Jolles, 2017), it would be useful for future research with larger sample sizes to include nesting of teachers in schools to better understand whether there are any response biases, especially when studying children from high‐poverty and/or ethnic minority families.Recent work by Camerota et al. (2020) highlights the need for critical evaluation of the best way to represent executive functions in measurement models. While the current study aligns with a common reliance on confirmatory factor analysis or structural equation models to represent the construct of executive functions (e.g., Miyake & Friedman, 2012; Miyake et al., 2000; Wiebe et al., 2008), Camerota et al. warn against accepting these measurement models as the default without implementing a reasoned approach for selecting them because interpretational differences can occur depending on the measurement model selected to represent executive functions. Our BASC3EF results substantiate their advice. 
Future research would benefit from the implementation and comparison of different models (e.g., formative latent variable models) that may represent executive functions differently.

CONCLUSIONS

Results from the current study indicate a robust link between the performance‐based tasks and both versions of the BASCEF. Children's performance on these computerized performance‐based tasks does appear to be linked to the behaviors that teachers report on BASCEF rating scales. However, BASC2EF in its original form is not a good representation of everyday executive function behaviors by children from schools in high‐poverty communities. Given the popularity of BASC‐2, it is likely that future research using this measure will continue to draw on archived datasets and longitudinal work. As such, it is important to recognize that the derivation of BASC2EF could be useful as an embedded executive function screener for many archived BASC‐2 datasets. However, it is not an appropriate screener for data from children from ethnic minority families in low‐income communities without removing items that are not in BASC3EF. When given the choice, BASC3EF is more appropriate for children from high‐poverty, ethnic minority populations. This work opens a window for exploring the links between computerized executive function tasks and observer reports of children within the context of low‐income, urban communities. Our findings indicate that it may not be appropriate in some cases to use the same assessments with individuals of various racial/ethnic minority groups and socio‐economic backgrounds. This reaffirms the importance of including diverse samples in executive function research and of exploring the utility of different versions of executive function screeners that continue to be used in longitudinal research or archived datasets.

AUTHOR CONTRIBUTIONS

Sample 1: Ellefson, Serpell and Parr submitted the initial grant application and study concept.
Ellefson oversaw the development of the Thinking Games website. Data collection was supervised by Ellefson, Serpell and Parr. Ellefson and Serpell managed and conducted the scoring, data entry, data cleaning and all other general data management for the computerized tasks and teacher surveys. Sample 2: Zonneveld and Ellefson developed the data collection plan as part of Zonneveld's doctoral research. Zonneveld managed and conducted all data analysis and data entry. Manuscript: Zonneveld and Ellefson wrote the R scripts for data wrangling and analysis and interpreted the results reported here; Zonneveld also interpreted them in a thesis submitted as partial fulfillment of a doctoral degree at the University of Cambridge. Zonneveld and Ellefson prepared the open science data files. Zonneveld and Ellefson drafted the manuscript. All authors approved the final draft.

ACKNOWLEDGMENTS

The research reported here for Sample 1 was supported by the Institute of Education Sciences, U.S. Department of Education, through Grant R305A110932 to the University of Cambridge. The opinions expressed are those of the authors and do not represent views of the Institute or the U.S. Department of Education. Special thanks to Geoff Martin for Thinking Games website programming and William Harris for proofreading.
Correspondence about the overall grant / project for Sample 1 should be directed to Michelle Ellefson (mre33@cam.ac.uk). An overall summary of the project deriving Sample 1 is available from https://osf.io/yac8e/.

ADDITIONAL ACKNOWLEDGEMENTS

The data presented for Sample 1 are the result of work by a large group from the University of Cambridge, Virginia State University, Virginia Commonwealth University and Ashley‐Parr, LLC (listed alphabetically by last name): Temitope Adeoye, Mariah Adlawan, Amanda Aldercotte, Annabel Amodia‐Bidakowska, Cortney Anderson, Christopher Ashe, Joseph Beach, Aaron Blount, Alexander Borodin, Lyndani Boucher, Lakendra Butler, Yufei Cai, Tavon Carter, Emma Chatzispyridou, Parul Chaudhary, Laura Clarke, Taelor Clay, Jackson Collins, Brittany Cooper, Aiden Cope, Briana Coardes, Breanna Cohen, Amenah Darab, Moneshia Davis, Shakita Davis, Asha Earle, Mary Elyiace, Nadine Forrester, Sophie Farthing, Pippa Fielding Smith, Aysha Foster, Gill Francis, Kristine Gagarin, Amed Gahelrasoul, Marleny Gaitan, Summer Gamal, Katie Gilligan, Cynthia Gino, Reinaldo Golmia Dante, Zejian Guo, Aditi Gupta, Jennifer Hacker, Shanai Hairston, Khaylynn Hart, Donita Hay, Rachel Heeds, Sonia Ille, Joy Jones, Madhu Karamsetty, Spencer Kearns, Marianne Kiffin, Hyunji Kim, Wendy Lee, Steven Mallis, Dr. Geoff Martin, Tyler Mayes, Alexandria Merritt, Roshni Modhvadia, Dedrick Muhammad, Susana Nicolau, Christian Newson, Seth Ofosu, Esther Obadare, Jwalin Patel, Chloe Pickett, Tanya Pimienta, Connor Quinn, Kelsey Richardson, Michael Randall, Fran Riga, Tennisha Riley, Natalie Robles, Leah Saulter, Kristin Self, Tiera Siedle, Julian Smith, Abi Solomon, Adam Sukonick, Amelia Swafford, Krystal Thomas, Richard Thomas, John Thompson, Tris Thrower, Jr., Quai Travis, Maria Tsapali, Jorge Vargas, Tony Volley, Christopher Walton, Elexis White, Karrie Woodlon, Zhuoer Yang, Shamika Young, Sterling Young, Cheyenne Yucelen, Anne Zonneveld.
These researchers were directed by the PI/Co‐PIs: Michelle Ellefson, Zewelanji Serpell, and Teresa Parr.

CONFLICTS OF INTEREST

The authors have declared no conflicts of interest related to the research in this manuscript. This manuscript has been submitted for publication and is likely to be edited as part of the peer review process.

DATA AVAILABILITY STATEMENT

This manuscript reflects a sub‐set of analyses conducted as part of the first author's doctoral dissertation. Preregistration, data/analyses, and a preprint are openly available for Sample 1 (preregistration: https://osf.io/4hrvj/; materials and data/analyses: https://osf.io/whzrg/) and Sample 2 (preregistration: https://osf.io/8n2e3; data/analyses: https://osf.io/fkw5m/). The manuscript preprint is available from https://psyarxiv.com/myr2s/.

REFERENCES

Abenavoli, R. M., Greenberg, M. T., & Bierman, K. L. (2015). Parent support for learning at school entry: Benefits for aggressive children in high‐risk urban contexts. Early Childhood Research Quarterly, 31, 9–18. https://doi.org/10.1016/j.ecresq.2014.12.003
Allison, P. D. (2012). Handling missing data by maximum likelihood. https://statisticalhorizons.com/wp‐content/uploads/MissingDataByML.pdf
Bailey, C. E. (2007). Cognitive accuracy and intelligent executive function in the brain and in business. Annals of the New York Academy of Sciences, 1118, 122–141. https://doi.org/10.1196/annals.1412.011
Barkley, R. A., & Murphy, K. R. (2010). Impairment in occupational functioning and adult ADHD: The predictive utility of executive function (EF) ratings versus EF tests. Archives of Clinical Neuropsychology, 25, 157–173. https://doi.org/10.1093/arclin/acq014
Bentler, P. M., & Yuan, K.‐H. (1999). Structural equation modeling with small samples: Test statistics. Multivariate Behavioral Research, 34, 181–197. https://doi.org/10.1207/S15327906Mb340203
Bignardi, G., Dalmaijer, E. S., Anwyl‐Irvine, A., & Astle, D. E. (2021).
Collecting big data with small screens: Group tests of children's cognition with touchscreen tablets are reliable and valid. Behavior Research Methods, 53, 1515–1529. https://doi.org/10.3758/s13428‐020‐01503‐3
Blair, C., & Raver, C. C. (2016). Poverty, stress, and brain development: New directions for prevention and intervention. Academic Pediatrics, 16, S30–S36. https://doi.org/10.1016/j.acap.2016.01.010
Botvinick, M. M., Braver, T. S., Barch, D. M., Carter, C. S., & Cohen, J. D. (2001). Conflict monitoring and cognitive control. Psychological Review, 108, 624–652. https://doi.org/10.1037/0033‐295x.108.3.624
Brown, T. A. (2015). Confirmatory factor analysis for applied research (2nd ed.). Guilford.
Burgess, P. W., Alderman, N., Forbes, C., Costello, A., Coates, L. M.‐A., Dawson, D. R., Anderson, N. D., Gilbert, S. J., Dumontheil, I., & Channon, S. (2006). The case for the development and use of “ecologically valid” measures of executive function in experimental and clinical neuropsychology. Journal of the International Neuropsychological Society, 12, 194–209. https://doi.org/10.1017/S1355617706060310
Camerota, M., Willoughby, M. T., & Blair, C. B. (2020). Measurement models for studying child executive functioning: Questioning the status quo. Developmental Psychology, 56, 2236–2245. https://doi.org/10.1037/dev0001127
Camerota, M., Willoughby, M. T., Kuhn, L. J., & Blair, C. B. (2018). The childhood executive functioning inventory (CHEXI): Factor structure, measurement invariance, and correlates in US preschoolers. Child Neuropsychology, 24, 322–337. https://doi.org/10.1080/09297049.2016.1247795
Cheung, G. W., & Rensvold, R. B. (2002). Evaluating goodness‐of‐fit indexes for testing measurement invariance. Structural Equation Modeling, 9, 233–255. https://doi.org/10.1207/S15328007SEM0902_5
Children's Defense Fund (2021). The State of America's children.
Children's Defense Fund. Available from: https://www.childrensdefense.org/wp‐content/uploads/2021/04/The‐State‐of‐Americas‐Children‐2021.pdf
Choi, J. Y., Castle, S., Williamson, A. C., Young, E., Worley, L., Long, M., & Horm, D. M. (2016). Teacher‐child interactions and the development of executive function in preschool‐age children attending head start. Early Education and Development, 27, 751–769. https://doi.org/10.1080/10409289.2016.1129864
Corsi, P. M. (1972). Human memory and the medial temporal region of the brain. Unpublished doctoral dissertation, McGill University.
Crone, E. A., & van der Molen, M. W. (2004). Developmental changes in real life decision making: Performance on a gambling task previously shown to depend on the ventromedial prefrontal cortex. Developmental Neuropsychology, 25, 251–279. https://doi.org/10.1207/s15326942dn2503_2
Davidson, F., Cherry, K., & Corkum, P. (2016). Validating the behavior rating inventory of executive functioning for children with ADHD and their typically developing peers. Applied Neuropsychology: Child, 5, 127–137. https://doi.org/10.1080/21622965.2015.1021957
de Mooij, S., Dumontheil, I., Kirkham, N. Z., Raijmakers, M., & van der Maas, H. (2022). Post‐error slowing: Large scale study in an online learning environment for practising mathematics and language. Developmental Science, 25, e13174. https://doi.org/10.1111/desc.13174
Diamond, A. (2001). A model system for studying the role of dopamine in the prefrontal cortex during early development in humans: Early and continuously treated phenylketonuria. In C. A. Nelson & M. Luciana (Eds.), Handbook of developmental cognitive neuroscience (pp. 433–472). MIT Press.
Diamond, A. (2013). Executive functions. Annual Review of Psychology, 64, 135–168. https://doi.org/10.1146/annurev‐psych‐113011‐143750
Diamond, A., Barnett, W. S., Thomas, J., & Munro, S. (2007). Preschool program improves cognitive control. Science, 318, 1387–1388. https://doi.org/10.1126/science.1151148
Duckworth, A.
L., & Yeager, D. S. (2015). Measurement matters: Assessing personal qualities other than cognitive ability for educational purposes. Educational Researcher, 44, 237–251. https://doi.org/10.3102/0013189x15584327
Ellefson, M. R., Baker, S. T., & Gibson, J. (2019). Lessons for successful cognitive developmental science in educational settings: The case of executive functions. Journal of Cognition and Development, 20, 253–277. https://doi.org/10.1080/15248372.2018.1551219
Ellefson, M. R., Ng, F. F.‐Y., Wang, Q., & Hughes, C. (2017). Efficiency of executive function: A two‐generation cross‐cultural comparison of samples from Hong Kong and the United Kingdom. Psychological Science, 28, 555–566. https://doi.org/10.1177/0956797616687812
Ellefson, M. R., Shapiro, L., & Chater, N. (2006). Asymmetrical switch costs in children. Cognitive Development, 21, 108–130. https://doi.org/10.1016/J.COGDEV.2006.01.002
Ellefson, M. R., Zachariou, A., Ng, F. F.‐Y., Wang, Q., & Hughes, C. (2020). Do executive functions mediate the link between socioeconomic status and numeracy skills? A cross‐site comparison of Hong Kong and the United Kingdom. Journal of Experimental Child Psychology, 194, 104734. https://doi.org/10.1016/j.jecp.2019.104734
Enders, C. K. (2013). Dealing with missing data in developmental research. Child Development Perspectives, 7, 27–31. https://doi.org/10.1111/cdep.12008
Espy, K. A., Kaufmann, P. M., Glisky, M. L., & McDiarmid, M. D. (2001). New procedures to assess executive functions in preschool children. Clinical Neuropsychologist, 15, 46–58. https://doi.org/10.1076/clin.15.1.46.1908
Faith, L., Bush, C.‐A., & Dawson, P. (2022). Executive function skills in the classroom: Overcoming barriers, building strategies. The Guilford Press.
Finders, J. K., McClelland, M. M., Geldhof, G. J., Rothwell, D. W., & Hatfield, B. E. (2021). Explaining achievement gaps in kindergarten and third grade: The role of self‐regulation and executive function skills.
Early Childhood Research Quarterly, 54, 72–85. https://doi.org/10.1016/j.ecresq.2020.07.008
Flores, I., Casaletto, K. B., Marquine, M. J., Umlauf, A., Moore, D. J., Mungas, D., Gershon, R. C., Beaumont, J. L., & Heaton, R. K. (2017). Performance of Hispanics and non‐Hispanic Whites on the NIH Toolbox Cognition Battery: The roles of ethnicity and language backgrounds. The Clinical Neuropsychologist, 31, 783–797. https://doi.org/10.1080/13854046.2016.1276216
Garcia‐Barrera, M., Karr, J., Durán, V., Direnfeld, E., & Pineda, D. (2015). Cross‐cultural validation of a behavioral screener for executive functions: Guidelines for clinical use among Colombian children with and without ADHD. Psychological Assessment, 27, 1349–1363. https://doi.org/10.1037/pas0000117
Garcia‐Barrera, M. A., Kamphaus, R. W., & Bandalos, D. (2011). Theoretical and statistical derivation of a screener for the behavioral assessment of executive functions in children. Psychological Assessment, 23, 64–79. https://doi.org/10.1037/a0021097
Garcia‐Barrera, M. A., Karr, J. E., & Kamphaus, R. W. (2013). Longitudinal applications of a behavioral screener of executive functioning: Assessing factorial invariance and exploring latent growth. Psychological Assessment, 25, 1300–1313. https://doi.org/10.1037/a0034046
Gioia, G. A., Isquith, P. K., Guy, S. C., & Kenworthy, L. (2000). Behavior rating inventory of executive function. Psychological Assessment Resources.
Goldstein, S., & Naglieri, J. A. (Eds.). (2014). Handbook of executive functioning. Springer.
Gross, A. C., Deling, L. A., Wozniak, J. R., & Boys, C. J. (2015). Objective measures of executive functioning are highly discrepant with parent‐report in fetal alcohol spectrum disorders. Child Neuropsychology, 21, 531–538. https://doi.org/10.1080/09297049.2014.911271
Gutierrez, M., Filippetti, V. A., & Lemos, V. (2021).
The childhood executive functioning inventory (CHEXI) parent and teacher form: Factor structure and cognitive correlates in Spanish‐speaking children from Argentina. Developmental Neuropsychology, 46, 136–148. https://doi.org/10.1080/87565641.2021.1878175
Hooper, D., Coughlan, J., & Mullen, M. R. (2008). Structural equation modelling: Guidelines for determining model fit. Electronic Journal of Business Research Methods, 6, 53–60. https://doi.org/10.21427/D7CF7R
Hosenbocus, S., & Chahal, R. (2012). A review of executive function deficits and pharmacological management in children and adolescents. Journal of the Canadian Academy of Child and Adolescent Psychiatry, 21, 223–229.
Isquith, P. K., Gioia, G. A., & Espy, K. A. (2004). Executive function in preschool children: Examination through everyday behavior. Developmental Neuropsychology, 26, 403–422. https://doi.org/10.1207/s15326942dn2601_3
Isquith, P. K., Roth, R. M., & Gioia, G. (2013). Contribution of rating scales to the assessment of executive functions. Applied Neuropsychology: Child, 2(2), 125–132. https://doi.org/10.1080/21622965.2013.748389
Jacob, R., & Parkinson, J. (2015). The potential for school‐based interventions that target executive function to improve academic achievement: A review. Review of Educational Research, 85, 512–552. https://doi.org/10.3102/0034654314561338
Jorgensen, T. D., Pornprasertmanit, S., Schoemann, A. M., & Rosseel, Y. (2021). semTools: Useful tools for structural equation modeling. R package version 0.5‐4. https://CRAN.R‐project.org/package=semTools
Karr, J. E., & Garcia‐Barrera, M. A. (2017). The assessment of executive functions using the BASC‐2. Psychological Assessment, 29, 1182–1187. https://doi.org/10.1037/pas0000418
Kouklari, E.‐C., Tsermentseli, S., & Monks, C. P. (2018). Everyday executive function and adaptive skills in children and adolescents with autism spectrum disorder: Cross‐sectional developmental trajectories. Autism & Developmental Language Impairments, 3, 2396941518800775.
https://doi.org/10.1177/2396941518800775
Lawson, G. M., Duda, J. T., Avants, B. B., Wu, J., & Farah, M. J. (2013). Associations between children's socioeconomic status and prefrontal cortical thickness. Developmental Science, 16, 641–652. https://doi.org/10.1111/desc.12096
Lawson, G. M., & Farah, M. J. (2017). Executive function as a mediator between SES and academic achievement throughout childhood. International Journal of Behavioral Development, 41, 94–104. https://doi.org/10.1177/0165025415603489
Lawson, G. M., Hook, C. J., & Farah, M. J. (2018). A meta‐analysis of the relationship between socioeconomic status and executive function performance among children. Developmental Science, 21(2), e12529. https://doi.org/10.1111/desc.12529
Lee, K., Bull, R., & Ho, M.‐H. (2013). Developmental changes in executive functioning. Child Development, 84, 1933–1953. https://doi.org/10.1111/cdev.12096
Litkowski, E. C., Finders, J. K., Borriello, G. A., Purpura, D. J., & Schmitt, S. A. (2020). Patterns of heterogeneity in kindergarten children's executive function: Profile associations with third grade achievement. Learning and Individual Differences, 80, 101846. https://doi.org/10.1016/j.lindif.2020.101846
Logan, G. D. (1994). On the ability to inhibit thought and action: A users' guide to the stop signal paradigm. In D. Dagenbach & T. H. Carr (Eds.), Inhibitory processes in attention, memory, and language (pp. 189–239). Academic Press.
Luciana, M., & Nelson, C. A. (2002). Assessment of neuropsychological function through use of the Cambridge Neuropsychological Testing Automated Battery: Performance in 4‐ to 12‐year‐old children. Developmental Neuropsychology, 22, 595–624. https://doi.org/10.1207/S15326942DN2203_3
Magimairaj, B. M. (2018). Parent‐rating vs performance‐based working memory measures: Association with spoken language measures in school‐age children. Journal of Communication Disorders, 76, 60–70.
https://doi.org/10.1016/j.jcomdis.2018.09.001
McAuley, T., Chen, S., Goos, L., Schachar, R., & Crosbie, J. (2010). Is the behavior rating inventory of executive function more strongly associated with measures of impairment or executive function? Journal of the International Neuropsychological Society, 16, 495–505. https://doi.org/10.1017/S1355617710000093
McCoy, D. C. (2019). Measuring young children's executive function and self‐regulation in classrooms and other real‐world settings. Clinical Child and Family Psychology Review, 22, 63–74. https://doi.org/10.1007/s10567-019-00285-1
Miller‐Cotto, D., Smith, L. V., Wang, A. H., & Ribner, A. D. (2021). Changing the conversation: A culturally responsive perspective on executive functions, minoritized children and their families. Infant and Child Development, 31(1), e2286. https://doi.org/10.1002/icd.2286
Miranda, A., Colomer, C., Mercader, J., Fernández, M. I., & Presentación, M. J. (2015). Performance‐based tests versus behavioral ratings in the assessment of executive functioning in preschoolers: Associations with ADHD symptoms and reading achievement. Frontiers in Psychology, 6, 545. https://doi.org/10.3389/fpsyg.2015.00545
Miyake, A., & Friedman, N. P. (2012). The nature and organization of individual differences in executive functions: Four general conclusions. Current Directions in Psychological Science, 21, 8–14. https://doi.org/10.1177/0963721411429458
Miyake, A., Friedman, N. P., Emerson, M. J., Witzki, A. H., Howerter, A., & Wager, T. D. (2000). The unity and diversity of executive functions and their contributions to complex "frontal lobe" tasks: A latent variable analysis. Cognitive Psychology, 41, 49–100. https://doi.org/10.1006/cogp.1999.0734
Monette, S., Bigras, M., & Guay, M.‐C. (2011). The role of the executive functions in school achievement at the end of grade 1. Journal of Experimental Child Psychology, 109, 158–173.
https://doi.org/10.1016/j.jecp.2011.01.008
Müller, U., Miller, M., Hutchison, S., & Eycke, K. T. (2017). Transition to school: Executive function, emergent academic skills, and early school achievement. In M. J. Hoskyn, G. Iarocci, & A. R. Young (Eds.), Executive functions in children's everyday lives: A handbook for professionals in applied psychology (pp. 88–107). Oxford University Press. https://doi.org/10.1093/acprof:oso/9780199980864.003.0007
Nesbitt, K. T., Baker‐Ward, L., & Willoughby, M. T. (2013). Executive function mediates socio‐economic and racial differences in early academic achievement. Early Childhood Research Quarterly, 28(4), 774–783. https://doi.org/10.1016/j.ecresq.2013.07.005
Nilsen, E. S., Huyder, V., McAuley, T., & Liebermann, D. (2017). Ratings of everyday executive functioning (REEF): A parent‐report measure of preschoolers' executive functioning skills. Psychological Assessment, 29, 50–64. https://doi.org/10.1037/pas0000308
Parsey, C. M., & Schmitter‐Edgecombe, M. (2013). Applications of technology in neuropsychological assessment. The Clinical Neuropsychologist, 27, 1328–1361. https://doi.org/10.1080/13854046.2013.834971
Patel, J., Aldercotte, A., Tsapali, M., Serpell, Z. N., Parr, T., & Ellefson, M. R. (2021). The Zoo Task: A novel metacognitive problem‐solving task developed with a sample of African American children from schools in high poverty communities. Psychological Assessment, 33(8), 795–802. https://doi.org/10.1037/pas0001033
Perry, N. E., Brenner, C. A., & Fusaro, N. (2015). Closing the gap between theory and practice in self‐regulated learning: Teacher learning teams as a framework for enhancing self‐regulated teaching and learning. In T. J. Cleary (Ed.), Self‐regulated learning interventions with at risk populations: Academic, mental health, and contextual considerations (pp. 229–250). American Psychological Association.
R Core Team (2019). R: A language and environment for statistical computing.
R Foundation for Statistical Computing. https://www.R-project.org/
Rabbitt, P. M. (1966). Errors and error correction in choice‐response tasks. Journal of Experimental Psychology, 71(2), 264–272. https://doi.org/10.1037/h0022853
Revelle, W. (2021). psych: Procedures for psychological, psychometric, and personality research. R package version 2.1.6. https://CRAN.R-project.org/package=psych
Reynolds, C. R., & Kamphaus, R. W. (1992). BASC: Behavior assessment system for children. American Guidance Service.
Reynolds, C. R., & Kamphaus, R. W. (2004). BASC‐2: Behavior assessment system for children. Pearson.
Reynolds, C. R., & Kamphaus, R. W. (2015). BASC‐3: Behavior assessment system for children. Pearson.
Riggs, N. R., Jahromi, L. B., Razza, R. P., Dillworth‐Bart, J. E., & Mueller, U. (2006). Executive function and the promotion of social‐emotional competence. Journal of Applied Developmental Psychology, 27, 300–309. https://doi.org/10.1016/j.appdev.2006.04.002
Rogers, R., & Monsell, S. (1995). Costs of a predictable switch between simple cognitive tasks. Journal of Experimental Psychology: General, 124, 207–231. https://doi.org/10.1037/0096-3445.124.2.207
Rosseel, Y. (2021). lavaan: An R package for structural equation modeling. Journal of Statistical Software, 48, 1–36. http://www.jstatsoft.org/v48/i02/
Rosvold, H. E., Mirsky, A. F., Sarason, I., Bransome, E. D. Jr., & Beck, L. H. (1956). A continuous performance test of brain damage. Journal of Consulting Psychology, 20, 343–350. https://doi.org/10.1037/h0043220
Samuels, W. E., Tournaki, N., Blackman, S., & Zilinski, C. (2016). Executive functioning predicts academic achievement in middle school: A four‐year longitudinal study. The Journal of Educational Research, 109, 478–490. https://doi.org/10.1080/00220671.2014.979913
Schumacker, R. E., & Lomax, R. G. (2016). A beginner's guide to structural equation modeling (4th ed.). Routledge.
Servera, M., & Cardo, C. (2006).
Children sustained attention task (CSAT): Normative, reliability, and validity data. International Journal of Clinical and Health Psychology, 6, 697–707.
Sherman, E. M. S., & Brooks, B. L. (2010). Behavior Rating Inventory of Executive Function – Preschool Version (BRIEF‐P): Test review and clinical guidelines for use. Child Neuropsychology, 16, 503–519. https://doi.org/10.1080/09297041003679344
Soto, E. F., Kofler, M. J., Singh, L. J., Wells, E. L., Irwin, L. N., Groves, N. B., & Miller, C. E. (2020). Executive functioning rating scales: Ecologically valid or construct invalid? Neuropsychology, 34, 605–619. https://doi.org/10.1037/neu0000681
Sulik, M. J., Huerta, S., Zerr, A. A., Eisenberg, N., Spinrad, T. L., Valiente, C., Di Giunta, L., Pina, A. A., Eggum, N. D., Sallquist, J., Edwards, A., Kupfer, A., Lonigan, C. J., Phillips, B. M., Wilson, S. B., Clancy‐Menchetti, J., Landry, S. H., Swank, P. R., Assel, M. A., & Taylor, H. B. (2010). The factor structure of effortful control and measurement invariance across ethnicity and sex in a high‐risk sample. Journal of Psychopathology and Behavioral Assessment, 32, 8–22. https://doi.org/10.1007/s10862-009-9164-y
Thorell, L. B., & Nyberg, L. (2008). The childhood executive functioning inventory (CHEXI): A new rating instrument for parents and teachers. Developmental Neuropsychology, 33, 536–552. https://doi.org/10.1080/87565640802101516
Tong, X., Zhang, Z., & Yuan, K.‐H. (2014). Evaluation of test statistics for robust structural equation modeling with nonnormal missing data. Structural Equation Modeling: A Multidisciplinary Journal, 21, 553–565. https://doi.org/10.1080/10705511.2014.919820
Toplak, M. E., West, R. F., & Stanovich, K. E. (2013). Practitioner review: Do performance‐based measures and ratings of executive function assess the same construct? Journal of Child Psychology and Psychiatry, 54, 131–143. https://doi.org/10.1111/jcpp.12001
van Tetering, M. A. J., & Jolles, J. (2017).
Teacher evaluations of executive functioning in schoolchildren aged 9–12 and the influence of age, sex, level of parental education. Frontiers in Psychology, 8, 481. https://doi.org/10.3389/fpsyg.2017.00481
Weintraub, S., Dikmen, S. S., Heaton, R. K., Tulsky, D. S., Zelazo, P. D., Bauer, P. J., Carlozzi, N. E., Slotkin, J., Blitz, D., Wallner‐Allen, K., Fox, N. A., Beaumont, J. L., Mungas, D., Nowinski, C. J., Richler, J., Deocampo, J. A., Anderson, J. E., Manly, J. J., Borosh, B., …, & Gershon, R. C. (2013). Cognition assessment using the NIH Toolbox. Neurology, 80, S54–S64. https://doi.org/10.1212/WNL.0b013e3182872ded
Welsh, M. C. (1991). Rule‐guided behavior and self‐monitoring on the Tower of Hanoi disk‐transfer task. Cognitive Development, 6, 59–76. https://doi.org/10.1016/0885-2014(91)90006-Y
Wiebe, S. A., Espy, K. A., & Charak, D. (2008). Using confirmatory factor analysis to understand executive control in preschool children: I. Latent structure. Developmental Psychology, 44, 575–587. https://doi.org/10.1037/0012-1649.44.2.575
Xu, C., Ellefson, M. R., Ng, F., Wang, Q., & Hughes, C. (2020). An East‐West contrast in executive function: Measurement invariance of computerized tasks in school‐aged children and adolescents. Journal of Experimental Child Psychology, 199, 104929. https://doi.org/10.1016/j.jecp.2020.104929
Yuan, K.‐H., & Bentler, P. M. (1998). Normal theory based test statistics in structural equation modelling. British Journal of Mathematical and Statistical Psychology, 51, 289–309. https://doi.org/10.1111/j.2044-8317.1998.tb00682.x
Zelazo, P. (2006). The Dimensional Change Card Sort (DCCS): A method of assessing executive function in children. Nature Protocols, 1, 297–301. https://doi.org/10.1038/nprot.2006.46
Zelazo, P. D., Anderson, J. E., Richler, J., Wallner‐Allen, K., Beaumont, J. L., & Weintraub, S. (2013). II. NIH Toolbox Cognition Battery: Measuring executive function and attention.
Monographs of the Society for Research in Child Development, 78, 16–33. https://doi.org/10.1111/mono.12032

Developmental Science, Wiley

Executive function measurement in urban schools: Exploring links between performance‐based metrics and teacher ratings

 


Publisher
Wiley
Copyright
© 2022 John Wiley & Sons Ltd.
ISSN
1363-755X
eISSN
1467-7687
DOI
10.1111/desc.13319


Given the key role of executive functions in child development, it is important to understand the nature and course of their development across a wide range of children. There are variations across different groups of children that help to better characterize the course of executive function development. Children facing great adversity, such as developmental disabilities (e.g., Hosenbocus & Chahal, 2012), extreme levels of stress (e.g., Blair & Raver, 2016), or high levels of poverty (e.g., Lawson et al., 2013), tend to have lower skills. More recent large‐scale longitudinal studies afford a better look at the development of these skills across a wider range of children (e.g., the NICHD Study of Early Child Care and Youth Development in the United States and the Millennium Cohort Study in the United Kingdom), but performance‐based task data collection remains time‐ and staff‐intensive.

Behavior rating scales

Given the importance of better understanding the development of executive functions and the resource‐heavy constraints of performance‐based tasks, there have been attempts to develop alternatives. Commonly called "everyday measures," these ecologically valid metrics have been developed to assess executive function abilities through day‐to‐day behaviors in real‐world contexts (Kouklari et al., 2018). One useful approach involves metrics that can be completed by parents, caregivers, or teachers rather than trained researchers or clinical/educational psychologists. However, developing such metrics made it clear that although performance‐based tasks have strong links to everyday outcomes, they are not necessarily sensitive to day‐to‐day executive functioning, meaning that they tend to have low ecological validity (Burgess et al., 2006). Behavior rating scales became an effective way to gauge everyday executive functioning in younger populations.
They can provide a unique perspective on elements that performance‐based tasks may not measure (e.g., Isquith et al., 2004; Soto et al., 2020). In addition, they can be completed by parents or teachers, who have more extensive experience with the child.

There are several behavior rating scales that tap directly into everyday executive functioning (e.g., Behavior Rating Inventory of Executive Function, Gioia et al., 2000; Childhood Executive Functioning Inventory, Thorell & Nyberg, 2008; Behavior Assessment System for Children, Reynolds & Kamphaus, 1992, 2004, 2015). The present study focuses on the Behavior Assessment System for Children (BASC; Reynolds & Kamphaus, 1992, 2004, 2015). The BASC has been used in a variety of developmental and clinical settings for a few decades, but only recently has a subset of its items been used to assess everyday executive functioning (Garcia‐Barrera et al., 2011; Karr & Garcia‐Barrera, 2017). It has been validated with clinical samples in the United States (e.g., Soto et al., 2020). There are parent and teacher versions for participants ages 2–25 years (teacher version only to 18 years).

Comparing performance‐based and rating scales

There are now a good number of studies comparing performance‐based tasks with rating scales, but the results are not consistent. Some studies have small sample sizes (n < 200; e.g., Choi et al., 2016; McAuley et al., 2010), often because they rely on clinical samples, which can lead to Type II errors. Large, sufficiently powered, nationally representative samples from a variety of contexts (e.g., Finders et al., 2021; Litkowski et al., 2020) are important for a better understanding of how well these different metrics can be used to measure skill development. Additionally, some studies indicate that behavior rating scales have better reliability and ecological validity than performance‐based measures (Barkley & Murphy, 2010).
Others suggest that performance‐based tasks and rating scales assess different aspects of executive functions and could be complementary (Karr & Garcia‐Barrera, 2017). Toplak et al. (2013) ran a meta‐analysis of 68 studies and found a median correlation of r = 0.18 between performance‐based tasks and rating scales. However, they did not split their sample by type of rating scale or by whether the study used parent or teacher versions. Studies that do split these two versions suggest that the teacher version is more reliably correlated with performance‐based tasks than the parent version (e.g., Gutierrez et al., 2021). Teachers may be better able to assess a student objectively and may have a better grasp of the student's attitude towards learning and of disruptive behavior that may signal executive functioning difficulties (van Tetering & Jolles, 2017).

While there is a body of work examining the relationships between performance‐based tasks and parent/teacher rating scales (e.g., McAuley et al., 2010; McCoy, 2019; Sherman & Brooks, 2010; Sulik et al., 2010), most of these studies of typically developing children include samples from predominantly White, affluent backgrounds (e.g., Magimairaj, 2018; Miranda et al., 2015). We could find only one study with a high proportion of children from ethnic minority or low‐income backgrounds: Camerota et al. (2018) found good links between the CHEXI rating scale and a battery of computerized performance‐based tasks in a large sample (N = 844) of children ages 3–5 years, with 46% from families at or below the United States federal poverty threshold and a good mix of racial and ethnic backgrounds (60% White, 31% African American, 20% Hispanic, 7% Asian American, 1% Native American, 1% Pacific Islander). However, although statistically significant, the correlations were small in effect size (r = 0.10).
Establishing a universal theory about everyday executive functioning requires samples from a wide range of typically developing backgrounds. Furthermore, it is important to attend to the metrics being used to assess cognitive skills, as it may not be appropriate to use the same assessments with individuals from various racial/ethnic minority groups and socioeconomic backgrounds.

The current study

The current study is designed to address specific gaps in the evaluation of everyday executive functioning in children, with the aim of understanding whether behavior rating scales capture distinctions between performance‐based measures.

First, we include two large samples of older children largely from ethnic minority families living in high‐poverty communities because they are an important part of understanding the development of these essential cognitive skills.

Second, we compare two versions of a behavior rating scale, the BASC‐2 and BASC‐3, completed by teachers, with computerized performance‐based tasks. Despite the existence of a newer version of the BASC (BASC‐3), the BASC‐2 behavior rating scale is included in the present study for two reasons: (1) the BASC‐3 was not available when Sample 1 data collection began; and (2) many archived and longitudinal datasets use the BASC‐2. The behavior rating scales tap into different behaviors exhibited by students and fall into four subscales: problem solving, behavioral control, attentional control, and emotional control. We selected computerized tasks because of their relevance to this age group and because they can be administered to larger groups of students, making the large sample size possible.

Third, we compare models where the performance‐based tasks make individual predictions with models where they are mapped onto one latent construct, to gain a sense of how well behavior rating scales capture the distinctions between performance‐based tasks.
Studies using pairwise correlations between each rating scale and performance‐based task tend to show nonsignificant correlations (e.g., Davidson et al., 2016; Gross et al., 2015), whereas those using composite scores for performance‐based tasks sometimes show stronger correlations (e.g., r = 0.30 in Soto et al., 2020). In line with existing research, we hypothesize that behavior rating scales rely on behaviors that draw on multiple executive function skills and might not fully capture the distinctions between performance‐based tasks (Isquith et al., 2013; Soto et al., 2020; Toplak et al., 2013).

METHOD

Participants

Sample 1

Sample 1 included 243 older children (Mage = 9.28 years, SDage = 0.80; nfemale = 125), who were predominantly African American (n = 216), with students from Latin American (n = 14), Asian American (n = 6), Pacific Islander (n = 5), and White (n = 3) backgrounds. Most qualified for free and reduced lunch (n = 241). Data were collected during after‐school hours from elementary schools in high‐poverty urban areas in the eastern United States. Participants were recruited from 12 public schools where administrators agreed to participate and host the program. The data reported here are from the baseline testing of a larger sample of third‐ to fifth‐grade students participating in an after‐school program to learn how to play chess (see https://osf.io/yac8e/ for a summary of the funded project). The overall project received ethical review from multiple institutions: the University of Cambridge's Psychology Research Ethics Committee (IRB 2011.39), Virginia State University (IRB 1011–37), and Virginia Commonwealth University (IRB HM20000017).

Sample 2

The second sample included 229 older children (Mage = 10.02 years, SDage = 1.01; nfemale = 120), who were predominantly African American (n = 132), with students from Latin American (n = 92), White (n = 3), and Pacific Islander (n = 1) backgrounds.
All students qualified for free and reduced lunch (n = 229). Data were collected during school hours from elementary schools in a high‐poverty urban area in the midwestern United States. Participants were recruited from three public schools where administrators and teachers agreed to participate during school hours. Data collection for Sample 2 received ethical review from the University of Cambridge's Faculty of Education Ethics Committee.

Sample recruitment

Recruitment procedures were similar for both samples; study invitations included flyers sent home, teacher announcements, and advertising during after‐school professional development sessions. Parental consent and child assent were collected from each participant. In Sample 1, children received a $10 gift card and a small prize for completing the computerized tasks. In Sample 2, children received two small prizes for completing the computerized tasks. BASC rating forms were presented to teachers in a secure envelope; teachers completed one rating scale for each student and returned the forms to the researcher to maintain confidentiality. Teachers were asked to complete ratings based on how students had behaved in school over the previous several months. As an incentive, teachers in Sample 1 were compensated $10 for each completed student rating form. For Sample 1, BASC2EF and performance‐based data were collected within 4 weeks of each other; for Sample 2, within 1 week.

Materials and procedures

Teacher behavior rating scales

Sample 1

The BASC2EF (Karr & Garcia‐Barrera, 2017) is a 33‐item subset of the ecologically valid, psychometrically sound BASC‐2 teacher rating scale (Reynolds & Kamphaus, 2004) that captures executive functioning in a classroom setting and has been verified in cross‐cultural and clinical samples (Garcia‐Barrera et al., 2011, 2013, 2015). Classroom teachers completed a full BASC‐2 for each child, but only the BASC2EF items are used here.
Karr and Garcia‐Barrera (2017) evaluated the psychometric properties of the BASC2EF teacher version and found strong internal consistency (Cronbach's α range 0.81 to 0.89) for four subscales: (1) behavioral control includes items on distractibility and disruptive/harmful behavior towards peers; (2) attentional control includes items on following directions and keeping focused; (3) problem solving includes items on a child's ability to practice good study routines, cooperate with classmates, stay organized, make decisions, solve problems, and work in stressful settings; and (4) emotional control includes items on the propensity for anger or getting upset as well as adjustment to changing/stressful circumstances. The subscales contained 12, 7, 9, and 5 items, respectively. Responses to each item are given on a four‐point Likert scale: never, sometimes, often, and always.

Sample 2

Like the BASC2EF, the BASC3EF is a psychometrically sound teacher rating scale (BASC‐3; Reynolds & Kamphaus, 2015; Garcia‐Barrera et al., 2011, 2013, 2015; Goldstein & Naglieri, 2014) that includes 31 items about executive functioning in a classroom setting and has the same subscales as the BASC2EF. The items in each subscale changed slightly from the BASC2EF to the BASC3EF, with several items updated or removed.
There is strong internal consistency (Cronbach's α range 0.90 to 0.94) for the four subscales: (1) behavioral control did not add new items, but continued to include items on self‐control and distractibility, with fewer items about harmful and disruptive behavior; (2) attentional control added more items about following directions and keeping focused; (3) problem solving removed items on good study routines, cooperating with classmates, staying organized, making decisions, and working in stressful settings, and instead targeted planning and analyzing problems; and (4) emotional control removed items about adapting to changing circumstances while centering on adjustment to stressful situations and propensity for anger. The subscales contained 7, 8, 9, and 7 items, respectively.

Computerized performance‐based tasks

Children from both samples completed the same six performance‐based tasks using the secured Thinking Games website (http://instructlab.educ.cam.ac.uk/TGsummary/), which allowed in‐school administration of tasks to large groups and streamlined data management (materials openly available from https://osf.io/whzrg/). Each task took about 5–7 min to complete, with task order determined by the logistics of the wider projects. Sample 1 participants completed the tasks in varied orders over multiple days as part of a large battery administered during the after‐school club, whereas Sample 2 participants completed the tasks in the same order in one session during the regular school day.

Participants were encouraged to respond as quickly as possible while still being accurate. Accuracy and response time were collected for each trial on every task.
Overall accuracy and response times to correct items were used to compute an efficiency score for each task (Ellefson et al., 2017; see Equation 1).

\begin{equation}
Efficiency = \frac{Accuracy\ (\textit{number of correct responses})}{Time\ (\textit{in seconds, to make a correct response})}\tag{1}
\end{equation}

Efficiency is a way to incorporate both accuracy and speed while avoiding the extensively non‐normal distributions usually seen in these tasks for accuracy (ceiling effects) and response time (long tails). Because the expected response times and number of trials make the efficiency scores vary substantially across tasks, z‐scores are computed for each task and used in the analyses.

These computerized tasks have been used extensively in cognitive psychology, cognitive development, and cognitive neuroscience. They have not been standardized in the same way that clinical neuropsychology tasks might be, as they have been developed for slightly different purposes. There is a large literature base that supports the use of performance‐based tasks derived from a cognitive psychology perspective for measuring executive function skills (Bignardi et al., 2021; Diamond et al., 2007; Espy et al., 2001; Zelazo, 2006). The tasks described below are commonly used to measure the specific executive function skills linked to them (Luciana & Nelson, 2002; Parsey & Schmitter‐Edgecombe, 2013; Patel et al., 2021).

Inhibition – Stop signal task

This child‐friendly version of the original stop signal task (Logan, 1994) presents an image of a soccer field with a ball positioned on either the left‐ or right‐hand side of the computer screen. The task includes 108 trials (presented in three blocks). Participants are instructed to press the left arrow key on their keyboards when the ball is on the left‐hand side of the screen and the right arrow key when it is on the right‐hand side.
Left-hand and right-hand trials are divided equally (54 trials each). Participants are instructed to refrain from pressing either arrow when they hear a referee's whistle, which plays randomly on 20% of the total trials. In line with standard stop signal procedures, the time gap between the presentation of the image and the presentation of the whistle sound is increased or decreased depending on participant accuracy.

Sustained attention – Continuous performance task

This child-friendly version of the children's sustained attention task (Servera & Cardo, 2006) is a computerized task based on the continuous performance test paradigm (Rosvold et al., 1956). The current version included 300 trials. During each trial, a random numeral between one and nine is presented in the middle of the screen. Participants press the spacebar for each numeral, except when they see the number four. Each numeral appears with equal probability.

Working memory – Spatial span task

This child-friendly adaptation of the Corsi blocks task (Corsi, 1972) is split into two parts: forward patterns (presented first) and backwards patterns; only the backwards version is included here. The computer screen displays an array of nine boxes on each trial. Boxes briefly light up in a pre-selected order, and participants are asked to click on the boxes in either the same order (forwards) or the reverse order (backwards) in which they lit up. After two practice items (each with two boxes lighting up), participants receive trials of increasing length, completing two trials each at lengths of 3- to 7-box sequences. The task stops automatically after five consecutive incorrect trials.

Shifting – Figure matching task

This task is a slightly modified, child-suitable version of the task presented by Ellefson et al. (2006). The task includes 128 trials, each with four simultaneous events.
A target figure is presented in the center of the screen and varies by shape (triangle or circle) and/or color (blue or red). In each lower corner of the screen there is a small figure: one matches the shape of the target and the other matches the color of the target. Participants follow a cue to sort by shape or color by pressing one of two keys on the computer keyboard. The 128 trials are presented randomly within four 32-trial sets, counterbalanced between participants. There are two pure sets (either all color trials or all shape trials) and two mixed sets with color and shape trials presented using an alternating-runs sequence (Rogers & Monsell, 1995) that changes tasks every two trials (e.g., color-color-shape-shape-color-color-shape-shape).

Planning – Tower of Hanoi task

This is a computerized version of the task used by Welsh (1991). The Tower of Hanoi is often referred to as a "disk transfer" task: participants are asked to transform a start state of disks into a goal configuration in as few moves as possible, under rules that restrict how disks may be moved (i.e., larger disks cannot be placed on smaller disks). The minimum number of moves needed to transform the bottom set to match the top set increases with each successful trial, increasing overall task difficulty. Participants are given a practice problem and offered feedback for illegal moves. Once the practice trial is completed, participants are given six 3-disk problems, including 2- to 7-move problems, followed by three 4-disk problems with 7-, 11-, and 15-move solutions. To progress to the more difficult problems, participants must make two consecutive minimum-move solutions. Participants have a maximum of 20 moves to match the goal arrangement before being offered a new problem (with a maximum of six attempts to achieve two consecutive minimum-move solutions).
The task ends when participants have either successfully solved all problems or reached a problem that they could not solve twice in a row within six attempts.

Executive decision making – Hungry donkey task

This child-friendly version of the Iowa Gambling task measures participants' ability to make decisions under uncertain circumstances using logic-based, cost-benefit analyses (Crone & van der Molen, 2004). Participants help a hungry donkey collect as many apples as possible by choosing one of four doors on the screen on each trial. Doors open by pressing the 'a', 's', 'd', or 'f' keys. Participants win or lose apples when they open a door. Two doors are advantageous (winning more apples than they lose) and two are disadvantageous (losing more apples than they win). Within each advantageous/disadvantageous pair, one door produces small gains and small losses, while the other produces large gains and large losses. The assignment of the doors was counterbalanced across participants. Accuracy is measured here as the total number of advantageous doors selected.

DATA ANALYSIS

We used structural equation modeling (SEM) to test links between BASC2EF and the computerized performance-based tasks (Figure 1). We ran analyses using the psych (Revelle, 2021), lavaan (Rosseel, 2021), and semTools (Jorgensen et al., 2021) packages for R (R Core Team, 2019). The analyses were pre-registered; the raw data and R scripts are openly available (https://osf.io/whzrg/). To identify the factor structure for the computerized performance-based tasks and the BASC2EF rating subscales, we ran reliability tests using Cronbach's alpha and measurement models using exploratory factor analysis (EFA) and confirmatory factor analysis (CFA). We ran additional models that had poor fits (including ones preregistered for Sample 1); they are not reported here.
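The reliability step relies on Cronbach's alpha, which the authors computed with the psych package in R. As an illustration of the underlying formula only (not the authors' code), a minimal Python equivalent is:

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha for an (n_respondents x n_items) matrix:
    alpha = k/(k-1) * (1 - sum of item variances / variance of total scores)."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]                                # number of items
    item_vars = items.var(axis=0, ddof=1).sum()       # sum of per-item variances
    total_var = items.sum(axis=1).var(ddof=1)         # variance of summed scores
    return k / (k - 1) * (1 - item_vars / total_var)

# Perfectly consistent (identical) items yield alpha = 1.0.
print(cronbach_alpha([[1, 1], [2, 2], [3, 3]]))  # → 1.0
```

Values near the 0.90–0.94 range reported above indicate that responses to the items within a subscale covary strongly relative to their individual noise.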
Due to space limitations, descriptive statistics, correlations amongst individual items on the BASC2EF and BASC3EF, measurement models for the performance-based tasks and teacher ratings (Samples 1 and 2), goodness-of-fit statistics for the measurement models, and the results of any other models tested that had poor fits are not reported here but are openly available (https://osf.io/whzrg/).

Figure 1. SEM models testing the links between performance-based computerized tasks and behavior rating scales.

Initial models run with all BASC2EF items were poor fits. We examined the specific items in BASC3EF and noticed that many of the problematic items from BASC2EF were not in BASC3EF. BASC2EF model fits improved after removing the items not in BASC3EF. BASC2EF models from which items not appearing in BASC3EF have been removed are labeled 'adapted' models; these are the models reported here. BASC3EF was not available when we started the Sample 1 study. Based on the post-hoc BASC2EF results, we collected Sample 2 data using BASC3EF so that we could test how well the BASC2EF models applied to BASC3EF. We report BASC3EF models with all items (note: BASC3EF models with only the items appearing in both versions had similar fits).

Before running the analyses, we took a few preliminary steps. We converted Likert scale responses from the rating scales to numerical values: 1 for never, 2 for sometimes, 3 for often, and 4 for always. Items were reverse coded where necessary so that all items used positively valenced scales, with higher scores representing more frequent positive behaviors. Efficiency scores were computed for each of the computerized performance-based tasks. We converted the efficiency scores for each performance-based task to standardized z-scores because variability in scores across tasks, driven by differences in the number of trials and the time needed to complete the items, is problematic for SEM.
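The efficiency scoring (Equation 1) and standardization steps described above can be sketched as follows. This is an illustrative Python sketch with hypothetical per-child values, not the authors' R pipeline (openly available at https://osf.io/whzrg/); the exact aggregation of the time denominator follows Ellefson et al. (2017).

```python
def efficiency(n_correct, time_per_correct_s):
    """Equation 1: accuracy (number of correct responses) divided by
    time (in seconds) to make a correct response."""
    return n_correct / time_per_correct_s

def z_scores(values):
    """Standardize scores so tasks with different trial counts and
    durations are comparable in the SEM (population SD used here)."""
    mean = sum(values) / len(values)
    sd = (sum((v - mean) ** 2 for v in values) / len(values)) ** 0.5
    return [(v - mean) / sd for v in values]

# Hypothetical data: child -> (number correct, mean correct RT in seconds).
raw = {"child_a": (2, 0.9), "child_b": (1, 1.2)}
scores = {child: efficiency(n, t) for child, (n, t) in raw.items()}
z = dict(zip(scores, z_scores(list(scores.values()))))
```

In the actual analyses this standardization is done separately within each of the six tasks before the z-scores enter the structural equation models.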
Following procedures adopted by Lawson and Farah (2017), we created age-adjusted data by taking the residuals from linear regressions with age as the predictor and each individual performance-based efficiency score and BASC item as the outcome. Missing data were classified as missing at random and were handled using full information maximum likelihood estimation (FIML; Allison, 2012; Enders, 2013). We adjusted for non-normal distributions in some of the measured variables using the Yuan-Bentler MLR estimator and report only the scaled values for the fit indices (Bentler & Yuan, 1999; Brown, 2015; Tong et al., 2014; Yuan & Bentler, 1998). The goodness-of-fit of each model was assessed using well-accepted criteria for SEM: a non-significant chi-square value, CFI > 0.90, TLI > 0.90, RMSEA ≤ 0.07, and SRMR < 0.08 (e.g., Cheung & Rensvold, 2002; Hooper et al., 2008; Schumacker & Lomax, 2016). The Akaike Information Criterion (AIC) and the Bayesian Information Criterion (BIC) were used to compare fits for models with different structures (lower values = better fit).

RESULTS

Pairwise Pearson correlations on the age-adjusted data for each performance-based task and mean response scores for each item of the BASC2EF/BASC3EF subscales show many statistically significant correlations, although most are small effect sizes (Table 1; r < 0.30).

Table 1. Correlations amongst the performance-based tasks and teacher rating subscales (using age-adjusted data)

|                             | SS     | CP     | Span    | FM      | ToH    | HD      | BC      | AC      | PS      | EC      |
|-----------------------------|--------|--------|---------|---------|--------|---------|---------|---------|---------|---------|
| Stop signal (SS)            | –      | −0.05  | 0.18*   | 0.47*** | 0.10   | −0.04   | −0.05   | 0.06    | 0.09    | −0.10   |
| Continuous performance (CP) | 0.21*  | –      | 0.13*   | 0.03    | 0.06   | −0.05   | 0.03    | −0.02   | −0.04   | 0.02    |
| Spatial span (Span)         | 0.17*  | 0.11   | –       | 0.34*** | 0.38***| −0.14*  | 0.17**  | 0.22*** | 0.23*** | 0.17*   |
| Figure matching (FM)        | 0.12   | 0.08   | 0.13    | –       | 0.23** | −0.07   | 0.13    | 0.31*** | 0.32*** | 0.02    |
| Tower of Hanoi (ToH)        | 0.05   | 0.04   | 0.17*   | 0.12    | –      | −0.12   | 0.05    | 0.12    | 0.14*   | 0.05    |
| Hungry donkey (HD)          | 0.09   | 0.17*  | 0.20*** | 0.00    | 0.20** | –       | −0.17*  | −0.20*  | −0.21** | −0.18** |
| Behavioral control (BC)     | 0.21** | 0.09   | 0.29*** | 0.19**  | 0.09   | 0.00    | –       | 0.80*** | 0.56*** | 0.83*** |
| Attentional control (AC)    | 0.19** | 0.12   | 0.35*** | 0.23**  | 0.10   | −0.02   | 0.76*** | –       | 0.81*** | 0.62*** |
| Problem solving (PS)        | 0.20** | 0.18** | 0.33*** | 0.26*** | 0.15*  | 0.00    | 0.59*** | 0.81*** | –       | 0.49*** |
| Emotional control (EC)      | 0.17*  | 0.01   | 0.23*** | 0.17*   | 0.12   | −0.08   | 0.68*** | 0.63*** | 0.65**  | –       |

Notes. BASC2EF (n = 243) below the diagonal; BASC3EF (n = 229) above the diagonal, with * p < 0.05, ** p < 0.01, *** p < 0.001. Correlations are large effects for values > 0.50, medium effects for values > 0.30, and small effects for values > 0.10.

Cronbach's alpha tests indicated high reliability amongst the age-adjusted items for the BASC2EF (behavioral control = 0.90, attentional control = 0.93, problem solving = 0.81, emotional control = 0.83) and BASC3EF factors (behavioral control = 0.96, attentional control = 0.96, problem solving = 0.96, emotional control = 0.95).

The age-adjusted models with the full set of BASC2EF items (see Table 2) are a poor fit compared to the adapted BASC2EF models (Models 1a and 1b, Table 2 and Figure 2).

Table 2. Structural equation models goodness-of-fit statistics

| Model                                                             | χ²      | p      | df  | RMSEA | CFI  | TLI  | SRMR | AIC   | BIC   |
|-------------------------------------------------------------------|---------|--------|-----|-------|------|------|------|-------|-------|
| Sample 1                                                          |         |        |     |       |      |      |      |       |       |
| Individual performance-based tasks with original BASC2EF          | 1400.24 | <0.001 | 663 | 0.07  | 0.86 | 0.85 | 0.07 | 19811 | 20348 |
| Single latent for performance-based tasks with original BASC2EF   | 1440.90 | <0.001 | 692 | 0.07  | 0.86 | 0.85 | 0.07 | 19790 | 20227 |
| Model 1a: Individual performance-based tasks with adapted BASC2EF | 565.33  | <0.001 | 285 | 0.07  | 0.91 | 0.89 | 0.05 | 14069 | 14482 |
| Model 1b: Single latent for performance-based tasks with adapted BASC2EF | 602.99 | <0.001 | 314 | 0.06 | 0.91 | 0.90 | 0.06 | 14045 | 14358 |
| Sample 2                                                          |         |        |     |       |      |      |      |       |       |
| Model 2a: Individual performance-based tasks with BASC3EF         | 1167.09 | <0.001 | 590 | 0.07  | 0.92 | 0.91 | 0.05 | 15697 | 16210 |
| Model 2b: Single latent for performance-based tasks with BASC3EF  | 1220.80 | <0.001 | 619 | 0.07  | 0.92 | 0.91 | 0.07 | 15692 | 16106 |

Notes.
Scaled values are reported for all fit indices because the data were non-normally distributed. df = degrees of freedom; RMSEA = scaled root mean square error of approximation; CFI = scaled comparative fit index; TLI = scaled Tucker-Lewis index; SRMR = standardized root mean square residual using Bentler's formula; AIC = Akaike information criterion; BIC = Bayesian information criterion (sample-size adjusted). Data presented here have been adjusted for age. Full results are openly available from https://osf.io/whzrg/.

Figure 2. Links between performance-based and behavior ratings of executive functions.

BASC2EF and BASC3EF capture different components of performance-based tasks

Sample 1

Model 1a (Figure 2) indicates that many of the links between individual computerized performance-based tasks and BASC2EF factors were not significant. Spatial span (working memory) efficiency scores were significantly linked with behavioral control (β = 0.26, p < 0.001), attentional control (β = 0.35, p < 0.001), problem solving (β = 0.33, p < 0.001), and emotional control (β = 0.25, p < 0.001). Figure matching (shifting) efficiency scores were significantly linked with behavioral control (β = 0.14, p = 0.01), attentional control (β = 0.19, p = 0.003), and problem solving (β = 0.20, p = 0.001), but not emotional control (β = 0.12, p = 0.07). Hungry donkey (executive decision making) efficiency scores had significant inverse links with problem solving (β = −0.14, p = 0.03) and emotional control (β = −0.17, p = 0.02), but not attentional control (β = −0.12, p = 0.06) or behavioral control (β = −0.08, p = 0.19). Stop signal (inhibition) efficiency scores were significantly linked with emotional control (β = 0.16, p = 0.04), but not behavioral control (β = 0.13, p = 0.05), attentional control (β = 0.12, p = 0.09), or problem solving (β = 0.13, p = 0.10).
Continuous performance (sustained attention) efficiency scores were not significantly linked to any BASC2EF factors (β = [0.03, 0.07, 0.12, 0.02], p = [0.58, 0.32, 0.12, 0.78]). Tower of Hanoi (planning) efficiency scores were also not significantly linked to any BASC2EF factors (β = [0.04, 0.03, 0.08, 0.09], p = [0.19, 0.74, 0.29, 0.31]).

In contrast, Model 1b (Figure 2) indicates strong links between the overall performance-based latent and each BASC2EF factor: behavioral control (β = 0.48, p = 0.002), attentional control (β = 0.59, p = 0.001), problem solving (β = 0.64, p = 0.002), and emotional control (β = 0.44, p = 0.01).

Sample 2

Model 2a (Figure 2) indicates that many of the links between individual computerized performance-based tasks and BASC3EF factors were not significant. Spatial span (working memory) efficiency scores were significantly linked with behavioral control (β = 0.15, p = 0.03) and emotional control (β = 0.15, p = 0.03), but not with problem solving (β = 0.12, p = 0.09) or attentional control (β = 0.13, p = 0.06). Hungry donkey (executive decision making) efficiency scores had significant inverse links with behavioral control (β = −0.16, p = 0.03), attentional control (β = −0.17, p = 0.01), problem solving (β = −0.18, p = 0.01), and emotional control (β = −0.15, p = 0.04). Figure matching (shifting) efficiency scores were significantly linked with behavioral control (β = 0.15, p = 0.04), attentional control (β = 0.32, p < 0.001), and problem solving (β = 0.30, p < 0.001), but not emotional control (β = 0.04, p = 0.55). Stop signal (inhibition) efficiency scores had significant inverse links with behavioral control (β = −0.15, p = 0.04) and emotional control (β = −0.16, p = 0.03), but not attentional control (β = −0.12, p = 0.11) or problem solving (β = −0.07, p = 0.30). Continuous performance (sustained attention) efficiency scores were not significantly linked to any BASC3EF factors (β = [0.01, −0.05, −0.07, −0.002], p = [0.87, 0.42, 0.18, 0.97]).
The same was true for Tower of Hanoi (planning) efficiency scores (β = [−0.04, −0.005, 0.02, −0.03], p = [0.54, 0.93, 0.77, 0.69]).

In contrast, Model 2b (Figure 2) indicates significant links between the overall performance-based latent and attentional control (β = 0.37, p < 0.001) and problem solving (β = 0.39, p < 0.001), but not emotional control (β = 0.03, p = 0.73) or behavioral control (β = 0.17, p = 0.08).

DISCUSSION

There is a dearth of executive function research with samples from high-poverty, ethnic minority communities. The results of this two-sample study are consistent with other studies using computerized performance-based tasks in more affluent schools (e.g., Ellefson et al., 2020; Xu et al., 2020) and contribute two key findings. First, BASC2EF in its original form is an adequate, but not excellent, measure of everyday executive function behaviors of children from schools in high-poverty communities; restricting analyses to only the items included in BASC3EF, or using BASC3EF itself, is best practice. Second, BASC3EF seems better able to capture the distinct components measured by individual performance-based tasks, whereas BASC2EF captures overall executive functioning better than individual tasks.

BASC3EF is more appropriate for high-poverty, ethnic minority samples

The post-hoc approach used with the adapted BASC2EF indicated that the better fitting models are more aligned with the structure of BASC3EF than with BASC2EF. In addition to removing several items that appear in BASC2EF, BASC3EF has some new items that were not included in earlier versions. The inclusion of new items and the removal of specific BASC2EF items improved the fit for data from these two age-adjusted, high-poverty, ethnic minority samples. This finding suggests that BASC2EF in its original form might be a good fit for some populations, but it is not the best option for high-poverty samples.

The original BASC2EF might not have an excellent fit for a variety of reasons.
One possibility is that the items do not apply equivalently across all groups of children. Another is that the BASC3EF items are more aligned with current understanding of the behaviors related to everyday executive functioning skills. Even though the BASC-3 is the most up-to-date version of the BASC itself, the BASC2EF is the more recently established measure, derived after BASC3EF. Karr and Garcia-Barrera (2017) acknowledge that the existence of BASC3EF could mean that the derived BASC2EF is used less frequently than BASC3EF. They derived the BASC2EF because of the popularity of the BASC-2 and the likely existence of archived datasets that necessitate an embedded, psychometrically supported instrument for the assessment of executive functions.

Even though BASC2EF is a reliable and valid measure of executive function, those psychometric properties are based on a predominantly white, middle-income sample. The current study suggests these findings do not generalize to ethnic minority children in high-poverty contexts. Even though the BASC2EF might be an outdated measure, it will likely continue to appear in archived datasets and longitudinal work, making this line of research on BASC2EF important despite the existence of BASC3EF. Based on our findings, we recommend that analyses of the BASC2EF with archived or longitudinal datasets from high-poverty settings use similar procedures to remove the BASC2EF items not found in BASC3EF. Taken together, these findings have implications for how executive function skills are conceptualized and measured.

BASC3EF is better able to capture distinct elements of performance-based tasks

Everyday executive functioning skills measured by BASC2EF are best linked to an overall latent of the computerized performance-based tasks. This finding corroborates Duckworth and Yeager (2015), suggesting that behavior rating scales are likely capturing behaviors that require a combination of different executive function skills, and so are best represented by a global ability.
It is likely that classroom behaviors draw on multiple executive functions, so some behavior rating scales may not capture the distinctive skills as well as performance-based tasks do. For example, the attentional control subscale could involve working memory, inhibition, and shifting skills; everyday settings do not make it obvious whether those skills are working together or separately.

BASC2EF seems best linked to an overall performance-based task latent, whereas BASC3EF can capture distinctions across performance-based tasks. The changing emphasis of the items in the BASC3EF subscales could account for its ability to capture individual elements of everyday executive functioning better than BASC2EF. The new BASC3EF questions could be capturing more individual executive function elements, allowing the behavioral control subscale to target self-control and distractibility. The attentional control subscale added more items about following directions and keeping focused in BASC3EF. The BASC2EF problem solving subscale contained items about good study routines, cooperating with classmates, staying organized, making decisions, and working in stressful settings that were excluded from BASC3EF, allowing the subscale to target planning and analyzing problems. The BASC2EF emotional control subscale included items about adapting to changing circumstances that were eliminated in BASC3EF, shifting the subscale to target adjustment to stressful situations and propensity for anger.

Results for the models depicting individual executive function tasks (Models 1a and 2a) remain consistent between BASC2EF and BASC3EF for the cognitive flexibility, sustained attention, and planning tasks.
However, there are differences in the working memory, executive decision‐making, and inhibition tasks that could be explained by the changes in the subscales.There are significant positive links between working memory (i.e., spatial span task) and all BASC2EF subscales, but significant links only for the behavioral control and emotional control BASC3EF subscales. The broader scope of BASC2EF subscales could create stronger links with working memory, as each may be capturing multiple executive functions. Looking closely at how the items change between the BASC2EF and BASC3EF, it is possible that the items in the attentional control and problem solving BASC3EF subscales do not capture the behaviors linked to working memory in the same way as BASC2EF. In particular, the problem‐solving subscale excluded items that were not related to planning and problem solving, and the BASC3EF attentional control subscale included more specific items about following directions and maintaining focus. In both instances, the BASC3EF subscales are more targeted, potentially weakening the links between problem solving and attentional control with working memory.There are significant negative links between executive decision‐making (i.e., hungry donkey task) and the BASC2EF problem solving and emotional control subscales and non‐significant links for the BASC2EF behavioral control and attentional control subscales. In contrast, there were significant negative links between executive decision making and all BASC3EF subscales. The removal of BASC2EF items about disruptive and harmful behavior towards other children in the BASC3EF behavioral control subscale and new items about following directions and keeping focused in the BASC3EF attentional control subscale could have elicited stronger negative links to executive decision making.The links between inhibition (i.e., stop signal task) and BASC2EF subscales are positive and significant. 
In contrast, they are negative (or non-significant) for BASC3EF. Looking at the specific items, some new BASC3EF items could capture post-error slowing behaviors. Briefly, post-error slowing is the inclination for participants to slow down on the current trial after committing an error on the previous trial (Rabbitt, 1966); this speed-accuracy trade-off reflects an intent to increase accuracy (e.g., Botvinick et al., 2001). The design of the stop signal task makes children more aware of their errors, and the efficiency metric is more sensitive to post-error slowing than the other performance-based tasks in this study. Students might use a similar strategy in classrooms by reflecting on their mistakes, thinking critically, and controlling their negative emotions when making errors (de Mooij et al., 2022). Several new items in the BASC3EF target these types of behaviors. Specifically, attentional control (following directions and keeping focused) taps into reflecting on mistakes; problem solving (planning and analyzing problems) taps into thinking critically; and emotional control (adjusting to stressful situations and propensity for anger) taps into controlling negative emotions when making errors. The inclusion of new items that emphasize these classroom behaviors may explain both the negative links between the stop signal task and these BASC3EF subscales and why the link between the overall performance-based latent and these subscales is smaller than for BASC2EF.

Variation also exists between the overall executive function models and BASC subscales (Models 1b and 2b). All paths between overall executive function and the adapted BASC2EF subscales show significant links. Paths between overall performance-based executive function and the BASC3EF attentional control and problem solving subscales show significant links, but paths between overall executive function and behavioral control and emotional control are non-significant.
The changed BASC3EF subscale items may have weakened those links, as the target of each subscale became narrower and encompassed fewer classroom behaviors.

Our findings suggest that researchers with archived or longitudinal BASC2EF datasets from high-poverty settings who want to compare BASC2EF responses to performance-based tasks should use a composite score (or single latent variable) of task performance, rather than links between individual tasks and BASC2EF subscales, to reduce Type II errors. Researchers planning new projects with BASC3EF should be able to explore direct performance-based task links or use composite scores. However, they should consider the specific performance-based tasks they are measuring and how each task maps onto the BASC3EF items.

Finally, these findings indicate that teachers are appropriately identifying where children are in terms of executive functioning based on their observations of everyday behaviors. As the field gets better at working with teachers to develop effective ways of incorporating executive function research into classroom practice, it is worth remembering that teachers already have knowledge of everyday executive function behaviors, even if they do not have the same vocabulary that is used in cognitive psychology contexts (Perry et al., 2015). It is important for researchers to work with teachers collaboratively, rather than from the top down, to build upon their existing expertise (Ellefson et al., 2019).
Teachers are already in a good position to implement well-designed curricula centered on building executive function skills, especially when researchers recognize that teachers bring a sophisticated understanding of everyday executive functioning to the table (Faith et al., 2022).

Limitations and future directions

Although these findings make a unique contribution to the field by comparing teacher ratings of everyday behaviors with computerized performance-based tasks, there are some limitations and areas where further research is needed. Future research should test additional theoretical models suited to the complexity of the executive function constructs, investigating whether the current findings generalize to other computerized performance-based executive function tasks and to observer reports of executive function in children completed by teachers and parents alike. Each task in the current battery is only one measure of a particular aspect of executive function skills (e.g., inhibition, working memory). Research designs using additional tasks could test how robust these findings are for different types of performance-based tasks measuring the same executive function skills. If three different performance-based tasks were used for each of the three key aspects of executive function (i.e., inhibition, shifting, working memory) commonly presented in the literature (Miyake et al., 2000; Riggs et al., 2006), then a theoretical evaluation of the unity and diversity of executive function could be tested across performance-based and behavior rating scales. Furthermore, it is important for computerized performance-based tasks to be standardized and for full psychometric evaluations to be conducted, especially with older children.
Such work would enable better comparison with other executive function metrics developed in clinical, developmental, or neuropsychology settings.

It is possible that some differences in the findings between the two samples could be driven by participant demographics. Future work could specifically sample across ethnic minority groups to investigate whether there are any systematic biases in behavior rating scales that might differentially affect children from diverse minoritized backgrounds.

Additionally, nested models were not included in the current study because teacher was not an appropriate predictor and the sample lacked sufficient statistical power. Though much of the existing literature does not include nested models (e.g., McAuley et al., 2010; Miranda et al., 2015; Van Tetering & Jolles, 2017), it would be useful for future research with larger sample sizes to nest teachers within schools to better understand whether there are any response biases, especially when studying children from high-poverty and/or ethnic minority families.

Recent work by Camerota et al. (2020) highlights the need for critical evaluation of the best way to represent executive functions in measurement models. While the current study aligns with a common reliance on confirmatory factor analysis or structural equation models to represent the construct of executive functions (e.g., Miyake & Friedman, 2012; Miyake et al., 2000; Wiebe et al., 2008), Camerota et al. warn against accepting these measurement models as the default without a reasoned approach for selecting them, because interpretations can differ depending on the measurement model chosen. Our BASC3EF results substantiate their advice.
Future research would benefit from the implementation and comparison of different models (e.g., formative latent variable models) that may represent executive functions differently.

CONCLUSIONS

Results from the current study indicate a robust link between the performance-based tasks and both versions of the BASCEF. Children's performance on these computerized performance-based tasks does appear to be linked to the behaviors that teachers report on the BASCEF rating scales. However, BASC2EF in its original form is not a good representation of everyday executive function behaviors of children from schools in high-poverty communities. Given the popularity of the BASC-2, future research using this measure will likely continue to draw on archived datasets and longitudinal work. As such, it is important to recognize that the derived BASC2EF could be useful as an embedded executive function screener for many archived BASC-2 datasets. However, it is not an appropriate screener for data from children from ethnic minority families in low-income communities without removing the items that are not in BASC3EF. When given the choice, BASC3EF is more appropriate for children from high-poverty, ethnic minority populations.

This work opens a window for exploring the links between computerized executive function tasks and observer reports of children within the context of low-income, urban communities. Our findings indicate that it may not be appropriate in some cases to use the same assessments with individuals of various racial/ethnic minority groups and socio-economic backgrounds. This reaffirms the importance of including diverse samples in executive function research and of exploring the utility of the different versions of executive function screeners that continue to be used in longitudinal research or archived datasets.

AUTHOR CONTRIBUTIONS

Sample 1: Ellefson, Serpell, and Parr submitted the initial grant application and study concept.
Ellefson oversaw the development of the Thinking Games website. Data collection was supervised by Ellefson, Serpell, and Parr. Ellefson and Serpell managed and conducted the scoring, data entry, data cleaning, and all other general data management for the computerized tasks and teacher surveys. Sample 2: Zonneveld and Ellefson developed the data collection plan as part of Zonneveld's doctoral research. Zonneveld managed and conducted all data analysis and data entry. Manuscript: Zonneveld and Ellefson wrote the R scripts for data wrangling and analysis and interpreted the results reported here; Zonneveld also reported these results in a thesis submitted as partial fulfillment of a doctoral degree at the University of Cambridge. Zonneveld and Ellefson prepared the open science data files and drafted the manuscript. All authors approved the final draft.

ACKNOWLEDGMENTS

The research reported here for Sample 1 was supported by the Institute of Education Sciences, U.S. Department of Education, through Grant R305A110932 to the University of Cambridge. The opinions expressed are those of the authors and do not represent views of the Institute or the U.S. Department of Education. Special thanks to Geoff Martin for Thinking Games website programming and William Harris for proofreading.
Correspondence about the overall grant/project for Sample 1 should be directed to Michelle Ellefson (mre33@cam.ac.uk). An overall summary of the project deriving Sample 1 is available from https://osf.io/yac8e/.

ADDITIONAL ACKNOWLEDGEMENTS

The data presented for Sample 1 are the result of work by a large group from the University of Cambridge, Virginia State University, Virginia Commonwealth University and Ashley‐Parr, LLC (listed alphabetically by last name): Temitope Adeoye, Mariah Adlawan, Amanda Aldercotte, Annabel Amodia‐Bidakowska, Cortney Anderson, Christopher Ashe, Joseph Beach, Aaron Blount, Alexander Borodin, Lyndani Boucher, Lakendra Butler, Yufei Cai, Tavon Carter, Emma Chatzispyridou, Parul Chaudhary, Laura Clarke, Taelor Clay, Jackson Collins, Brittany Cooper, Aiden Cope, Briana Coardes, Breanna Cohen, Amenah Darab, Moneshia Davis, Shakita Davis, Asha Earle, Mary Elyiace, Nadine Forrester, Sophie Farthing, Pippa Fielding Smith, Aysha Foster, Gill Francis, Kristine Gagarin, Amed Gahelrasoul, Marleny Gaitan, Summer Gamal, Katie Gilligan, Cynthia Gino, Reinaldo Golmia Dante, Zejian Guo, Aditi Gupta, Jennifer Hacker, Shanai Hairston, Khaylynn Hart, Donita Hay, Rachel Heeds, Sonia Ille, Joy Jones, Madhu Karamsetty, Spencer Kearns, Marianne Kiffin, Hyunji Kim, Wendy Lee, Steven Mallis, Dr. Geoff Martin, Tyler Mayes, Alexandria Merritt, Roshni Modhvadia, Dedrick Muhammad, Susana Nicolau, Christian Newson, Seth Ofosu, Esther Obadare, Jwalin Patel, Chloe Pickett, Tanya Pimienta, Connor Quinn, Kelsey Richardson, Michael Randall, Fran Riga, Tennisha Riley, Natalie Robles, Leah Saulter, Kristin Self, Tiera Siedle, Julian Smith, Abi Solomon, Adam Sukonick, Amelia Swafford, Krystal Thomas, Richard Thomas, John Thompson, Tris Thrower, Jr., Quai Travis, Maria Tsapali, Jorge Vargas, Tony Volley, Christopher Walton, Elexis White, Karrie Woodlon, Zhuoer Yang, Shamika Young, Sterling Young, Cheyenne Yucelen, Anne Zonneveld.
These researchers were directed by the PI/Co‐PIs: Michelle Ellefson, Zewelanji Serpell, and Teresa Parr.

CONFLICTS OF INTEREST

The authors have no conflicts of interest to report related to the research in this manuscript.

DATA AVAILABILITY STATEMENT

This manuscript reflects a sub‐set of analyses conducted as part of the first author's doctoral dissertation. Preregistration, data/analyses, and a preprint are openly available for Sample 1 (preregistration: https://osf.io/4hrvj/; materials and data/analyses: https://osf.io/whzrg/) and Sample 2 (preregistration: https://osf.io/8n2e3; data/analyses: https://osf.io/fkw5m/). The manuscript preprint is available from https://psyarxiv.com/myr2s/.

REFERENCES

Abenavoli, R. M., Greenberg, M. T., & Bierman, K. L. (2015). Parent support for learning at school entry: Benefits for aggressive children in high‐risk urban contexts. Early Childhood Research Quarterly, 31, 9–18. https://doi.org/10.1016/j.ecresq.2014.12.003
Allison, P. D. (2012). Handling missing data by maximum likelihood. Statistical Horizons. https://statisticalhorizons.com/wp‐content/uploads/MissingDataByML.pdf
Bailey, C. E. (2007). Cognitive accuracy and intelligent executive function in the brain and in business. Annals of the New York Academy of Sciences, 1118, 122–141. https://doi.org/10.1196/annals.1412.011
Barkley, R. A., & Murphy, K. R. (2010). Impairment in occupational functioning and adult ADHD: The predictive utility of executive function (EF) ratings versus EF tests. Archives of Clinical Neuropsychology, 25, 157–173. https://doi.org/10.1093/arclin/acq014
Bentler, P. M., & Yuan, K.‐H. (1999). Structural equation modeling with small samples: Test statistics. Multivariate Behavioral Research, 34, 181–197. https://doi.org/10.1207/S15327906Mb340203
Bignardi, G., Dalmaijer, E. S., Anwyl‐Irvine, A., & Astle, D. E. (2021).
Collecting big data with small screens: Group tests of children's cognition with touchscreen tablets are reliable and valid. Behavior Research Methods, 53, 1515–1529. https://doi.org/10.3758/s13428‐020‐01503‐3
Blair, C., & Raver, C. C. (2016). Poverty, stress, and brain development: New directions for prevention and intervention. Academic Pediatrics, 16, S30–S36. https://doi.org/10.1016/j.acap.2016.01.010
Botvinick, M. M., Braver, T. S., Barch, D. M., Carter, C. S., & Cohen, J. D. (2001). Conflict monitoring and cognitive control. Psychological Review, 108, 624–652. https://doi.org/10.1037/0033‐295x.108.3.624
Brown, T. A. (2015). Confirmatory factor analysis for applied research (2nd ed.). Guilford.
Burgess, P. W., Alderman, N., Forbes, C., Costello, A., Coates, L. M.‐A., Dawson, D. R., Anderson, N. D., Gilbert, S. J., Dumontheil, I., & Channon, S. (2006). The case for the development and use of “ecologically valid” measures of executive function in experimental and clinical neuropsychology. Journal of the International Neuropsychological Society, 12, 194–209. https://doi.org/10.1017/S1355617706060310
Camerota, M., Willoughby, M. T., & Blair, C. B. (2020). Measurement models for studying child executive functioning: Questioning the status quo. Developmental Psychology, 56, 2236–2245. https://doi.org/10.1037/dev0001127
Camerota, M., Willoughby, M. T., Kuhn, L. J., & Blair, C. B. (2018). The childhood executive functioning inventory (CHEXI): Factor structure, measurement invariance, and correlates in US preschoolers. Child Neuropsychology, 24, 322–337. https://doi.org/10.1080/09297049.2016.1247795
Cheung, G. W., & Rensvold, R. B. (2002). Evaluating goodness‐of‐fit indexes for testing measurement invariance. Structural Equation Modeling, 9, 233–255. https://doi.org/10.1207/S15328007SEM0902_5
Children's Defense Fund (2021). The State of America's children.
Children's Defense Fund. Available from: https://www.childrensdefense.org/wp‐content/uploads/2021/04/The‐State‐of‐Americas‐Children‐2021.pdf
Choi, J. Y., Castle, S., Williamson, A. C., Young, E., Worley, L., Long, M., & Horm, D. M. (2016). Teacher‐child interactions and the development of executive function in preschool‐age children attending head start. Early Education and Development, 27, 751–769. https://doi.org/10.1080/10409289.2016.1129864
Corsi, P. M. (1972). Human memory and the medial temporal region of the brain. Unpublished doctoral dissertation, McGill University.
Crone, E. A., & van der Molen, M. W. (2004). Developmental changes in real life decision making: Performance on a gambling task previously shown to depend on the ventromedial prefrontal cortex. Developmental Neuropsychology, 25, 251–279. https://doi.org/10.1207/s15326942dn2503_2
Davidson, F., Cherry, K., & Corkum, P. (2016). Validating the behavior rating inventory of executive functioning for children with ADHD and their typically developing peers. Applied Neuropsychology: Child, 5, 127–137. https://doi.org/10.1080/21622965.2015.1021957
de Mooij, S., Dumontheil, I., Kirkham, N. Z., Raijmakers, M., & van der Maas, H. (2022). Post‐error slowing: Large scale study in an online learning environment for practising mathematics and language. Developmental Science, 25, e13174. https://doi.org/10.1111/desc.13174
Diamond, A. (2001). A model system for studying the role of dopamine in the prefrontal cortex during early development in humans: Early and continuously treated phenylketonuria. In: C. A. Nelson & M. Luciana (Eds.), Handbook of developmental cognitive neuroscience (pp. 433–472). MIT Press.
Diamond, A. (2013). Executive functions. Annual Review of Psychology, 64, 135–168. https://doi.org/10.1146/annurev‐psych‐113011‐143750
Diamond, A., Barnett, W. S., Thomas, J., & Munro, S. (2007). Preschool program improves cognitive control. Science, 318, 1387–1388. https://doi.org/10.1126/science.1151148
Duckworth, A. L., & Yeager, D. S. (2015). Measurement matters: Assessing personal qualities other than cognitive ability for educational purposes. Educational Researcher, 44, 237–251. https://doi.org/10.3102/0013189x15584327
Ellefson, M. R., Baker, S. T., & Gibson, J. (2019). Lessons for successful cognitive developmental science in educational settings: The case of executive functions. Journal of Cognition and Development, 20, 253–277. https://doi.org/10.1080/15248372.2018.1551219
Ellefson, M. R., Ng, F. F.‐Y., Wang, Q., & Hughes, C. (2017). Efficiency of executive function: A two‐generation cross‐cultural comparison of samples from Hong Kong and the United Kingdom. Psychological Science, 28, 555–566. https://doi.org/10.1177/0956797616687812
Ellefson, M. R., Shapiro, L., & Chater, N. (2006). Asymmetrical switch costs in children. Cognitive Development, 21, 108–130. https://doi.org/10.1016/J.COGDEV.2006.01.002
Ellefson, M. R., Zachariou, A., Ng, F. F.‐Y., Wang, Q., & Hughes, C. (2020). Do executive functions mediate the link between socioeconomic status and numeracy skills? A cross‐site comparison of Hong Kong and the United Kingdom. Journal of Experimental Child Psychology, 194, 104734. https://doi.org/10.1016/j.jecp.2019.104734
Enders, C. K. (2013). Dealing with missing data in developmental research. Child Development Perspectives, 7, 27–31. https://doi.org/10.1111/cdep.12008
Espy, K. A., Kaufmann, P. M., Glisky, M. L., & McDiarmid, M. D. (2001). New procedures to assess executive functions in preschool children. Clinical Neuropsychologist, 15, 46–58. https://doi.org/10.1076/clin.15.1.46.1908
Faith, L., Bush, C.‐A., & Dawson, P. (2022). Executive function skills in the classroom: Overcoming barriers, building strategies. The Guilford Press.
Finders, J. K., McClelland, M. M., Geldhof, G. J., Rothwell, D. W., & Hatfield, B. E. (2021). Explaining achievement gaps in kindergarten and third grade: The role of self‐regulation and executive function skills.
Early Childhood Research Quarterly, 54, 72–85. https://doi.org/10.1016/j.ecresq.2020.07.008
Flores, I., Casaletto, K. B., Marquine, M. J., Umlauf, A., Moore, D. J., Mungas, D., Gershon, R. C., Beaumont, J. L., & Heaton, R. K. (2017). Performance of Hispanics and non‐Hispanic Whites on the NIH Toolbox Cognition Battery: The roles of ethnicity and language backgrounds. The Clinical Neuropsychologist, 31, 783–797. https://doi.org/10.1080/13854046.2016.1276216
Garcia‐Barrera, M., Karr, J., Durán, V., Direnfeld, E., & Pineda, D. (2015). Cross‐cultural validation of a behavioral screener for executive functions: Guidelines for clinical use among Colombian children with and without ADHD. Psychological Assessment, 27, 1349–1363. https://doi.org/10.1037/pas0000117
Garcia‐Barrera, M. A., Kamphaus, R. W., & Bandalos, D. (2011). Theoretical and statistical derivation of a screener for the behavioral assessment of executive functions in children. Psychological Assessment, 23, 64–79. https://doi.org/10.1037/a0021097
Garcia‐Barrera, M. A., Karr, J. E., & Kamphaus, R. W. (2013). Longitudinal applications of a behavioral screener of executive functioning: Assessing factorial invariance and exploring latent growth. Psychological Assessment, 25, 1300–1313. https://doi.org/10.1037/a0034046
Gioia, G. A., Isquith, P. K., Guy, S. C., & Kenworthy, L. (2000). Behavior rating inventory of executive function. Psychological Assessment Resources.
Goldstein, S., & Naglieri, J. A. (Eds.). (2014). Handbook of executive functioning. Springer.
Gross, A. C., Deling, L. A., Wozniak, J. R., & Boys, C. J. (2015). Objective measures of executive functioning are highly discrepant with parent‐report in fetal alcohol spectrum disorders. Child Neuropsychology, 21, 531–538. https://doi.org/10.1080/09297049.2014.911271
Gutierrez, M., Filippetti, V. A., & Lemos, V. (2021).
The childhood executive functioning inventory (CHEXI) parent and teacher form: Factor structure and cognitive correlates in Spanish‐speaking children from Argentina. Developmental Neuropsychology, 46, 136–148. https://doi.org/10.1080/87565641.2021.1878175
Hooper, D., Coughlan, J., & Mullen, M. R. (2008). Structural equation modelling: Guidelines for determining model fit. Electronic Journal of Business Research Methods, 6, 53–60. https://doi.org/10.21427/D7CF7R
Hosenbocus, S., & Chahal, R. (2012). A review of executive function deficits and pharmacological management in children and adolescents. Journal of the Canadian Academy of Child and Adolescent Psychiatry, 21, 223–229.
Isquith, P. K., Gioia, G. A., & Espy, K. A. (2004). Executive function in preschool children: Examination through everyday behavior. Developmental Neuropsychology, 26, 403–422. https://doi.org/10.1207/s15326942dn2601_3
Isquith, P. K., Roth, R. M., & Gioia, G. (2013). Contribution of rating scales to the assessment of executive functions. Applied Neuropsychology: Child, 2(2), 125–132. https://doi.org/10.1080/21622965.2013.748389
Jacob, R., & Parkinson, J. (2015). The potential for school‐based interventions that target executive function to improve academic achievement: A review. Review of Educational Research, 85, 512–552. https://doi.org/10.3102/0034654314561338
Jorgensen, T. D., Pornprasertmanit, S., Schoemann, A. M., & Rosseel, Y. (2021). semTools: Useful tools for structural equation modeling. R package version 0.5‐4. https://CRAN.R‐project.org/package=semTools
Karr, J. E., & Garcia‐Barrera, M. A. (2017). The assessment of executive functions using the BASC‐2. Psychological Assessment, 29, 1182–1187. https://doi.org/10.1037/pas0000418
Kouklari, E.‐C., Tsermentseli, S., & Monks, C. P. (2018). Everyday executive function and adaptive skills in children and adolescents with autism spectrum disorder: Cross‐sectional developmental trajectories. Autism & Developmental Language Impairments, 3, 2396941518800775.
https://doi.org/10.1177/2396941518800775
Lawson, G. M., Duda, J. T., Avants, B. B., Wu, J., & Farah, M. J. (2013). Associations between children's socioeconomic status and prefrontal cortical thickness. Developmental Science, 16, 641–652. https://doi.org/10.1111/desc.12096
Lawson, G. M., & Farah, M. J. (2017). Executive function as a mediator between SES and academic achievement throughout childhood. International Journal of Behavioral Development, 41, 94–104. https://doi.org/10.1177/0165025415603489
Lawson, G. M., Hook, C. J., & Farah, M. J. (2018). A meta‐analysis of the relationship between socioeconomic status and executive function performance among children. Developmental Science, 21(2), e12529. https://doi.org/10.1111/desc.12529
Lee, K., Bull, R., & Ho, M.‐H. (2013). Developmental changes in executive functioning. Child Development, 84, 1933–1953. https://doi.org/10.1111/cdev.12096
Litkowski, E. C., Finders, J. K., Borriello, G. A., Purpura, D. J., & Schmitt, S. A. (2020). Patterns of heterogeneity in kindergarten children's executive function: Profile associations with third grade achievement. Learning and Individual Differences, 80, 101846. https://doi.org/10.1016/j.lindif.2020.101846
Logan, G. D. (1994). On the ability to inhibit thought and action: A users’ guide to the stop signal paradigm. In: D. Dagenbach & T. H. Carr (Eds.), Inhibitory processes in attention, memory, and language (pp. 189–239). Academic Press.
Luciana, M., & Nelson, C. A. (2002). Assessment of neuropsychological function through use of the Cambridge Neuropsychological Testing Automated Battery: Performance in 4‐ to 12‐year‐old children. Developmental Neuropsychology, 22, 595–624. https://doi.org/10.1207/S15326942DN2203_3
Magimairaj, B. M. (2018). Parent‐rating vs performance‐based working memory measures: Association with spoken language measures in school‐age children. Journal of Communication Disorders, 76, 60–70.
https://doi.org/10.1016/j.jcomdis.2018.09.001
McAuley, T., Chen, S., Goos, L., Schachar, R., & Crosbie, J. (2010). Is the behavior rating inventory of executive function more strongly associated with measures of impairment or executive function? Journal of the International Neuropsychological Society, 16, 495–505. https://doi.org/10.1017/S1355617710000093
McCoy, D. C. (2019). Measuring young children's executive function and self‐regulation in classrooms and other real‐world settings. Clinical Child and Family Psychology Review, 22, 63–74. https://doi.org/10.1007/s10567‐019‐00285‐1
Miller‐Cotto, D., Smith, L. V., Wang, A. H., & Ribner, A. D. (2021). Changing the conversation: A culturally responsive perspective on executive functions, minoritized children and their families. Infant and Child Development, 31(1), e2286. https://doi.org/10.1002/icd.2286
Miranda, A., Colomer, C., Mercader, J., Fernández, M. I., & Presentación, M. J. (2015). Performance‐based tests versus behavioral ratings in the assessment of executive functioning in preschoolers: Associations with ADHD symptoms and reading achievement. Frontiers in Psychology, 6, 545. https://doi.org/10.3389/fpsyg.2015.00545
Miyake, A., & Friedman, N. P. (2012). The nature and organization of individual differences in executive functions: Four general conclusions. Current Directions in Psychological Science, 21, 8–14. https://doi.org/10.1177/0963721411429458
Miyake, A., Friedman, N. P., Emerson, M. J., Witzki, A. H., Howerter, A., & Wager, T. D. (2000). The unity and diversity of executive functions and their contributions to complex “frontal lobe” tasks: A latent variable analysis. Cognitive Psychology, 41, 49–100. https://doi.org/10.1006/cogp.1999.0734
Monette, S., Bigras, M., & Guay, M.‐C. (2011). The role of the executive functions in school achievement at the end of grade 1. Journal of Experimental Child Psychology, 109, 158–173.
https://doi.org/10.1016/j.jecp.2011.01.008
Müller, U., Miller, M., Hutchison, S., & Eycke, K. T. (2017). Transition to school: Executive function, emergent academic skills, and early school achievement. In: M. J. Hoskyn, G. Iarocci, & A. R. Young (Eds.), Executive functions in children's everyday lives: A handbook for professionals in applied psychology (pp. 88–107). Oxford University Press. https://doi.org/10.1093/acprof:oso/9780199980864.003.0007
Nesbitt, K. T., Baker‐Ward, L., & Willoughby, M. T. (2013). Executive function mediates socio‐economic and racial differences in early academic achievement. Early Childhood Research Quarterly, 28(4), 774–783. https://doi.org/10.1016/j.ecresq.2013.07.005
Nilsen, E. S., Huyder, V., McAuley, T., & Liebermann, D. (2017). Ratings of everyday executive functioning (REEF): A parent‐report measure of preschoolers’ executive functioning skills. Psychological Assessment, 29, 50–64. https://doi.org/10.1037/pas0000308
Parsey, C. M., & Schmitter‐Edgecombe, M. (2013). Applications of technology in neuropsychological assessment. The Clinical Neuropsychologist, 27, 1328–1361. https://doi.org/10.1080/13854046.2013.834971
Patel, J., Aldercotte, A., Tsapali, M., Serpell, Z. N., Parr, T., & Ellefson, M. R. (2021). The Zoo Task: A novel metacognitive problem‐solving task developed with a sample of African American children from schools in high poverty communities. Psychological Assessment, 33(8), 795–802. https://doi.org/10.1037/pas0001033
Perry, N. E., Brenner, C. A., & Fusaro, N. (2015). Closing the gap between theory and practice in self‐regulated learning: Teacher learning teams as a framework for enhancing self‐regulated teaching and learning. In: T. J. Cleary (Ed.), Self‐regulated learning interventions with at risk populations: Academic, mental health, and contextual considerations (pp. 229–250). American Psychological Association.
R Core Team (2019). R: A language and environment for statistical computing.
R Foundation for Statistical Computing. https://www.R‐project.org/
Rabbitt, P. M. (1966). Errors and error correction in choice‐response tasks. Journal of Experimental Psychology, 71(2), 264–272. https://doi.org/10.1037/h0022853
Revelle, W. (2021). psych: Procedures for psychological, psychometric, and personality research. R package version 2.1.6. https://CRAN.R‐project.org/package=psych
Reynolds, C. R., & Kamphaus, R. W. (1992). BASC: Behavior assessment system for children. American Guidance Service.
Reynolds, C. R., & Kamphaus, R. W. (2004). BASC‐2: Behavior assessment system for children. Pearson.
Reynolds, C. R., & Kamphaus, R. W. (2015). BASC‐3: Behavior assessment system for children. Pearson.
Riggs, N. R., Jahromi, L. B., Razza, R. P., Dillworth‐Bart, J. E., & Mueller, U. (2006). Executive function and the promotion of social‐emotional competence. Journal of Applied Developmental Psychology, 27, 300–309. https://doi.org/10.1016/j.appdev.2006.04.002
Rogers, R., & Monsell, S. (1995). Costs of a predictable switch between simple cognitive tasks. Journal of Experimental Psychology: General, 124, 207–231. https://doi.org/10.1037/0096‐3445.124.2.207
Rosseel, Y. (2021). lavaan: An R package for structural equation modeling. Journal of Statistical Software, 48, 1–36. http://www.jstatsoft.org/v48/i02/
Rosvold, H. E., Mirsky, A. F., Sarason, I., Bransome, E. D. Jr., & Beck, L. H. (1956). A continuous performance test of brain damage. Journal of Consulting Psychology, 20, 343–350. https://doi.org/10.1037/h0043220
Samuels, W. E., Tournaki, N., Blackman, S., & Zilinski, C. (2016). Executive functioning predicts academic achievement in middle school: A four‐year longitudinal study. The Journal of Educational Research, 109, 478–490. https://doi.org/10.1080/00220671.2014.979913
Schumacker, R. E., & Lomax, R. G. (2016). A beginner's guide to structural equation modeling (4th ed.). Routledge.
Servera, M., & Cardo, C. (2006).
Children sustained attention task (CSAT): Normative, reliability, and validity data. International Journal of Clinical and Health Psychology, 6, 697–707.
Sherman, E. M. S., & Brooks, B. L. (2010). Behavior Rating Inventory of Executive Function – Preschool Version (BRIEF‐P): Test review and clinical guidelines for use. Child Neuropsychology, 16, 503–519. https://doi.org/10.1080/09297041003679344
Soto, E. F., Kofler, M. J., Singh, L. J., Wells, E. L., Irwin, L. N., Groves, N. B., & Miller, C. E. (2020). Executive functioning rating scales: Ecologically valid or construct invalid? Neuropsychology, 34, 605–619. https://doi.org/10.1037/neu0000681
Sulik, M. J., Huerta, S., Zerr, A. A., Eisenberg, N., Spinrad, T. L., Valiente, C., Di Giunta, L., Pina, A. A., Eggum, N. D., Sallquist, J., Edwards, A., Kupfer, A., Lonigan, C. J., Phillips, B. M., Wilson, S. B., Clancy‐Menchetti, J., Landry, S. H., Swank, P. R., Assel, M. A., & Taylor, H. B. (2010). The factor structure of effortful control and measurement invariance across ethnicity and sex in a high‐risk sample. Journal of Psychopathology and Behavioral Assessment, 32, 8–22. https://doi.org/10.1007/s10862‐009‐9164‐y
Thorell, L. B., & Nyberg, L. (2008). The childhood executive functioning inventory (CHEXI): A new rating instrument for parents and teachers. Developmental Neuropsychology, 33, 536–552. https://doi.org/10.1080/87565640802101516
Tong, X., Zhang, Z., & Yuan, K.‐H. (2014). Evaluation of test statistics for robust structural equation modeling with nonnormal missing data. Structural Equation Modeling: A Multidisciplinary Journal, 21, 553–565. https://doi.org/10.1080/10705511.2014.919820
Toplak, M. E., West, R. F., & Stanovich, K. E. (2013). Practitioner review: Do performance‐based measures and ratings of executive function assess the same construct? Journal of Child Psychology and Psychiatry, 54, 131–143. https://doi.org/10.1111/jcpp.12001
van Tetering, M. A. J., & Jolles, J. (2017).
Teacher evaluations of executive functioning in schoolchildren aged 9–12 and the influence of age, sex, level of parental education. Frontiers in Psychology, 8, 481. https://doi.org/10.3389/fpsyg.2017.00481
Weintraub, S., Dikmen, S. S., Heaton, R. K., Tulsky, D. S., Zelazo, P. D., Bauer, P. J., Carlozzi, N. E., Slotkin, J., Blitz, D., Wallner‐Allen, K., Fox, N. A., Beaumont, J. L., Mungas, D., Nowinski, C. J., Richler, J., Deocampo, J. A., Anderson, J. E., Manly, J. J., Borosh, B., …, & Gershon, R. C. (2013). Cognition assessment using the NIH Toolbox. Neurology, 80, S54–S64. https://doi.org/10.1212/WNL.0b013e3182872ded
Welsh, M. C. (1991). Rule‐guided behavior and self‐monitoring on the Tower of Hanoi disk‐transfer task. Cognitive Development, 6, 59–76. https://doi.org/10.1016/0885‐2014(91)90006‐Y
Wiebe, S. A., Espy, K. A., & Charak, D. (2008). Using confirmatory factor analysis to understand executive control in preschool children: I. Latent structure. Developmental Psychology, 44, 575–587. https://doi.org/10.1037/0012‐1649.44.2.575
Xu, C., Ellefson, M. R., Ng, F., Wang, Q., & Hughes, C. (2020). An East‐West contrast in executive function: Measurement invariance of computerized tasks in school‐aged children and adolescents. Journal of Experimental Child Psychology, 199, 104929. https://doi.org/10.1016/j.jecp.2020.104929
Yuan, K.‐H., & Bentler, P. M. (1998). Normal theory based test statistics in structural equation modelling. British Journal of Mathematical and Statistical Psychology, 51, 289–309. https://doi.org/10.1111/j.2044‐8317.1998.tb00682.x
Zelazo, P. (2006). The Dimensional Change Card Sort (DCCS): A method of assessing executive function in children. Nature Protocols, 1, 297–301. https://doi.org/10.1038/nprot.2006.46
Zelazo, P. D., Anderson, J. E., Richler, J., Wallner‐Allen, K., Beaumont, J. L., & Weintraub, S. (2013). II. NIH Toolbox Cognition Battery: Measuring executive function and attention.
Monographs of the Society for Research in Child Development, 78, 16–33. https://doi.org/10.1111/mono.12032

Journal: Developmental Science (Wiley)

Published: Nov 1, 2022

Keywords: behavior rating scales; ethnic minority; executive functions; late childhood; socioeconomic status; teacher rating scales
