Observer variation in chest radiography of acute lower respiratory infections in children: a systematic review

BMC Medical Imaging 2001, 1:1 — http://www.biomedcentral.com/1471-2342/1/1

Abstract

Background: Knowledge of the accuracy of chest radiograph findings in acute lower respiratory infection in children is important when making clinical decisions.

Methods: I conducted a systematic review of agreement between and within observers in the detection of radiographic features of acute lower respiratory infections in children, and described the quality of the design and reporting of studies, whether included in or excluded from the review. Included studies were those of observer variation in the interpretation of radiographic features of lower respiratory infection in children (neonatal nurseries excluded) in which radiographs were read independently and a clinical population was studied. I searched the MEDLINE, HealthSTAR and HSRPROJ databases (1966 to 1999), handsearched the reference lists of identified papers and contacted the authors of identified studies. I performed the data extraction alone.

Results: Ten studies of observer interpretation of radiographic features of lower respiratory infection in children were identified. Seven of the studies satisfied four or more of the seven design and reporting criteria. Six studies met the inclusion criteria for the review. Inter-observer agreement varied with the radiographic feature examined: kappa statistics ranged from around 0.80 for individual radiographic features to 0.27–0.38 for bacterial vs. viral etiology.

Conclusions: Little information was identified on observer agreement on radiographic features of lower respiratory tract infections in children. Agreement varied with the features assessed, from "fair" to "very good". Aspects of the quality of the methods and reporting need attention in future studies, particularly the description of criteria for radiographic features.

Background

Chest radiography is a very common investigation in children with lower respiratory infection, and knowledge of the diagnostic accuracy of radiograph interpretation is consequently important when basing clinical decisions on the findings. Inter- and intra-observer agreement in the interpretation of the radiographs are necessary components of diagnostic accuracy. Observer agreement is, however, not sufficient for diagnostic accuracy. The key element of such accuracy is the concordance of the radiological interpretation with the presence or absence of pneumonia. Unfortunately there is seldom a suitable available reference standard for pneumonia (such as histological or gross anatomical findings) against which to compare radiographic findings. Diagnostic accuracy thus needs to be examined indirectly, including by assessing observer agreement.

Observer variation in chest radiograph interpretation in acute lower respiratory infections in children has not been systematically reviewed. The purpose of this study was to quantify the agreement between and within observers in the detection of radiographic features associated with acute lower respiratory infections in children. A secondary objective was to assess the quality of the design and reporting of studies of this topic, whether or not the studies met the quality inclusion criteria for the review.

Methods

Inclusion criteria

Studies meeting the following criteria were included in the systematic review:

1. An assessment of observer variation in the interpretation of radiographic features of lower respiratory infection, or of the radiographic diagnosis of pneumonia.

2. Studies of children aged 15 years or younger, or studies from which data on children 15 years or younger could be extracted. Studies of infants in neonatal nurseries were excluded.

3. Data presented that enabled the assessment of agreement between observers.

4. Independent reading of radiographs by two or more observers.

5. Study of a clinical population with a spectrum of disease in which radiographic assessment is likely to be used (as opposed to separate groups of normal children and children known to have the condition of interest).

Literature search

Studies were identified by a computerized search of MEDLINE from 1966 to 1999 using the following search terms: observer variation, or intraobserver (text word), or interobserver (text word); and radiography, thoracic, or radiography, or bronchiolitis/ra, or pneumonia, viral/ra, or pneumonia, bacterial/ra, or respiratory tract infections/ra. The search was limited to human studies of children up to the age of 18 years. The author reviewed the titles and abstracts of the identified articles in English or with English abstracts (and the full text of those judged to be potentially eligible). A similar search was performed of HealthSTAR, a former on-line database of published health services research, and of the HSRPROJ (Health Services Research Projects in Progress) database. The reference lists of articles retrieved from the above searches were examined, and the authors of studies of agreement between independent observers on chest radiograph findings in acute lower respiratory infections in children were contacted with an inquiry about the existence of additional studies, published or unpublished.

Data collection and analysis

The author evaluated for inclusion the potentially relevant studies identified in the above search. The characteristics of study design and reporting listed in Table 1 were recorded for all studies of observer variation in the interpretation of radiographic features of lower respiratory infection in children aged 15 years or younger (except infants in neonatal nurseries). The criteria for validity were those for which empirical evidence exists of their importance in the avoidance of bias in comparisons of diagnostic tests with reference standards, and which were relevant to tests of observer agreement. The selected criteria for applicability were those featured by at least two of five sources of such recommendations. No weighting was applied to the criteria, except the use of the two most frequently recommended validity criteria (recommended by at least four of the five sources) as the methodological inclusion criteria [1–5]. No attempt was made to derive a quality score.

In studies meeting all the inclusion criteria for the review, the author extracted the following additional information: the number and characteristics of the observers and children studied, and measures of agreement. When no measures of agreement were reported, data were extracted from the reports and kappa statistics were calculated using the method described by Fleiss [6]. Kappa is a measure of the degree of agreement between observations over and above that expected by chance: if agreement is complete, kappa = 1; if there is only chance concordance, kappa = 0.

Table 1: Characteristics of study design and reporting

| Characteristic | Present (a) | Absent (b) | Unclear (c) |
|---|---|---|---|
| Validity eligibility criteria | | | |
| Independent assessment of radiographs | 9 | 1 | 0 |
| Relevant clinical population (not case-control design) | 7 | 3 | 0 |
| Other validity characteristics | | | |
| Description of study population (3 of: age, M:F ratio, clinical features and eligibility criteria) | 6 | 4 | 0 |
| Description of criteria for radiological signs | 4 | 6 | 0 |
| Presentation of indeterminate results | 7 | 2 | 1 |
| Applicability | | | |
| Meaningful measures of agreement (kappa or equivalent) | 8 | 2 | 0 |
| Confidence intervals for measures of agreement | 1 | 9 | 0 |
| Assessment of intra-observer variability | 3 | 7 | 0 |

(a) Study characteristic present, according to the research report. (b) Study characteristic absent, according to the research report. (c) Insufficient information to determine whether the characteristic was present.

Table 2: Characteristics of included studies

| Author | Subjects | Observers |
|---|---|---|
| Simpson et al 1974 [14] | 330 children under 14 years hospitalized with acute lower respiratory infection | 2 radiologists |
| McCarthy et al 1981 [15] | 128 of 1566 children seen in a pediatric emergency room with a pulmonary infiltrate on chest radiography (as judged by the duty radiologist) | 2 radiologists |
| Crain et al 1991 [9] | 230 of 242 febrile infants under 8 weeks evaluated in an emergency room who received a chest radiograph | 2 radiologists |
| Kramer et al 1992 [12] | 287 unreferred febrile children, aged 3–24 months, in an emergency unit | 1 pediatrician, 1 duty radiologist, 1 "blind" pediatric radiologist |
| Davies et al 1996 [10] | 40 children under 6 months (25 with pneumonia, 15 with bronchiolitis) admitted to a tertiary care pediatric hospital | 3 pediatric radiologists |
| Coakley et al 1996 [8] | 113 previously well children under 3 years hospitalized with acute respiratory infections and no focal abnormality on radiography | 2 radiologists |

Table 3: Observer agreement: kappa statistics (95% confidence intervals). Contributing studies: Davies 1996 [10], Simpson 1974 [14], Coakley 1996 [8], Kramer 1992 [12], Crain 1991 [9] and McCarthy 1981 [15]; where a feature was examined by more than one study, the kappas are separated by semicolons.

| Radiographic feature | Kappa (95% CI) |
|---|---|
| Inter-observer variation | |
| Consolidation | 0.79 |
| Pneumonia | 0.46 (0.34–0.58); 0.47 (0.35–0.60) |
| Collapse/consolidation | 0.83 (0.72–0.94) |
| Collapse/atelectasis | 0.78 |
| Hyperinflation/air trapping | 0.83; 0.78 (0.67–0.89) |
| Peribronchial/bronchial wall thickening | 0.55; 0.55 (0.44–0.66); 0.43 (0.25–0.61) |
| Perihilar linear opacities | 0.82 |
| Abnormal | 0.61 (0.48–0.74) |
| Bacterial vs. viral etiology (two observer pairs) | 0.27–0.38 |
| Intra-observer variation | |
| Consolidation | 0.91 |
| Collapse/atelectasis | 0.86 |
| Hyperinflation/air trapping | 0.85 |
| Peribronchial/bronchial wall thickening | 0.76 |
| Perihilar linear opacities | 0.87 |

(a) Kappa calculated from data extracted from the report. (b) Average weighted kappa.
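Chance-corrected agreement as defined above can be computed directly from a cross-tabulation of two observers' readings. The sketch below is illustrative only: it is not the author's code, the 2×2 counts are hypothetical, and it uses a simple large-sample standard error for the confidence interval rather than the exact variance formula given by Fleiss [6].

```python
from math import sqrt

def cohen_kappa(table):
    """Cohen's kappa for a square inter-observer agreement table.

    table[i][j] = number of radiographs rated category i by observer 1
    and category j by observer 2. Returns (kappa, approximate 95% CI).
    """
    n = sum(sum(row) for row in table)
    k = len(table)
    row_tot = [sum(table[i]) for i in range(k)]
    col_tot = [sum(table[i][j] for i in range(k)) for j in range(k)]
    p_o = sum(table[i][i] for i in range(k)) / n                  # observed agreement
    p_e = sum(row_tot[i] * col_tot[i] for i in range(k)) / n**2   # agreement expected by chance
    kappa = (p_o - p_e) / (1 - p_e)
    # Simple large-sample standard error; an approximation to the
    # exact variance formula of Fleiss [6].
    se = sqrt(p_o * (1 - p_o) / (n * (1 - p_e) ** 2))
    return kappa, (kappa - 1.96 * se, kappa + 1.96 * se)

# Hypothetical example: two radiologists reading 100 films for consolidation
# (rows: observer 1 present/absent; columns: observer 2 present/absent).
kappa, ci = cohen_kappa([[40, 5], [8, 47]])
```

For this hypothetical table the observers agree on 87 of 100 films, but about half that agreement is expected by chance, giving a kappa of about 0.74 (95% CI roughly 0.61 to 0.87) — "good" agreement on the scale cited in the Discussion [17].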
Results

A review profile is shown in Figure 1. For a list of rejected studies, with reasons for rejection, see Additional file 1: Rejected studies. Ten studies of observer variation in the interpretation of radiographic features of lower respiratory infection in children aged 15 years or younger were identified [7–16]. Contact was established with five of the nine authors for whom it was attempted; no additional studies were included in the systematic review as a result of this contact.

(Figure 1: Review profile.)

The characteristics of the study design and reporting of the 10 studies of observer interpretation of radiographic features of lower respiratory infection in children are summarized in Table 1. Seven of the studies satisfied four or more of the seven design and reporting criteria. Four studies described criteria for the radiological signs. Six of the studies satisfied the inclusion criteria for the systematic review [8–10,12,14,15]. Of the remaining four studies, three were excluded because a clinical spectrum of patients had not been used [7,13,16] and one because the observers were not independent [11]. The characteristics of the included studies are shown in Table 2.

A kappa statistic was calculated from data extracted from one report [14], and confidence intervals were calculated for three studies in which they were not reported but for which sufficient data were available in the report [8,9,14]. A summary of kappa statistics is shown in Table 3. Inter-observer agreement varied with the radiographic feature examined. Kappas for individual radiographic features were around 0.80, and lower for composite assessments such as the presence of pneumonia (0.47), radiographic normality (0.61) and bacterial vs. viral etiology (0.27–0.38). Findings were similar in the two instances in which more than one study examined the same radiographic feature (hyperinflation/air trapping and peribronchial/bronchial wall thickening). When reported, kappa statistics for intra-observer agreement were 0.10–0.20 higher than those for inter-observer agreement.

Discussion

The quality of the methods and reporting of the studies was not consistently high. Only six of the 10 studies satisfied the inclusion criteria for the review. The absence of any of the validity criteria used in this study (independent reading of radiographs, the use of a clinical population with an appropriate spectrum of disease, and description of the study population and of the criteria for a test result) has been found empirically to overestimate test accuracy, on average, when a test is compared with a reference standard [1]. A similar effect may apply to the estimation of inter-observer agreement: two observers may agree with each other more often when aware of each other's assessment, and radiographs drawn from separate populations of normal and known affected children will exclude many of the equivocal radiographs found in a usual clinical population, thereby possibly falsely increasing agreement. Only four of the ten studies described criteria for the radiological signs, with potential negative implications for both the validity and the applicability of the remaining studies.

The data from the included studies suggest a pattern of kappas in the region of 0.80 for individual radiographic features and 0.30–0.60 for composite assessments of features. A kappa of 0.80 (i.e. 80% agreement after adjustment for chance) is regarded as "good" or "very good", and 0.30–0.60 as "fair" to "moderate" [17]. The small number of studies in this review, however, makes the detection and interpretation of patterns merely speculative. Only two radiographic features were examined by more than one study. There is thus insufficient information to comment on heterogeneity of observer variation in different clinical settings.

The range of kappas overall is similar to that found by other authors for a range of radiographic diagnoses. However, "good" and "very good" agreement does not necessarily imply high validity (closeness to the truth): observer agreement is necessary for validity, but observers may agree and nevertheless both be wrong.

Conclusions

Little information was identified on inter-observer agreement in the assessment of radiographic features of lower respiratory tract infections in children. Where available, it varied from "fair" to "very good" according to the features assessed. Insufficient information was identified to assess heterogeneity of agreement in different clinical settings.

Aspects of the quality of methods and reporting that need attention in future studies are independent assessment of radiographs, the study of a usual clinical population of patients and description of that population, description of the criteria for radiographic features, assessment of intra-observer variation, and reporting of confidence intervals around estimates of agreement. Specific description of criteria for radiographic features is particularly important, not only because of its association with study validity but also to enable comparison between studies and application in clinical practice.

Competing interests

None declared.

Acknowledgements

Financial support from the University of Cape Town and the Medical Research Council of South Africa is acknowledged.

Additional material

Additional file 1: Rejected studies. Studies excluded during the literature search and study selection, listed according to reason for exclusion. [http://www.biomedcentral.com/content/supplementary/1471-2342-1-1-S1.doc]

References

1. Lijmer JG, Mol BW, Heisterkamp S, Bonsel GJ, Prins MH, van der Meulen JH, Bossuyt PM: Empirical evidence of design-related bias in studies of diagnostic tests. JAMA 1999, 282:1061-1066
2. Jaeschke R, Guyatt G, Sackett DL, for the Evidence-Based Medicine Working Group: Users' guides to the medical literature. III. How to use an article about a diagnostic test. A. Are the results of the study valid? JAMA 1994, 271:389-391
3. Greenhalgh T: How to read a paper. Papers that report diagnostic or screening tests. BMJ 1997, 315:540-543
4. Reid MC, Lachs MS, Feinstein AR: Use of methodological standards in diagnostic test research. Getting better but still not good. JAMA 1995, 274:645-651
5. Cochrane Methods Working Group on Systematic Reviews of Screening and Diagnostic Tests: Recommended methods, 6 June 1996 [http://wwwsom.fmc.flinders.edu.au/FUSA/COCHRANE/cochrane/sadtdoc1.htm]
6. Fleiss JL: Statistical methods for rates and proportions, 2nd edn. New York, John Wiley & Sons 1981, 212-225
7. Coblentz CL, Babcook CJ, Alton D, Riley BJ, Norman G: Observer variation in detecting the radiographic features associated with bronchiolitis. Invest Radiol 1991, 26:115-118
8. Coakley FV, Green J, Lamont AC, Rickett AB: An investigation into perihilar inflammatory change on the chest radiographs of children admitted with acute respiratory symptoms. Clin Radiol 1996, 51:614-617
9. Crain EF, Bulas D, Bijur PE, Goldman HS: Is a chest radiograph necessary in the evaluation of every febrile infant less than 8 weeks of age? Pediatrics 1991, 88:821-824
10. Davies HD, Wang EE, Manson D, Babyn P, Shuckett B: Reliability of the chest radiograph in the diagnosis of lower respiratory infections in young children. Pediatr Infect Dis J 1996, 15:600-604
11. Kiekara O, Korppi M, Tanska S, Soimakallio S: Radiographic diagnosis of pneumonia in children. Ann Med 1996, 28:69-72
12. Kramer MM, Roberts-Brauer R, Williams RL: Bias and "overcall" in interpreting chest radiographs in young febrile children. Pediatrics 1992, 90:11-13
13. Norman GR, Brooks LR, Coblentz CL, Babcook CJ: The correlation of feature identification and category judgments in diagnostic radiology. Mem Cognit 1992, 20:344-355
14. Simpson W, Hacking PM, Court SDM, Gardner PS: The radiographic findings in respiratory syncytial virus infection in children. Part I. Definitions and interobserver variation in assessment of abnormalities on the chest x-ray. Pediatr Radiol 1974, 2:155-160
15. McCarthy PL, Spiesel SZ, Stashwick CA, Ablow RC, Masters SJ, Dolan TF: Radiographic findings and etiologic diagnosis in ambulatory childhood pneumonias. Clin Pediatr (Phila) 1981, 20:686-691
16. Stickler GB, Hoffman AD, Taylor WF: Problems in the clinical and roentgenographic diagnosis of pneumonia in young children. Clin Pediatr (Phila) 1984, 23:398-399
17. Altman DG: Practical statistics for medical research. London, Chapman & Hall 1991, 404


BMC Medical Imaging, Volume 1 (1) – Nov 12, 2001

Publisher: Springer Journals
Copyright: © 2001 Swingler; licensee BioMed Central Ltd.
Subject: Medicine & Public Health; Imaging / Radiology
eISSN: 1471-2342
DOI: 10.1186/1471-2342-1-1

McCarthy PL, Spiesel SZ, Stashwick CA, Ablow RC, Masters SJ, Dolan is particularly important, not only because of its associa- TF: Radiographic findings and etiologic diagnosis in ambula- tion with study validity but also to enable comparison be- tory childhood pneumonias. Clin Pediatr (Phila) 1981, 20:686-691 16. Stickler GB, Hoffman AD, Taylor WF: Problems in the clinical tween studies and application in clinical practice. and roentgenographic diagnosis of pneumonia in young chil- dren. Clin Pediatr (Phila) 1984, 23:398-399 17. Altman DG: Practical statistics for medical research. London, Competing interests Chapman & Hall 1991404 None declared Additional material Additional file 1 Studies excluded during the literature search and study selection, listed accord- Publish with BioMed Central and every ing to reason for exclusion. Click here for file scientist can read your work free of charge [http://www.biomedcentral.com/content/supplementary/1471-2342-1- "BioMedcentral will be the most significant development for 1-S1.doc] disseminating the results of biomedical research in our lifetime." Paul Nurse, Director-General, Imperial Cancer Research Fund Publish with BMC and your research papers will be: Acknowledgements available free of charge to the entire biomedical community Financial support from the University of Cape Town and the Medical Re- peer reviewed and published immediately upon acceptance search Council of South Africa is acknowledged. cited in PubMed and archived on PubMed Central References yours - you keep the copyright 1. Lijmer JG, Mol BW, Heisterkamp S, Bonsel GJ, Prins MH, van der BioMedcentral.com Submit your manuscript here: Meulen JH, Bossuyt PM: Empirical evidence of design-related http://www.biomedcentral.com/manuscript/ editorial@biomedcentral.com bias in studies of diagnostic tests. JAMA 1999, 282:1061-1066
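As an illustration of the statistic used throughout the review: a minimal Python sketch of Cohen's kappa for two observers rating the same films on a binary feature, with an approximate confidence interval from the simple large-sample standard error given by Altman [17]. This is not the original study's code; the function name and the example counts are hypothetical.

```python
import math

def kappa_with_ci(table, z=1.96):
    """Cohen's kappa for a 2x2 agreement table [[a, b], [c, d]],
    where rows are observer 1's ratings (present/absent) and
    columns are observer 2's. Returns (kappa, ci_low, ci_high)
    using the large-sample standard error
    se = sqrt(po * (1 - po) / (n * (1 - pe)**2))  (Altman 1991)."""
    (a, b), (c, d) = table
    n = a + b + c + d
    po = (a + d) / n  # observed proportion of agreement
    # chance-expected agreement from the marginal totals
    pe = ((a + b) * (a + c) + (c + d) * (b + d)) / n ** 2
    kappa = (po - pe) / (1 - pe)
    se = math.sqrt(po * (1 - po) / (n * (1 - pe) ** 2))
    return kappa, kappa - z * se, kappa + z * se

# Hypothetical example: two radiologists rating 100 films for
# hyperinflation; both say "present" for 40, "absent" for 40,
# and disagree on 20. po = 0.80, pe = 0.50, so kappa = 0.60.
k, lo, hi = kappa_with_ci([[40, 10], [10, 40]])
```

Complete agreement in the toy table (b = c = 0) would give kappa = 1, and agreement no better than the marginals predict gives kappa = 0, matching the definition in the Methods.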

Journal: BMC Medical Imaging (Springer Journals)
Published: Nov 12, 2001