Endoscopic Evaluation of Adenoids: Reproducibility Analysis of Current Methods
Abstract
Objectives
To investigate the intra- and interexaminers reproducibility of commonly used methods for assessing adenoid hypertrophy by nasofiberendoscopic examination.
Methods
Forty children of both sexes, aged 4 to 14 years, presenting with nasal obstruction and oral breathing suspected to be caused by adenoid hypertrophy, were enrolled in this study. Patients were evaluated by nasofiberendoscopy, and the recordings were referred to and evaluated by two experienced otolaryngologists. Examiners analysed the recordings according to different evaluation methods, i.e., estimated and measured percentage of choanal occlusion, as well as subjective and objective classificatory systems of adenoid hypertrophy.
Results
Data disclosed excellent intraexaminer reproducibility for both estimated and measured choanal occlusion. Interexaminers analysis revealed lower reproducibility rates for estimated than for measured choanal occlusion. However, measured choanal occlusion demonstrated less agreement between evaluations made through the right and left sides of the nasal cavity. Alternatively, intra- and interexaminers reliability analysis revealed higher agreement for the subjective than for the objective classificatory system. In addition, the subjective method demonstrated higher agreement than the objective classificatory system when opposite sides were compared.
Conclusion
Our results suggest that the measured percentage of choanal occlusion is superior to the estimated percentage, particularly if employed bilaterally, which diminishes the lack of agreement between sides. When adenoid categorization is used instead, the authors recommend the subjective rather than the objective classificatory system of adenoid hypertrophy.
INTRODUCTION
Adenoid hypertrophy is known to be associated with several harmful clinical conditions [1-3]. Due to the relevance of this issue, a great deal of interest has been given to diverse methods of examinations and parameters for identification and evaluation of adenoid hypertrophy [2,4-6].
Among the various examination methods, nasofiberendoscopy (NFE) is currently considered the "gold standard" exam for adenoid evaluation [7]. Moreover, NFE is more effective in identifying adenoid hypertrophy [8], and has been indicated as the main diagnostic tool when adenoidectomy is considered [5].
Therefore, several methods of adenoid size assessment by means of NFE have been introduced [9-16], and largely disseminated [5-8,17-21]. However, several of these diagnostic methods [5-9,12-14,18,19] are subjective, or occasionally, poorly described. Even among researchers that employ objective evaluation methods of the adenoid size [10,11,15-17,20,21], several have failed to perform intra- or interexaminers reproducibility tests and reliability analysis [10,15,17,21].
In view of the relevance of the reliability of measurement tools designed for adenoid hypertrophy evaluation [22], the main objective of this study was to test the intra- and interexaminers reproducibility of four of the most commonly used NFE evaluation methods [9,11,13,16]. Secondarily, this study also intended to verify the relationship between readings recorded from the NFE view of the left and right sides of the nasal cavity, according to the same assessment methods [9,11,13,16].
MATERIALS AND METHODS
This research was approved by the Ethics Review Board of the institution where it was conducted (protocol 0181/08).
Forty children of both sexes, aged 4 to 14 years, were selected from the Institutional Paediatric Otolaryngology Referral Centre. To meet the inclusion criteria, patients had to present complaints of nasal obstruction or oral breathing, with a suspected diagnosis of adenoid hypertrophy. Children with syndromes or head and neck malformations were excluded. Subjects with acute infection of the respiratory tract, or with a history of previous adenoidectomy, were also excluded. Informed consent was obtained from all the participants.
The selected sample then underwent flexible NFE examination. All exams were performed after application of topical anesthesia (lidocaine 2%) to both nostrils. All exams were recorded, and the digital video files were edited so that patient identity remained concealed. The edited clips were then handed to two independent, "blind" examiners, both experienced otolaryngologists, distinct from those involved in the NFE recording. Both examiners were consultant physicians who had been practicing otolaryngology for at least 5 years.
In order to evaluate the clips, both examiners employed four assessment methods [9,11,13,16]. Two of them [13,16] are designed to categorize adenoid hypertrophy on four levels according to objective [16] (objective adenoid classification [Ob-C]), or subjective criteria [13] (subjective adenoid classification [Sub-C]). The other two assessment methods [9,11] refer to quantitative measurements of nasopharyngeal obstruction, which could be subjectively estimated [9] (estimated choanal occlusion [ECO]), or objectively measured [11] (measured choanal occlusion [MCO]).
Examiners were instructed to choose the frame sequence that would provide the best view of the adenoid in relation to the choana, obtained from the most distal portion of the inferior turbinate. In these frames, the patient should be inspiring exclusively through the nose, with no evidence of soft palate elevation. The assessment methods (Ob-C, Sub-C, ECO, and MCO) were applied at different times, which permitted truly independent evaluations.
MCO (%)
In order to employ this method [11], the examiner selected a single clip frame. The selected frame was then converted into a digital file (JPEG format), and MCO was finally calculated by ImageJ [23], an image processing software, as the percentage of the choanal area occupied by the adenoid tissue (Fig. 1).
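As an illustration of the calculation only, the MCO percentage can be sketched as a ratio of two traced areas. The function name and the pixel areas below are hypothetical; the source does not describe how the areas were outlined in ImageJ, so this is a minimal sketch assuming that both areas are measured on the same frame and in the same units.

```python
def measured_choanal_occlusion(adenoid_area_px: float, choanal_area_px: float) -> float:
    """Return the percentage of the choanal area occupied by adenoid tissue.

    Both areas are assumed to be traced on the same frame, in the same units
    (for example, pixels reported by an image-analysis program).
    """
    if choanal_area_px <= 0:
        raise ValueError("choanal area must be positive")
    if not 0 <= adenoid_area_px <= choanal_area_px:
        raise ValueError("adenoid area must lie between 0 and the choanal area")
    return 100.0 * adenoid_area_px / choanal_area_px


# Hypothetical traced areas (not data from the study).
mco = measured_choanal_occlusion(adenoid_area_px=8400, choanal_area_px=12000)
print(f"MCO = {mco:.1f}%")  # MCO = 70.0%
```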
ECO (%)
According to this method [9], examiners estimated the degree of nasopharyngeal obstruction relying exclusively upon subjective perception.
Ob-C
According to this method [16], adenoid hypertrophy is classified according to its anatomical relationship with adjacent structures such as vomer, soft palate and torus tubaris: 1) grade 1, the adenoid tissue contacts none of the above-cited structures; 2) grade 2, the adenoid tissue contacts the torus tubaris; 3) grade 3, the adenoid tissue contacts the torus tubaris and vomer; 4) grade 4, the adenoid tissue contacts the torus tubaris, vomer and soft palate in resting position.
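The four Ob-C grades above can be expressed as a lookup on the three contact findings. The sketch below is illustrative only: the function name is hypothetical, and returning `None` for contact patterns that the four grades do not describe (for example, vomer contact without torus tubaris contact) is an assumption, since the text does not say how such cases are handled.

```python
from typing import Optional


def ob_c_grade(torus_tubaris: bool, vomer: bool, soft_palate: bool) -> Optional[int]:
    """Map adenoid contact findings to an Ob-C grade (1 to 4).

    Returns None when the combination is not one of the four patterns
    listed in the classification (an assumption of this sketch).
    """
    grades = {
        (False, False, False): 1,  # no contact with any structure
        (True, False, False): 2,   # torus tubaris only
        (True, True, False): 3,    # torus tubaris and vomer
        (True, True, True): 4,     # torus tubaris, vomer and soft palate at rest
    }
    return grades.get((torus_tubaris, vomer, soft_palate))


print(ob_c_grade(True, True, False))   # 3
print(ob_c_grade(False, True, False))  # None (pattern not covered by the scale)
```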
Sub-C
It relies on the examiners' subjective perception, employing the following system of adenoid hypertrophy classification: 1) grade 1, adenoid occupying less than 25% of the choanal area; 2) grade 2, adenoid occupying 25-50% of the choanal area; 3) grade 3, adenoid occupying 50-75% of the choanal area; 4) grade 4, adenoid occupying 75-100% of the choanal area [13].
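In practice, Sub-C is a visual judgment rather than a computation, but the grade boundaries can be made explicit with a short sketch. The function name is hypothetical, and because the published ranges share their end points (50% and 75%), assigning a shared boundary to the higher grade is an assumption made here only for illustration.

```python
def sub_c_grade(occlusion_pct: float) -> int:
    """Map a perceived percentage of choanal occlusion to a Sub-C grade.

    Assumption: shared boundaries (25, 50, 75) fall into the higher grade.
    """
    if not 0 <= occlusion_pct <= 100:
        raise ValueError("occlusion must be between 0 and 100 percent")
    if occlusion_pct < 25:
        return 1
    if occlusion_pct < 50:
        return 2
    if occlusion_pct < 75:
        return 3
    return 4


print([sub_c_grade(p) for p in (10, 25, 60, 75, 100)])  # [1, 2, 3, 4, 4]
```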
Statistical analysis
Reliability of the NFE methods of evaluation was determined by intra- and interexaminers reproducibility analysis. Regarding quantitative variables (MCO and ECO), analysis was accomplished by calculating the intraclass correlation coefficient (ICC), as well as the mean differences between paired readings. The kappa (κ) coefficient, as well as the overall percentage of agreement, which includes agreement occurring by chance, were employed to analyze the reproducibility of the classificatory variables (Ob-C and Sub-C). The relationship between readings from the right and left sides of the nasal cavity was analyzed using the same statistical methods.
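For readers less familiar with these agreement statistics, the following sketch shows how the overall percentage of agreement and an unweighted Cohen's kappa can be obtained from two sets of paired grades. The text does not state whether a weighted or unweighted kappa was used, so the unweighted form is an assumption; the function names and the grades in the example are hypothetical and are not study data.

```python
from collections import Counter


def percent_agreement(ratings_a, ratings_b):
    """Overall percentage of identical paired ratings (chance agreement included)."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(ratings_a, ratings_b))
    return 100.0 * matches / len(ratings_a)


def cohen_kappa(ratings_a, ratings_b):
    """Unweighted Cohen's kappa: (p_o - p_e) / (1 - p_e)."""
    n = len(ratings_a)
    p_observed = percent_agreement(ratings_a, ratings_b) / 100.0
    counts_a, counts_b = Counter(ratings_a), Counter(ratings_b)
    categories = set(ratings_a) | set(ratings_b)
    # Agreement expected by chance from each rater's marginal frequencies.
    p_expected = sum(counts_a[c] * counts_b[c] for c in categories) / (n * n)
    if p_expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical grades from two readings of eight recordings.
first = [1, 2, 3, 4, 3, 4, 2, 3]
second = [1, 2, 3, 4, 4, 4, 2, 2]
print(percent_agreement(first, second))      # 75.0
print(round(cohen_kappa(first, second), 3))  # 0.667
```

In this example the observed agreement is 0.75 and the chance-expected agreement is 0.25, which is why kappa (0.667) is lower than the raw percentage of agreement.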
The ICC was interpreted according to Weir [24], which classifies reliability as "poor" (ICC≤0.20), "reasonable" (0.20<ICC≤0.40), "good" (0.40<ICC≤0.60), "very good" (0.60<ICC≤0.80) or "excellent" (0.80<ICC≤1.00). The kappa coefficient was interpreted according to the criteria described by Landis and Koch [25], whereby the reliability could be characterized as "slight" (κ≤0.20), "fair" (0.20<κ≤0.40), "moderate" (0.40<κ≤0.60), "substantial" (0.60<κ≤0.80) or "almost perfect" (0.80<κ≤1.00). The level of significance for the statistical tests was 5%.
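The interpretation bands above can be written as a simple lookup, which may help when checking a coefficient against its label. The function names are hypothetical; the cut-offs and labels are exactly those listed in the preceding paragraph, and the example coefficients are values reported in the Results.

```python
ICC_BANDS = [(0.20, "poor"), (0.40, "reasonable"), (0.60, "good"),
             (0.80, "very good"), (1.00, "excellent")]
KAPPA_BANDS = [(0.20, "slight"), (0.40, "fair"), (0.60, "moderate"),
               (0.80, "substantial"), (1.00, "almost perfect")]


def _label(value, bands):
    """Return the label of the first band whose upper bound is not exceeded."""
    for upper_bound, label in bands:
        if value <= upper_bound:
            return label
    raise ValueError("coefficient cannot exceed 1.00")


def interpret_icc(icc):
    return _label(icc, ICC_BANDS)


def interpret_kappa(kappa):
    return _label(kappa, KAPPA_BANDS)


print(interpret_icc(0.883))    # excellent
print(interpret_icc(0.728))    # very good
print(interpret_kappa(0.732))  # substantial
print(interpret_kappa(0.291))  # fair
```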
RESULTS
Our research included 20 (50.0%) females and 20 (50.0%) males. Mean age was 9.5 years (range, 4.1 to 14.3 years; standard deviation [SD], 2.4). Clinically, they were all suspected to have adenoid hypertrophy (40/40, 100.00%). Most of the patients complained of mixed (19/40, 47.5%), or exclusively oral breathing (17/40, 42.5%).
According to both evaluations of examiner 1, and examiner 2, MCO mean readings were nearly 70% (71.70% to 73.83%); while ECO mean readings varied from 61.20% to 67.89% (Table 1). Regarding the classificatory parameters (Ob-C and Sub-C), most of the patients demonstrated grade 3 or 4 adenoid hypertrophy (Table 1). The reproducibility tests were calculated over 71 evaluations (31/40 bilateral records; 9/40 unilateral records), which were randomly ordered before being evaluated by both examiners.
Quantitative diagnostic tools (MCO and ECO) were highly reproducible when employed by the same examiner (ICC=0.883, P<0.001; ICC=0.885, P<0.001, respectively). Interexaminers analysis also showed "excellent" reliability (ICC=0.854, P<0.001) for MCO. ECO presented "very good" reliability when performed by distinct examiners (ICC=0.728, P<0.001). On average, the same examiner demonstrated 8.09% of variation between paired readings (SD, 8.16%) for ECO; and 4.82% (SD, 5.95%) for MCO. Different examiners demonstrated variation of 10.14% (SD, 10.75%) for ECO, and 5.38% (SD, 6.27%) for MCO.
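To show how figures of this kind are obtained, the sketch below computes an ICC and the mean difference between paired readings. Two points are assumptions, because the article does not specify them: the ICC is computed here in its one-way random-effects, single-measure form, and the paired difference is taken as an absolute value. The function names and the readings are hypothetical and are not study data.

```python
from statistics import mean, stdev


def icc_oneway(pairs):
    """One-way random-effects, single-measure ICC for two readings per recording."""
    n, k = len(pairs), 2
    grand_mean = mean(value for pair in pairs for value in pair)
    subject_means = [mean(pair) for pair in pairs]
    # Between-recordings and within-recording mean squares from one-way ANOVA.
    ms_between = k * sum((m - grand_mean) ** 2 for m in subject_means) / (n - 1)
    ms_within = sum((value - m) ** 2
                    for pair, m in zip(pairs, subject_means)
                    for value in pair) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


def paired_difference_summary(pairs):
    """Mean and sample SD of the absolute difference between paired readings."""
    differences = [abs(first - second) for first, second in pairs]
    return mean(differences), stdev(differences)


# Hypothetical occlusion percentages: (first reading, second reading).
readings = [(70, 72), (55, 50), (90, 88), (40, 45), (65, 65)]
mean_diff, sd_diff = paired_difference_summary(readings)
print(f"ICC = {icc_oneway(readings):.3f}")
print(f"mean difference = {mean_diff:.2f}% (SD, {sd_diff:.2f}%)")
```

A statistics package would normally also report a confidence interval and P value for the ICC, which this sketch omits.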
Regarding the categorical parameters (Ob-C and Sub-C), intraexaminer analysis revealed "substantial" agreement (κ=0.732, P<0.001) for Sub-C, and "moderate" agreement (κ=0.457, P<0.001) for Ob-C. Overall percentage of agreement was 81.69% (58/71) for Sub-C and 60.56% (43/71) for Ob-C.
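The intraexaminer percentages above follow directly from the agreement counts and the 71 evaluations (31 bilateral and 9 unilateral records). A short check, using a hypothetical helper name, is:

```python
def overall_agreement_pct(agreements: int, total: int) -> float:
    """Percentage of evaluations with identical grades, rounded to two decimals."""
    return round(100.0 * agreements / total, 2)


total_evaluations = 2 * 31 + 9  # bilateral records count twice: 71 evaluations
print(total_evaluations)                             # 71
print(overall_agreement_pct(58, total_evaluations))  # 81.69 (Sub-C)
print(overall_agreement_pct(43, total_evaluations))  # 60.56 (Ob-C)
```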
In relation to interexaminers analysis, overall percentage of agreement was 56.34% (40/71) for Sub-C and 50.70% (36/71) for Ob-C. Kappa coefficient calculation showed "fair" agreement for Ob-C (κ=0.291, P<0.001). The interexaminers kappa coefficient for Sub-C could not be calculated, since one of the examiners did not classify any patient as grade 1 adenoid hypertrophy. Comparison tests between the nasal cavity sides of NFE examination were performed by analyzing exclusively the readings of the main examiner (examiner 1). Only patients who had bilateral inspections (31/40) were considered for this analysis.
An "excellent" agreement between sides (ICC=0.933, P<0.001) was observed for subjective evaluation ECO. Objective MCO showed "reasonable" agreement (ICC=0.404, P=0.010). Moreover, MCO demonstrated larger variation (mean, 10.4%; SD, 11.1%) than ECO (mean, 4.4%; SD, 6.0%) between right and left side readings.
Overall agreement percentage between right and left side evaluations was 90.32% (28/31) for Sub-C, and 67.74% (21/31) for Ob-C. In addition, kappa coefficient revealed "almost perfect" agreement between bilateral evaluations according to Sub-C (κ=0.842, P<0.001), and "moderate" agreement for Ob-C (κ=0.550, P<0.001).
DISCUSSION
The literature reveals large variability concerning NFE methods of adenoid evaluation [5-21]. Among all parameters, four representative diagnostic tools [9,11,13,16] were selected, so their reproducibility could be analyzed. Further methodological studies are therefore still warranted, so that additional assessment methods [10,12,14,15] can also be evaluated regarding their reproducibility.
The mean age of our study sample (9.5 years) is slightly higher than that of most studies addressing the reproducibility of adenoid diagnostic methods [5,6,9,11,16,19,20]. Their sample mean ages varied from 1.25 years [19] to 10.9 years [16]. Any comparison between our results and the literature should consider these differences in age groups.
ECO and MCO
Although both methods showed excellent intraexaminer reliability, interexaminers analysis revealed ECO to have lower rates of reproducibility. In addition, ECO also demonstrated larger intra- and interexaminers differences between paired readings when compared to MCO. This picture confirms the inherent reliability that is usually expected from objective methods of investigation, and also points to a preferential choice for MCO over ECO, particularly when it comes to the production of scientific evidence. Nevertheless, when MCO is preferred as the method of adenoid evaluation, the authors recommend NFE inspection through both nostrils, since this method revealed lower agreement and higher variation between readings from opposite sides.
In our study, ECO performance was poorer than previously demonstrated [9]. That study [9] reported a maximum (not average) variation of 10% among examiners. This difference may be related to sampling discrepancies.
Regarding MCO, Demain and Goetz [11] reported only 0.6% of variation between measurements of choanal and adenoid areas, whereas in our study the mean variations were 4.82% to 5.38%. The measurement instruments used by Demain and Goetz [11] (planimetry over projected transparencies) were distinct from those we employed (software), which may explain the discrepancies. Yet, both studies revealed acceptable levels of "error" involving this method of evaluation (MCO), reinforcing its recommendation over ECO.
Ob-C and Sub-C
Concerning the objective method (Ob-C), its authors [16] reported significant degrees of reliability (overall percentage of agreement, 70.48%, κ=0.71; κ=0.62 for medical residents, and κ=0.83 for experienced otolaryngologists). A subsequent study [20] confirmed this method to be dependent on the level of experience of the examiner (κ=0.574 for medical residents, κ=0.718 for experienced otolaryngologists). Overall, our results clearly showed poorer performance. Considering the differences associated with the level of experience of the examiner [16,20], and the low rates of reliability obtained in our study, the authors recommend specific training strategies whenever Ob-C is chosen.
Sub-C, although subjective, presented higher rates of agreement than Ob-C, which is based on objective criteria. Bravo et al. [19] and Ysunza et al. [6] reported even better interexaminers performance (95% of agreement) than the present study. The results provided by this research and the available literature [6,19] reinforce the recommendation of this method (Sub-C). Considering the simplicity and straightforward use of Sub-C, the authors endorse this method, principally in clinical settings, which demand ease of communication among professionals and prompt diagnosis. In addition, Sub-C showed excellent rates of agreement between sides when compared with Ob-C. In that case, a "one side only" evaluation can be recommended if assessing adenoid hypertrophy is the sole purpose of the NFE examination.
Although the MCO and Sub-C methods provided better reliability results, they cannot be accredited as definitive diagnostic methods of adenoid hypertrophy. Diagnostic methods must also meet other requirements, such as accuracy and feasibility and, above all, they must positively affect clinical decisions and patient outcome [22].
Future research should therefore relate reliable (MCO and Sub-C), accurate and practical methods to a set of obstructive respiratory symptoms, in an effort to systematize the diagnostic process for adenoid hypertrophy and support sound therapeutic management.
Our results suggest that the measured percentage of choanal occlusion is superior to the estimated percentage, particularly if employed bilaterally, which diminishes the lack of agreement between sides. When adenoid categorization is used instead, the authors recommend the subjective rather than the objective classificatory system of adenoid hypertrophy.
ACKNOWLEDGMENTS
This research was financially supported by the State of São Paulo Research Foundation (FAPESP), under the process number 08/53538-0.
Notes
No potential conflict of interests relevant to this article was reported.