General Approach to Spirometry Quality Reviews

The following summarizes the approach used to review spirometry test results in a research study, including the judgment the reviewer brings to the process.

The QC reviewer software automatically calculates an FVC and an FEV₁ quality grade but allows the reviewer to assign the FVC and FEV₁ grades from A through F. In general, tests with grades above a D are usable, and tests with grades of D and below are highly suspect and probably should not be used in an analysis. A study that wants to use only good tests should use tests with an A or B grade. Conversely, a study could include questionable tests by also including tests with a D grade. Because FVC and FEV₁ are graded separately, a study may select good FVC tests and good FEV₁ tests independently.
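The grade-based selection described above can be sketched in a few lines. This is only an illustration: the record layout, the field names `fvc_grade` and `fev1_grade`, and the function `select_tests` are assumptions, not part of the QC reviewer software.

```python
# Hypothetical grade sets based on the text: A or B is "good",
# anything above a D is "usable".
GOOD = {"A", "B"}
USABLE = {"A", "B", "C"}


def select_tests(tests, measure, allowed):
    """Return the tests whose grade for `measure` ('fvc' or 'fev1') is in `allowed`."""
    key = f"{measure}_grade"  # assumed field naming, e.g. 'fvc_grade'
    return [t for t in tests if t[key] in allowed]


# Made-up example records; FVC and FEV1 are graded separately.
tests = [
    {"id": 1, "fvc_grade": "A", "fev1_grade": "B"},
    {"id": 2, "fvc_grade": "D", "fev1_grade": "B"},
    {"id": 3, "fvc_grade": "C", "fev1_grade": "F"},
]

good_fvc = select_tests(tests, "fvc", GOOD)
usable_fev1 = select_tests(tests, "fev1", USABLE)
```

Because the two grades are filtered independently, test 2 drops out of an FVC analysis but remains available for an FEV₁ analysis.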

It is important to note that there are two separate definitions of an acceptable or valid test. The first is a goal used during test performance: three acceptable maneuvers and a repeatable FVC and FEV₁ are required. If these criteria are not met, another maneuver is performed, provided no more than 8 maneuvers have been performed. The second definition, used during interpretation, requires only two acceptable maneuvers regardless of the test repeatability. However, the number of acceptable maneuvers and the test repeatability may still be important considerations in deciding whether a test is acceptable or valid.
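The two definitions can be written as two separate checks. The sketch below is an illustration only; the function names are hypothetical, and it assumes the 8-maneuver limit means that testing stops once 8 maneuvers have been performed.

```python
MAX_MANEUVERS = 8  # limit stated in the text


def needs_another_maneuver(n_performed, n_acceptable, fvc_repeatable, fev1_repeatable):
    """Test-performance goal: three acceptable maneuvers plus repeatable FVC and FEV1.

    Returns True if the goal is not yet met and the maneuver limit
    has not been reached (assumed reading of the 8-maneuver rule).
    """
    goal_met = n_acceptable >= 3 and fvc_repeatable and fev1_repeatable
    return (not goal_met) and n_performed < MAX_MANEUVERS


def meets_interpretation_minimum(n_acceptable):
    """Interpretation definition: two acceptable maneuvers, regardless of repeatability."""
    return n_acceptable >= 2
```

A session with two acceptable maneuvers out of eight would stop (the limit is reached) and still satisfy the interpretation minimum, which is where the reviewer's judgment about repeatability comes in.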

With any study, care must be taken in assigning a D or F grade, as these subjects will likely be excluded from the study and their results not reported. Sometimes poor performance, particularly the lack of a repeatable test or of a good end of test (plateau), may be the result of obstructive lung disease. Eliminating these subjects could therefore eliminate the very subjects of most interest. In addition, statistical results could be biased towards higher values.

Perhaps the most relevant questions for a reviewer are: “Does this test represent the subject’s best effort to within the repeatability criterion of 150 mL?” and “Could the lack of a repeatable test be due to lung disease?” The reviewer should determine whether a test is valid based on all available information, including information from unacceptable maneuvers.

One example occurs when the subject has only 2 acceptable curves, which are not repeatable, and the other curves are unacceptable due to early termination and large extrapolated volumes. However, the curves with the large extrapolated volumes do confirm the FVC repeatability, so the test could be graded a C rather than a D or F. Strict technical application of the acceptability and valid-test criteria can always be done by a computer algorithm; the reviewer’s role is to apply judgment to the process.
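A rough sketch of this first example follows. It assumes that repeatability means the two largest values differ by no more than 150 mL, and that an unacceptable curve "confirms" the FVC when its value lies within 150 mL of the largest acceptable FVC. Both readings, the function names, and the numbers are illustrative assumptions rather than a documented rule. Volumes are kept in whole millilitres to avoid floating-point rounding at the threshold.

```python
REPEATABILITY_ML = 150  # repeatability criterion from the text


def is_repeatable(values_ml, tolerance_ml=REPEATABILITY_ML):
    """True if the two largest values differ by no more than the tolerance (assumed rule)."""
    if len(values_ml) < 2:
        return False
    largest, second = sorted(values_ml, reverse=True)[:2]
    return largest - second <= tolerance_ml


def confirmed_by_unacceptable(acceptable_ml, unacceptable_ml, tolerance_ml=REPEATABILITY_ML):
    """True if any unacceptable curve lies within the tolerance of the best acceptable value."""
    if not acceptable_ml:
        return False
    best = max(acceptable_ml)
    return any(abs(best - v) <= tolerance_ml for v in unacceptable_ml)


# Made-up FVC values in mL: two acceptable curves 200 mL apart,
# plus one unacceptable curve close to the larger acceptable value.
acceptable_fvc = [4100, 3900]
unacceptable_fvc = [4050]

strictly_repeatable = is_repeatable(acceptable_fvc)
supported = confirmed_by_unacceptable(acceptable_fvc, unacceptable_fvc)
```

Here the strict check fails while the supporting evidence is present, which is the situation in which a reviewer might raise the grade to a C. The code only surfaces the evidence; the grading decision remains with the reviewer.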

A second common example involves children and adolescents who do not exhale for the required length of time (3 or 6 seconds). If these subjects have a 1-second plateau and the reviewer judges that the curves represent a maximum volume (FVC), the grade should be adjusted upwards.

A third example is a subject with COPD and a low FEV₁/FVC, say less than 0.45. Because of the COPD, the subject cannot provide a repeatable FVC even with 8 maneuvers. The grade should be a C or better, as the lack of an acceptable curve due to end-of-test failures is not a sufficient reason to exclude the subject’s results or to grade them with a D or F. This requires the judgment of the reviewer.
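One way a study might surface such cases is to flag low-ratio tests with a D or F grade for manual review instead of excluding them automatically. The sketch below assumes the illustrative 0.45 cut-off mentioned above; the function name and arguments are hypothetical.

```python
RATIO_THRESHOLD = 0.45  # illustrative cut-off from the text, not a fixed rule


def needs_reviewer_judgment(fev1_l, fvc_l, fvc_grade):
    """Flag a test whose low FEV1/FVC suggests obstruction and whose FVC grade is D or F."""
    if fvc_l <= 0:
        raise ValueError("FVC must be positive")
    ratio = fev1_l / fvc_l
    return ratio < RATIO_THRESHOLD and fvc_grade in {"D", "F"}
```

A flagged test is a prompt for the reviewer to look at the curves, not an automatic upgrade.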

One note: values from all maneuvers, even unacceptable ones, are used to compute the best values, except for maneuvers with large extrapolated volumes or coughs. This approach is allowed by the ATS/ERS-2005 statement, which recognizes that events after one second (end of test) may not affect the FEV₁ results.
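The best-value rule can be sketched as follows. The record fields (`fvc`, `fev1`, `large_extrapolated_volume`, `cough`) and the function name are assumptions made for illustration, as is taking the best FVC and best FEV₁ independently as the largest eligible values.

```python
def best_values(maneuvers):
    """Return the largest FVC and FEV1 over all maneuvers that are not excluded.

    Maneuvers flagged with a large extrapolated volume or a cough are skipped;
    maneuvers that are unacceptable for other reasons (for example, an
    end-of-test failure) still contribute.
    """
    eligible = [
        m for m in maneuvers
        if not m["large_extrapolated_volume"] and not m["cough"]
    ]
    if not eligible:
        return None
    return {
        "fvc": max(m["fvc"] for m in eligible),
        "fev1": max(m["fev1"] for m in eligible),
    }


# Made-up maneuvers, volumes in litres.
maneuvers = [
    {"fvc": 4.00, "fev1": 3.10, "large_extrapolated_volume": False, "cough": False},
    {"fvc": 4.20, "fev1": 3.00, "large_extrapolated_volume": False, "cough": False},
    {"fvc": 4.50, "fev1": 3.40, "large_extrapolated_volume": True, "cough": False},
]

best = best_values(maneuvers)
```

In this example the third maneuver is ignored because of its extrapolated volume, and the best FVC and best FEV₁ come from different maneuvers.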

Please send comments to: comments@occspiro.com
