Some issues require discussion and these are marked with an asterisk.
The reviewers will be listed, but their names will not be linked to any particular grade set. This will allow a more scientific discussion on grading.
1. Some scales are small and hard to read making it difficult to track individual curves. Can the QC reviewers get options to scale graphics?
This is a practical issue. The curve displays are images, and it would take some work to add more images for scaling purposes. I attempted to scale the graphs when I created the images, and the box at the bottom of the screen changes color and wording to indicate a re-scaled image. For example, images of tests with smaller volumes have been enlarged using a different scale. If you send me a list of the test numbers where one additional scaled image is needed, it might be practical to provide a few tests with one additional specific scale.
2. The biggest differences by far are in the grading of FVC as an F by the computer when it is apparent (to me) that EOT was reached.
I very much agree, and it is because the end of test is not always clean and perfect. A slight (> 30 ml) increase in volume will cancel the plateau achievement if the plateau has not been sustained for one second. This is why I created the EOT definitions link on the site, which I hope you have viewed. If you have any comments on the EOT definition link, let me know.
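To illustrate why a small late rise in volume can turn an apparently finished exhalation into a computer grade of F, here is a minimal sketch. It assumes volume samples at a fixed rate and treats a plateau as a final one-second window in which the volume changes by no more than 30 ml. The function name, the sampling rate, and the exact rule are illustrative assumptions and not the algorithm used on the site.

```python
def has_plateau(volumes_ml, sample_rate_hz, max_rise_ml=30.0, window_s=1.0):
    """Return True if the last `window_s` seconds change by at most `max_rise_ml`.

    Assumed rule for illustration only; the site's EOT definition may differ.
    """
    n = int(round(window_s * sample_rate_hz))  # samples spanning the window
    if n < 1 or len(volumes_ml) < n + 1:
        return False  # recording too short to cover a full window
    tail = volumes_ml[-(n + 1):]
    return (max(tail) - min(tail)) <= max_rise_ml


# Hypothetical 10 Hz recording that levels off at 4000 ml for the last second.
clean = [0, 1000, 2000, 3000, 3500, 3800, 3950, 4000] + [4000] * 11
# Same recording with a 40 ml rise in the last three samples.
blip = clean[:-3] + [4040, 4040, 4040]

print(has_plateau(clean, 10))  # plateau held for the full second
print(has_plateau(blip, 10))   # late rise exceeds 30 ml, so no plateau
```

Under this assumed rule the second recording fails even though a reviewer looking at the curve would likely judge the exhalation complete.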
3. There are some curves with accepted EOT and hence FVC, but with slowish starts and some evidence of sub-maximal effort through the test, leading to a higher FEV1 than warranted. In my practice I would exclude these efforts so as not to overestimate FEV1. Exclusion is tricky, so we need to consider how to handle these efforts.
Exclusion is very tricky, which is why the ATS and later, I think, the ATS/ERS-2005 never put that requirement in their recommendations. It has generally been felt that the problem of slightly elevated FEV1 (negative effort dependency) with sub-maximal efforts is smaller than the potential problems of someone arbitrarily deleting maneuvers. Usually the increase in FEV1 is not much more than the repeatability criterion. The extrapolated volume should detect really slow starts, and the time to peak flow is available. The problem with time to peak flow is that it is very instrument dependent, it sometimes does not work, and no data are available for setting a limit. In my experience in teaching and reviewing tests, we have significant problems getting good tests, and I think having folks judging and rejecting curves based on their determination of a sub-maximal effort would be problematic.
4. There are some tests whose FVC I have graded A or B that could be considered the other. That is, I have graded it B as there are 2 good tests (good start, EOT, etc.) leading to repeatable FVC. But equally it could be A, as the other test was repeatable for FVC but the start of test was less "good".
There are separate grades for FVC and FEV1 as they are really looking at different quality issues. The important grades are those below C, which will result in the exclusion of a subject.
5. Sometimes when you go back to a test you have graded, the FVC grade is blank. It is saved (when you check in the navigate section) but disconcerting!
I think I fixed the problem and you may have noticed this early in your reviews but it should not have happened later – I discovered the problem after you started your reviews. I also added the viewing of your comments when going back to a test. I did a lot of programming or web-site typing in a short period of time, so there will be some bugs.
6. In some tests there are a lot of efforts, leading to the last effort being obscured within the graphics and unreadable.
If you send me the test numbers (or ID), I can try to provide a better image of the tests, perhaps highlighting a particular curve. There is a trade-off between the complexity of the reviewing modality and the convenience of a web-site. I have reviewer software that allows much more scaling, highlighting, etc. that cannot be easily done on a web-site. I could add more images for scaling or other display purposes, but these are images that must be created individually. So, I need specific IDs and specific imaging recommendations. I could add a link for the reviewer to click, with the link appearing when an additional image is available. It would be difficult, and probably not necessary, to add these additional images for all tests.
7. What are the definitions of PEF time and Low PEF?
PEF time calculations are very dependent on the spirometer. The ATS/ERS-2005 has not established any particular cutoff value for an acceptable curve and I provide it for information purposes. PEF Time is the time it takes to reach peak flow. The cutoff used for the analysis in this effort is 120 milliseconds.
A low peak flow is indicated for any curve that has a peak flow less than 20% of the largest observed peak flow.
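The two flags described above can be sketched as follows. This is a minimal illustration, assuming each curve is a list of flow samples at a fixed rate, that time is measured from the first sample (real software would normally measure from the back-extrapolated start of test), and that a PEF time above the 120 ms cutoff is what gets flagged. All names here are hypothetical.

```python
PEF_TIME_LIMIT_MS = 120.0  # cutoff quoted in the text
LOW_PEF_FRACTION = 0.20    # fraction of the largest observed peak flow quoted in the text


def pef_and_time(flows_l_per_s, sample_rate_hz):
    """Return (peak flow, time to peak flow in ms), timing from the first sample."""
    peak = max(flows_l_per_s)
    idx = flows_l_per_s.index(peak)  # first sample at which the peak occurs
    return peak, 1000.0 * idx / sample_rate_hz


def flag_curves(curves, sample_rate_hz):
    """Flag each curve for a slow PEF time and for a low peak flow."""
    results = [pef_and_time(c, sample_rate_hz) for c in curves]
    largest = max(p for p, _ in results)
    return [
        {
            "pef": p,
            "pef_time_ms": t,
            "slow_pef": t > PEF_TIME_LIMIT_MS,
            "low_pef": p < LOW_PEF_FRACTION * largest,
        }
        for p, t in results
    ]


# Two hypothetical curves sampled at 100 Hz.
brisk = [0, 4, 8, 6, 3]                            # peak of 8 L/s at 20 ms
weak = [0.1 * i for i in range(16)] + [1.0]        # peak of about 1.5 L/s at 150 ms
for row in flag_curves([brisk, weak], 100):
    print(row)
```

Because the low-PEF flag is relative to the largest peak flow in the test, a single forceful blow can cause weaker efforts in the same test to be flagged.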
8. Re. graphs: Is the term "(Non) Reproducible Test NA:x" given in the graphs based on your optical impression (or on repeatability of the values), and does it indicate the number of re curves?
The non-reproducibility message is independent of the number of acceptable curves as long as there are two curves without coughs or large extrapolated volumes.
The columns headed by # or A with their different colors (e.g. blue or black) are also not clear to me.
The # is the curve number; it is red if the curve is rejected and lime (green) if it is the best curve. The colors under the A are lime if there is a good end of test, yellow if there is either < 6 sec or no plateau, and red if there is both no plateau and < 6 sec.
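The colour rules just described can be written out as a small lookup. This is only a sketch: it assumes that a "good end of test" means both a plateau and at least 6 seconds of exhalation, and it returns a placeholder for curve numbers that are neither rejected nor best, since the colour used in that case is not stated here. Function names are hypothetical.

```python
def number_color(rejected, is_best):
    """Colour of the curve number in the # column."""
    if rejected:
        return "red"
    if is_best:
        return "lime"
    return "other"  # colour for remaining curves is not specified in the text


def eot_color(has_plateau, exhalation_time_s, min_time_s=6.0):
    """Colour of the end-of-test indicator in the A column."""
    too_short = exhalation_time_s < min_time_s
    if has_plateau and not too_short:
        return "lime"    # assumed meaning of a good end of test
    if not has_plateau and too_short:
        return "red"     # both criteria missed
    return "yellow"      # exactly one criterion missed


print(eot_color(True, 7.2))
print(eot_color(True, 4.0))
print(eot_color(False, 3.0))
```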
9. In some figures there is a curve given by a thick blue line whereas in most this is not the case (even if there is a blue color box given in column #).
The thick blue line is used to indicate that the curve is highlighted when you click on it using the software and is not relevant for the web-site reviews. I just forgot to remove the highlight function when I copied the figures.
10. When defining a curve as acceptable or unacceptable we have to take into consideration our decision that 6 s exhalation time is not a prerequisite for acceptability (as opposed to No. 3 in the "Curve quality bars" and in the "End of test quality bar"). The latter is still not quite clear to me; e.g. why is it red in curve 4 of test No. 3 where PEF Time was not met?
Several factors are presented but not required for an acceptable curve: 6 seconds, time to peak flow, low peak flow, etc. They are shown for the reviewer to consider.
11. Did I understand you correctly that, as an additional criterion, a curve is not acceptable for FVC and FEV1 grading if its FVC or FEV1 differs by more than 300 mL from the highest FVC or FEV1, respectively? If so, a corresponding bar in the column "Curve quality bars" would be helpful.
You are correct, but I have not had time to add this feature. One problem is the database structure is such that it cannot be easily added.
12. Some tests contain several nearly identical curves, e.g. test no. 17, which however do not meet the plateau criterion. We should discuss whether we should really discard them.
I am not sure what you mean by discard them. The ATS/ERS-2005 clearly states that the FEV1 can be used from these curves. One of the curves has a 0.9 sec plateau and is so close that you could argue it is acceptable. The repeatability is very impressive for this subject, to the point that you could argue that the test is valid or represents to within 150 ml the subject's best effort values. Indeed, these curves are the ones that will help delineate good and bad tests. So, at least for the purposes of this comparison, I would like to keep them.
13. There are frequently curves without a maximal effort throughout the manoeuvre; you mostly defined them as acceptable.
The shape of a curve, in terms of whether it is acceptable, is more of a judgment than a technical measurement that can be made. So, judgment of good/bad curves should be at the discretion of the reviewer. Sometimes sub-maximal efforts will have the largest FEV1, and whether those curves should be excluded is controversial. Also, including curves with poor effort should have no impact on the FVC and FEV1 values, since their values would be lower and therefore not reported (only the largest values are reported). I have chosen to provide these curves to the reviewer for consideration, in the interest of not withholding information based on my own arbitrary choice.
14. My grades frequently differ from yours, see my comments (unfortunately the program does not allow me to review comments I made). I would greatly appreciate hearing how you see them.
The grades you see are the computer determined grades, and they will probably be lower, but sometimes higher, than your grades. These grades are for information purposes. My grades should not be available to reviewers yet, to avoid bias, but can be if you wish. I will see if I can add the capability of seeing your comments after they are submitted.
Curves are either acceptable or unacceptable, and only the test overall is given a grade. Individual curves are not assigned a specific grade.
This is further complicated by the fact that some curves are unacceptable for consideration in determining the best values (those with coughs during the first second and those with large extrapolated volumes). In contrast, some curves are unacceptable in terms of counting towards the number of acceptable curves, but CAN be used to derive the best values (curves with early termination, etc.). There are a few curves that I as a reviewer have marked as unacceptable, but not many.
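The distinction between the two kinds of "unacceptable" can be made concrete with a short sketch. It assumes each curve carries three yes/no flags (cough in the first second, large extrapolated volume, early termination); the class and function names are hypothetical, and the real software may apply further criteria.

```python
from dataclasses import dataclass


@dataclass
class Curve:
    fvc_ml: float
    fev1_ml: float
    cough_first_second: bool = False
    large_extrapolated_volume: bool = False
    early_termination: bool = False


def usable_for_best_values(c):
    """Curves with an early cough or a large extrapolated volume are excluded entirely."""
    return not (c.cough_first_second or c.large_extrapolated_volume)


def counts_as_acceptable(c):
    """Early termination keeps a curve usable but stops it counting as acceptable."""
    return usable_for_best_values(c) and not c.early_termination


def summarize(curves):
    usable = [c for c in curves if usable_for_best_values(c)]
    return {
        "best_fvc_ml": max((c.fvc_ml for c in usable), default=None),
        "best_fev1_ml": max((c.fev1_ml for c in usable), default=None),
        "n_acceptable": sum(counts_as_acceptable(c) for c in curves),
    }


test = [
    Curve(4000, 3200),
    Curve(4100, 3300, early_termination=True),   # usable, but not counted
    Curve(4500, 3600, cough_first_second=True),  # excluded from best values
]
print(summarize(test))
```

In this hypothetical test the best values come from the early-terminated curve, while only one curve counts towards the number of acceptable curves.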
The grades are for the test and there is a separate grade for FVC and FEV1. The computer determined values for FVC and FEV1 test grades are based on the number of acceptable curves and the repeatability of the FVC and FEV1 as described in our definitions.
Perhaps some confusion arose from the definition of an incomplete inhalation: "For FVC and FEV1 grades calculations, an acceptable curve's FVC or FEV1 should not differ by more than 300 mL (250 mL) from the largest observed FVC or FEV1, otherwise it is considered unacceptable due to an incomplete inhalation." This sentence would probably be clearer if the phrase "For FVC and FEV1 grades calculations" were omitted. It was intended as a definition for the vague unacceptable-curve term "incomplete inhalation." So, there is a separate comparison of each curve's values with the largest FVC and FEV1 values to determine if the curve has a "complete inhalation" and is therefore an acceptable curve.
In my opinion, the individual reviewers can use whatever information is available to determine the FVC and FEV1 grades. Information from unacceptable and acceptable curves can be used. For example, if there are several curves with large extrapolated volumes, it may be that the FVCs from these curves can be used to help validate the one or two acceptable curve FVCs.
Similarly, if there is technically no plateau but the curve has an "obvious" plateau, the reviewer could judge the curve acceptable for determining the number of acceptable curves. There were many tests where I felt there was an adequate plateau where the computer did not technically find a good end of test. Therefore some test grades were elevated to "C" that the computer labeled as an "F".
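As a rough illustration of that per-curve comparison, the sketch below marks a curve as having a complete inhalation only when both its FVC and its FEV1 are within the limit of the largest observed values. Checking both values together, the function name, and the default of 300 mL are assumptions; the alternative 250 mL limit can be passed in, since the conditions under which it applies are not spelled out here.

```python
def complete_inhalation_flags(curves, limit_ml=300.0):
    """For each (fvc_ml, fev1_ml) pair, return True if both values are within
    `limit_ml` of the largest observed FVC and FEV1 in the test."""
    largest_fvc = max(fvc for fvc, _ in curves)
    largest_fev1 = max(fev1 for _, fev1 in curves)
    return [
        (largest_fvc - fvc) <= limit_ml and (largest_fev1 - fev1) <= limit_ml
        for fvc, fev1 in curves
    ]


# Hypothetical test with three curves, values in mL.
curves = [(4000, 3200), (3650, 3100), (3900, 2850)]
print(complete_inhalation_flags(curves))
print(complete_inhalation_flags(curves, limit_ml=400.0))
```

Here the second curve falls short on FVC and the third on FEV1, so with the 300 mL limit only the first would be treated as having a complete inhalation.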
Concerning curves without a computer determined plateau, the values from these curves can still be used. The reviewer must judge whether the computer determined plateau was accurate or if the subject obviously had completed his/her exhalation. This is probably one area where there will be the most disagreement. You might want to look at the EOT definition page I created for a better explanation of the issues: End of Test Definition