Resolution modeling enhances PET imaging
Journal article
Abstract: One of the methods frequently employed to enhance PET images is resolution modeling (RM). Resolution modeling is known to visually enhance images. Some argue, however, that such improvements are deceptive and that RM leads to degradations elsewhere, whereas others claim that the enhancements are real and overall beneficial. This is the premise debated in this month's Point/Counterpoint.
Image generation in a wide variety of fields, from microscopy1 to astronomy,2 employs resolution modeling techniques to reduce the degradations inherent in their imperfect imaging systems. In all of these applications, resolution degradation is a consequence of many factors including the physics of the signal, the sensors, and the electronics. In PET, the physical degradations directly related to spatial resolution loss, in order of origination from signal to final measurement, include random positron range, photon pair noncollinearity, attenuation, intercrystal penetration, intercrystal scatter, detector inefficiencies, and electronics mispositioning. Taken together, these effects give a PET system a spatially variant resolution loss. Resolution modeling and compensation techniques have been proposed for over two decades to account for the limitations in localizing radiotracer distributions.3 To support the claim that "resolution modeling enhances PET imaging," it is important to clarify the term "enhances." In general, image restoration attempts to improve the image signal and/or reduce noise. In practice, restoration (or enhancement) needs to improve the task-based image quality (not simply signal or noise). Two common tasks in clinical PET imaging are detection and quantification.
In specific circumstances, resolution modeling in PET can genuinely improve hot-feature detection4 and quantification.5 Methods that enhance an image rarely provide improvement in all metrics and tasks; for example, detection may improve at the expense of quantitative accuracy. At the risk of oversimplifying the issue in the interest of a terse argument, my view is that most resolution modeling techniques provide some contrast enhancement with some apparent noise reduction (although true noise is minimally changed).6 Both trends lead to demonstrably better detection performance,4 leaving little room to question the assertion that resolution modeling enhances detection. A more interesting debate is whether resolution modeling enhances quantification. In PET quantification, we need both accurate and reproducible estimates of activity concentrations. The NEMA IEC phantom is commonly used to measure contrast recovery curves (CRC) for a system, which show increasing partial volume errors for features below 2-3 cm. One holy grail in PET imaging is to develop a system and image generation method with a flat CRC (no partial volume errors and no size-dependent bias) with small error bars (reproducible). It has been shown that resolution modeling can cause unpredictable edge artifacts and that these artifacts are generally exaggerated when the resolution model overestimates degradations.7 Modest resolution models, which do not try to recover more frequency content than the sampling will support, have manageable edge artifacts and lead to genuine, albeit modest, contrast-to-noise improvements.8 Current resolution modeling methodology is far from achieving a flat CRC. Future methods with better system modeling and convergent algorithms hold promise to improve the CRC further.
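The size dependence of contrast recovery described above can be illustrated with a toy calculation. The sketch below is not a PET reconstruction: it assumes a one-dimensional hot feature on a uniform background, a single Gaussian blur (6 mm FWHM, a hypothetical value) standing in for the entire system response, and the six NEMA IEC sphere diameters used only as feature widths. The function names, the 4:1 activity ratio, and all other parameters are illustrative assumptions, not taken from the article.

```python
import numpy as np

def gaussian_kernel(fwhm_mm, pixel_mm):
    """Normalized 1D Gaussian kernel sampled on the pixel grid."""
    sigma_px = fwhm_mm / (2.0 * np.sqrt(2.0 * np.log(2.0))) / pixel_mm
    half = int(np.ceil(4 * sigma_px))
    x = np.arange(-half, half + 1)
    kernel = np.exp(-0.5 * (x / sigma_px) ** 2)
    return kernel / kernel.sum()

def contrast_recovery(diameter_mm, fwhm_mm=6.0, pixel_mm=0.5,
                      true_ratio=4.0, fov_mm=200.0):
    """Contrast recovery for a 1D hot feature blurred by a Gaussian.

    Assumption: the background far from the feature stays at 1, so the
    measured hot-to-background ratio is simply the mean inside the ROI.
    """
    n = int(round(fov_mm / pixel_mm))
    x = (np.arange(n) - n // 2) * pixel_mm
    inside = np.abs(x) <= diameter_mm / 2.0
    profile = np.where(inside, true_ratio, 1.0)
    blurred = np.convolve(profile, gaussian_kernel(fwhm_mm, pixel_mm), mode="same")
    measured_ratio = blurred[inside].mean()
    # Recovery = measured contrast above background / true contrast above background
    return (measured_ratio - 1.0) / (true_ratio - 1.0)

# NEMA IEC sphere inner diameters, used here only as 1D feature widths
diameters_mm = [10, 13, 17, 22, 28, 37]
crc = [contrast_recovery(d) for d in diameters_mm]
for d, c in zip(diameters_mm, crc):
    print(f"{d:2d} mm feature: contrast recovery = {c:.2f}")
```

Under these assumptions the recovered contrast falls as the feature shrinks, which is the partial volume trend that a flat CRC would remove. A one-dimensional slab understates the loss expected for a real three-dimensional sphere, so the numbers are indicative of the trend only.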
These comments lead to a tempered statement that appropriate application of current resolution modeling techniques enhances, but does not solve all the challenges with, detection and quantification in PET imaging.
Resolution modeling in PET has attracted considerable interest, especially in the past decade.3 Unlike post-reconstruction partial volume correction (PVC) methods, RM models resolution-degrading phenomena within the reconstruction. It is a natural approach, as one aims to design the system matrix to faithfully reproduce the true probabilities of detection, and it is an attractive alternative to a range of PVC methods that make simplifying assumptions. In fact, RM produces images that are clearly enhanced visually, but it is my contention that RM is remarkable in its ability to deceive! RM improves resolution (and contrast), and it is unfortunately not uncommon to see studies characterizing only this aspect. RM also reduces noise when noise is defined as intensity variations within a region-of-interest (ROI), i.e., image roughness (σspatial). An alternative noise metric that assesses reproducibility is the ensemble standard deviation of ROI mean uptake (σensemble). RM has been shown to reduce voxel variances but increase intervoxel correlations.6,9 The first effect decreases both σspatial and σensemble, while the second further decreases σspatial but shifts σensemble in the opposite direction.6 Consequently, σspatial is reduced in RM, but σensemble can increase, especially for small ROIs.10 This explains why RM can generate images assessed visually to be of higher quality, as it enhances contrast and reduces σspatial. However, it can degrade reproducibility and thus adversely impact quantitative imaging tasks such as pharmacokinetic imaging10 or treatment response monitoring. RM may actually improve reproducibility for the maximum standardized uptake value (SUVmax) (Ref. 11), which we attribute to reduced voxel variances, though it increases the range of SUVmax values across the population (similar to PVC), and it can degrade reproducibility for SUVmean, especially for small volumes (similar to PVC).12 In detection tasks, another note of caution is in order. Dual-metric resolution (contrast) vs. noise trade-off analyses commonly depict improved curves for RM whether noise is defined as σspatial or σensemble (though to a lesser extent in the latter case, as explained above). Nonetheless, as demonstrated recently,9 such simplified analyses do not properly capture the impact of the modified noise texture in PET images. In fact, detection task performance can be expressed as a function of the noise power spectrum (NPS), which is amplified at midfrequencies with RM and competes against the RM-enhanced modulation transfer function (MTF). One must therefore not draw conclusions of RM superiority from dual-metric analysis alone; appropriate task-based performance assessment is required. A few detection studies have been performed for RM in PET,13,14 and the results indicated statistically significant improvements for the designed studies, especially in the presence of time-of-flight. Whether or not the improvements from RM will be clinically significant is another question. I believe that RM has definite potential to improve PET imaging in the context of diagnostic imaging, especially in oncology, but is likely to degrade performance in other contexts. By no means do I intend to discourage the application of RM; rather, I wish to draw attention to its strengths and pitfalls, and to encourage research into its usage in a balanced and thoughtful manner.
My colleague raises valid concerns that resolution modeling (RM) can be deceptive.
RM must be assessed with rigorous methods that analyze more than the simple metrics of resolution or quantification defined with hot features in a cold background; noise defined as voxel-to-voxel variance; and detection defined as contrast to noise. Some papers employ these simple metrics and report overly optimistic performance with RM. The concern about the clinical significance of RM improvements is valid. A detection performance evaluation by Kadrmas et al.4 demonstrated, through observer studies on over 400 measured phantom images, that the area under the ROC increases by ∼30% when using better RM. While this was a statistically significant improvement, the authors acknowledged that they could not draw conclusions about the clinical significance of such a gain. In medical imaging, we rarely perform studies that prove real clinical improvements because these often require numerous patient exams, application in multiple sites, and knowledge of patient outcomes (all challenging!). We usually test our methods with much more limited evaluations. I would advocate that many of these limited evaluations, while not necessitating full clinical trials, should be improved. In PET image generation, the customary approach for proving a method is to show improvement in a couple of reasonable metrics. This leads to the common expectation that performance on all other fronts stays consistent, i.e., that the method only helps. For clinical acceptance, we need to quantify both the good and the bad performance of our methods to ensure they are applied in the appropriate context. In the future, I hope that clinical PET practice will use images tailored for the task at hand, as opposed to trying to garner all the necessary information from a single image. In the clinic, RM methods incorporated into convergent algorithms could provide more consistent, accurate quantification, but may not be appropriate for detection due to noise correlations.
Conversely, RM methods designed to accentuate hot features of clinically relevant sizes could be used solely for tumor detection tasks. In this context, RM offers the potential to enhance clinical PET.
Flattening the CRC is valuable but may incur other costs. What my esteemed colleague refers to as true noise [i.e., σensemble, or the coefficient-of-variation (COV) when expressed as a percentage] may change little with RM,2 or instead be amplified twofold4 or even more6 in small ROIs for some RM implementations. As such, the issue of reproducibility in quantitative imaging tasks merits special attention. This is one reason why some sites with the HRRT scanner (including ours) that pursue quantitative pharmacokinetic imaging have discontinued usage of RM. RM also results in increased mean uptake variability across the population,15 which is attributed to true intersubject differences that are less suppressed in RM but may also be partly due to the degraded reproducibility. Another issue is the impact of RM on the predictive and prognostic value of PET: PVC had no significant effect on the prediction of response following treatment16 and in fact degraded performance in two studies.17,18 This was attributed to the fact that PVC (similar to what RM does) removes the volume information implicit in SUVmean values, increasing them by greater amounts for complete responders (which are associated with smaller tumors) than for partial responders or nonresponders, thus actually diminishing intergroup differences. One could easily imagine that RM produces a similar detrimental effect on discrimination power, though it would be worthwhile to investigate explicit usage of volume information in addition to corrected SUV values. Finally, I note that I would again have taken the counterpoint position had the proposition instead been that "RM does not enhance PET imaging" (!), for these two statements are not logical complements, and there is a third, realistic statement, namely that RM may enhance PET imaging in certain tasks and degrade it in others. The community needs to achieve careful and comprehensive assessment of these issues and propose solutions appropriately sensitive to the various imaging tasks.
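The distinction drawn in the debate between image roughness (σspatial) and ensemble reproducibility of the ROI mean (σensemble) can be illustrated with a toy simulation. The sketch below is not an RM reconstruction: it simply compares white Gaussian noise with noise that has a lower voxel variance but strong intervoxel correlation, produced by a Gaussian low-pass filter. The function names, the halved voxel variance, the 3-pixel correlation width, and the 3 x 3 ROI are all hypothetical choices made for illustration.

```python
import numpy as np

def make_noise(rng, n_real, size, corr_sigma_px, voxel_std):
    """Stack of 2D noise images; corr_sigma_px > 0 adds intervoxel correlation."""
    noise = rng.standard_normal((n_real, size, size))
    if corr_sigma_px > 0:
        # Gaussian low-pass applied in the frequency domain (periodic boundaries)
        f = np.fft.fftfreq(size)
        fx, fy = np.meshgrid(f, f, indexing="ij")
        lowpass = np.exp(-2.0 * (np.pi * corr_sigma_px) ** 2 * (fx ** 2 + fy ** 2))
        noise = np.fft.ifft2(np.fft.fft2(noise, axes=(1, 2)) * lowpass,
                             axes=(1, 2)).real
    # Rescale so the overall voxel standard deviation equals voxel_std
    return noise * (voxel_std / noise.std())

def noise_metrics(images, roi_px):
    """Return (sigma_spatial, sigma_ensemble) for a central square ROI."""
    size = images.shape[-1]
    lo = size // 2 - roi_px // 2
    roi = images[:, lo:lo + roi_px, lo:lo + roi_px].reshape(len(images), -1)
    sigma_spatial = roi.std(axis=1, ddof=1).mean()   # roughness within one image
    sigma_ensemble = roi.mean(axis=1).std(ddof=1)    # spread of ROI mean over images
    return sigma_spatial, sigma_ensemble

rng = np.random.default_rng(0)
# Assumption: the "correlated" case has half the voxel variance of the white case
white = make_noise(rng, 500, 32, corr_sigma_px=0.0, voxel_std=1.0)
correlated = make_noise(rng, 500, 32, corr_sigma_px=3.0, voxel_std=np.sqrt(0.5))

sp_white, en_white = noise_metrics(white, roi_px=3)
sp_corr, en_corr = noise_metrics(correlated, roi_px=3)
print(f"white:      sigma_spatial={sp_white:.2f}  sigma_ensemble={en_white:.2f}")
print(f"correlated: sigma_spatial={sp_corr:.2f}  sigma_ensemble={en_corr:.2f}")
```

In this toy setting the correlated images look smoother within the ROI, yet the ROI mean varies more from one realization to the next, because averaging over a small ROI no longer cancels noise that is shared between neighboring voxels. How closely this mirrors any particular RM implementation depends on its actual noise covariance, which the sketch does not model.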
Year of publication: 2013
Authors: Adam Alessio, Arman Rahmim, Colin G. Orton
Publisher: Wiley
Source: Medical Physics
Keywords: Medical Imaging Techniques and Applications, Radiomics and Machine Learning in Medical Imaging, Advanced X-ray and CT Imaging
Other links: Medical Physics (PDF)
Medical Physics (HTML)
PubMed (HTML)
Open access: bronze
Volume: 40
Issue: 12