Mind the Gaps: Comparing Vascular Quality Initiative and National Surgical Quality Improvement Program Populations & Outcomes in Open Infrainguinal Bypass
Lily E Johnston, Margaret C Tracci, John A Kern, Kenneth J Cherry, Irving L Kron, Gilbert R Upchurch, Jr.
University of Virginia, Charlottesville, VA
Both the Vascular Quality Initiative (VQI) and the National Surgical Quality Improvement Program Procedure Targeted (NSQIP-PT) databases aim to prospectively capture key variables related to vascular surgical procedures nationwide in order to track outcomes and improve quality. Both databases, however, are subject to bias from voluntary hospital participation, and only VQI captures all consecutive procedures within a procedural area (e.g., infrainguinal bypass, suprainguinal bypass), whereas NSQIP samples a portion of each institution's total cases. Only one published study has directly compared VQI and NSQIP data, and it was a single-institution study conducted before the NSQIP procedure-targeted vascular modules were available. As registry data become increasingly important in assessing institutional and provider-level performance, it is critical to understand what data are available, how reliable those data are, and how best to use our limited resources to efficiently and effectively measure both risk and performance. The objective of this study was to compare the populations and outcomes captured in VQI and NSQIP for the same procedures, to identify areas of similarity and discrepancy, and to describe the unique elements available in each dataset.
De-identified data from a regional, multi-state patient safety organization were provided by the Vascular Quality Initiative. Both general and vascular procedure-targeted national de-identified Public Use Files (PUFs) were provided by NSQIP. All patients captured in the VQI infrainguinal file and the NSQIP “lower extremity, open” file were eligible for inclusion. For NSQIP outcomes captured separately in the procedure-targeted and the general PUF files, concordance was assessed using the kappa statistic. Preoperative demographic variables and comorbidities, as well as outcomes including 30-day mortality, reoperation, bleeding, wound infection, myocardial infarction (MI) and stroke, were analyzed. For variables with different definitions between VQI and NSQIP-PT, a standardized definition was created to permit comparison across databases. Differences in categorical variables were tested using Pearson’s chi-squared test, differences in binary variables using two-sample tests of proportions, and differences in continuous variables using the Wilcoxon rank-sum test. Measures of central tendency are presented as medians with interquartile range.
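As a rough illustration of the two-sample test of proportions named above, the sketch below applies a pooled z-test to the 30-day mortality counts reported in the outcomes table (93/5273 for NSQIP and 27/1358 for VQI). The abstract does not say which variant of the test was used, so the pooled, uncorrected form and the function name `two_proportion_z_test` are assumptions made for illustration only.

```python
import math

def two_proportion_z_test(x1, n1, x2, n2):
    """Two-sided pooled z-test for equality of two proportions.

    Assumption: pooled standard error, no continuity correction.
    The abstract does not specify which variant the authors used.
    """
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided p-value from the standard normal distribution:
    # erfc(|z| / sqrt(2)) equals 2 * (1 - Phi(|z|)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# 30-day mortality counts as reported in the outcomes table:
# NSQIP 93/5273, VQI 27/1358.
z, p = two_proportion_z_test(93, 5273, 27, 1358)
print(f"z = {z:.2f}, p = {p:.2f}")
```

Under these assumptions the p-value is far above 0.05, in line with the non-significant mortality comparison reported in the results.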
To account for any baseline differences that may be due to regional variability relative to national averages rather than a true difference between the datasets, VQI records were propensity-matched 1:1 with NSQIP-PT records. Creating these matched cohorts allows a comparison of outcomes between databases that in theory adjusts for any case-mix differences. Propensity scores were derived from a model that included all demographic and comorbidity variables that were significantly different between the two datasets in unadjusted analyses. Standardized bias was used as the metric for adequate balance. Differences in outcomes were tested as previously described. Statistical significance for all tests was set at α=0.05.
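Standardized bias can be sketched for a single binary covariate as follows. The formula is one common definition (difference in proportions divided by the average-variance standard deviation, expressed as a percentage); the abstract does not state the exact formula used, so treat this as an assumption. The inputs are the rounded unmatched proportions given in the results for CHF (16% vs 3.1%) and diabetes (47% vs 45%), and the function name is hypothetical.

```python
import math

def standardized_bias_binary(p_a, p_b):
    """Standardized bias (%) between two groups for a binary covariate.

    Assumption: the common definition that divides the difference in
    proportions by the square root of the average of the two group
    variances. The abstract does not give the formula actually used.
    """
    pooled_sd = math.sqrt((p_a * (1 - p_a) + p_b * (1 - p_b)) / 2)
    return 100 * (p_a - p_b) / pooled_sd

# Rounded unmatched proportions from the results (VQI vs NSQIP).
chf_bias = standardized_bias_binary(0.16, 0.031)
diabetes_bias = standardized_bias_binary(0.47, 0.45)

print(f"CHF: {chf_bias:.0f}%  diabetes: {diabetes_bias:.0f}%")
```

With these inputs, CHF is badly imbalanced before matching while diabetes is already close, which is the kind of contrast a balance metric is meant to expose.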
Sample. The VQI database contained 1,358 records from 2011-2015. The NSQIP-PT database contained 5,613 records, of which 340 could not be matched to the general NSQIP PUFs, leaving a total of 5,273 complete NSQIP records from 2011-2013.
Intra-database reliability. Several outcomes are captured twice in NSQIP records, once in the procedure-targeted module and once in the general PUF. Using standardized definitions, patient-level agreement between NSQIP-PT and general NSQIP outcomes was evaluated for wound infection, composite MI and stroke, and bleeding. Agreement was 92% for wound infection (kappa = 0.56), 99% for any postoperative MI or stroke (kappa = 0.85), and 83% for postoperative bleeding (kappa = 0.52).
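The gap between raw agreement and kappa can be seen in a small sketch of Cohen's kappa for two binary codings of the same patients. The 2x2 counts below are invented for illustration and are not the study's data; they were chosen only to show that when an event is uncommon, high percent agreement can coexist with a moderate kappa, because much of the agreement is expected by chance.

```python
def agreement_and_kappa(both_yes, only_a, only_b, both_no):
    """Percent agreement and Cohen's kappa from a 2x2 agreement table."""
    n = both_yes + only_a + only_b + both_no
    observed = (both_yes + both_no) / n
    # Marginal "yes" rates for each source
    a_yes = (both_yes + only_a) / n
    b_yes = (both_yes + only_b) / n
    # Agreement expected by chance if the two sources were independent
    expected = a_yes * b_yes + (1 - a_yes) * (1 - b_yes)
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa

# HYPOTHETICAL counts (not from the study): 1,000 patients, event coded
# by both sources in 60, by only one source in 40 + 40, by neither in 860.
agreement, kappa = agreement_and_kappa(60, 40, 40, 860)
print(f"agreement = {agreement:.2f}, kappa = {kappa:.2f}")
```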
Patient populations. Patients in VQI are younger than in NSQIP (65  vs 68 , p<0.001), and have a slightly higher BMI (27.6 [8.3] vs 27.0 [7.5], p=0.003). VQI also had a higher proportion of recent smokers (46% vs 41%, p=0.001), patients with CHF (16% vs 3.1%, p<0.001) and COPD (29% vs 13%, p<0.001). VQI patients were also less likely to be on dialysis (3.9% vs 6.1%, p=0.003), or on preoperative aspirin (62% vs 79%, p<0.001) or statin therapy (63% vs 68%, p<0.001). Rates of diabetes were similar across databases (47% vs 45%, p=0.1). Prior ipsilateral percutaneous interventions/stents were recorded in 23% and 22% of VQI and NSQIP patients respectively (p=0.6), but VQI patients were much less likely to have had a prior ipsilateral bypass (15% vs 28%, p<0.001). Emergent cases accounted for 4.6% and 5.6% of VQI and NSQIP cases respectively (p=0.1).
Outcomes. Overall 30d mortality was equivalent between VQI and NSQIP (2.0% vs 1.8%, p=0.6; see table), as was composite MI/stroke (3.9% vs 3.2%, p=0.2). Major amputation, return to OR, and wound infection rates were higher in NSQIP relative to VQI (3.3% vs 1.6%, p=0.002; 16% vs 12%, p<0.001; and 13% vs 1.4%, p<0.001, respectively). Bleeding rates, however, were higher in VQI (37% vs 17%, p<0.001). Both total and post-procedure length of stay were equivalent between VQI and NSQIP (6  vs 6 , p=0.7, and 4  vs 5 , p=0.2).
Propensity-matched analysis. The propensity matching model generated 1,266 pairs of records that were well-matched on all included covariates (20% standardized bias or less). The c-statistic of the matching model was 0.77 prior to matching, and appropriately decreased to 0.56 in the matched sample. Similar to the unmatched cohorts, there was no statistically significant difference between VQI and NSQIP outcomes in 30-day mortality (1.9% vs 1.6%, p=0.5), or composite MI/stroke rates (3.9% vs 3.0%, p=0.2). In the matched cohort, the difference in major amputation rates between VQI and NSQIP is no longer significant (1.4% vs 2.4%, p=0.1). Wound infection and return to OR rates remain significantly higher in NSQIP (13% vs 1.2%, p<0.001; 15% vs 11%, p=0.008, respectively), while bleeding rates remain higher in VQI (36% vs 15%, p<0.001).
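The drop in the c-statistic toward 0.5 after matching can be illustrated with a minimal sketch. The c-statistic here is computed as the share of cross-group pairs in which the record from one database has the higher propensity score, with ties counted as one half. The scores are made up for illustration, and the function name is hypothetical; the abstract does not describe how its c-statistic was computed.

```python
def c_statistic(scores_group1, scores_group0):
    """Concordance: P(score in group 1 > score in group 0), ties = 0.5."""
    concordant = 0.0
    for s1 in scores_group1:
        for s0 in scores_group0:
            if s1 > s0:
                concordant += 1.0
            elif s1 == s0:
                concordant += 0.5
    return concordant / (len(scores_group1) * len(scores_group0))

# HYPOTHETICAL propensity scores, not study data.
# Before matching, the groups are well separated by the model.
c_before = c_statistic([0.8, 0.6, 0.4], [0.5, 0.3, 0.2])
# After matching on the score, the paired records have equal scores,
# so the model can no longer tell the groups apart.
c_after = c_statistic([0.5, 0.3], [0.5, 0.3])

print(f"before = {c_before:.2f}, after = {c_after:.2f}")
```

A value near 0.5 in the matched sample means the covariates in the model no longer discriminate between the two databases, which is the intended effect of matching.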
This is the first study to compare overall population and outcome differences between the VQI and NSQIP-PT registries. We find small but statistically significant differences in demographic and comorbidity variables as well as in outcomes between databases. Although many variables share a common definition between the two registries, some have definitions that may be different enough to account for these differences. For example, in NSQIP the CHF comorbidity variable specifies that the patient has been diagnosed or experienced symptoms in the last 30 days, whereas VQI makes a distinction between levels of symptom severity but does not address timing of symptoms. Among outcome measures, wound infection is another example of how differing definitions may account for discrepancies in incidence rates: NSQIP includes all wound infections, including superficial surgical site infections, whereas VQI defines the same complication as an infection with a positive culture or requiring antibiotic treatment. These differences in definitions between registries are critically important to understand, as these data form the basis for risk-adjustment and outcome results that now directly affect both facility and physician reimbursement as well as accreditation and credentialing; using data from different registries can produce markedly different results in some areas. For other measures where definitions are more clear-cut and events are less frequent, such as in-hospital mortality, MI and stroke, there is excellent consistency between the two cohorts.
Although the study is limited by the comparison of regional- to national-level data, this is an important first step in understanding the properties of these two seemingly similar registries. Moreover, there is not a substantial difference in outcome measures between the unadjusted and the propensity-matched analyses, suggesting that these differences are inherent to the datasets rather than the consequence of different patient populations resulting in unequal risks.
Prospective registries are now considered the gold standard for surgical epidemiological research, yet all registries with elective participation are subject to significant selection bias, and there is in fact no standard against which to compare the data from these registries. Therefore, it is not possible to say which of these two databases is “better”. Because both datasets are completely de-identified, it is not possible at this time to determine the degree of institutional overlap between VQI and NSQIP. It will be important to determine whether these registries are largely duplicative, and therefore an inefficient use of resources, or in fact represent different patient and provider populations, and therefore each incompletely captures the elements unique to the other. Elements unique to NSQIP-PT include predicted risks of morbidity and mortality, a more extensive comorbidity list, information on readmission, and procedural coding (CPT codes and work RVUs). In contrast, VQI’s strength is capturing long-term follow-up and highly detailed procedural elements. More detailed comparisons in the future will help inform the allocation of limited resources and optimize the data collection efforts in vascular surgery. At present, which dataset is “better” depends entirely on the question being asked. However, this study demonstrates that even when both registries can appropriately answer the question, the answers provided may be different.
|Table: Unadjusted and Adjusted Outcomes of Infrainguinal Bypass in VQI & NSQIP Registries|
|Outcome, N (%)||NSQIP (N=5273)||VQI (N=1358)||p-value||NSQIP, unmatched (n=4007)||NSQIP, matched (n=1266)||VQI (n=1266)||p-value*|
|30-day Mortality||93/5273 (1.8)||27/1358 (2.0)||0.6||73/4007 (1.8)||20/1266 (1.6)||24/1266 (1.9)||0.5|
|Any MI or Stroke||171/5272 (3.2)||53/1355 (3.9)||0.2||133/4006 (3.3)||38/1266 (3.0)||49/1264 (3.9)||0.2|
|Return to OR||847/5245 (16)||156/1354 (12)||<0.001||660/3990 (17)†||187/1255 (15)||143/1263 (11)||0.008|
|Bleeding||905/5272 (17)||496/1358 (37)||<0.001||709/4006 (18)†||196/1266 (16)||459/1266 (36)||<0.001|
|Major Amputation||101/3097 (3.3)||22/1355 (1.6)||0.002||83/2361 (3.5)†||18/736 (2.5)||18/1264 (1.4)||0.1|
|Wound Infection||674/5272 (13)||19/1356 (1.4)||<0.001||504/4006 (13)||170/1266 (13)||15/1265 (1.2)||<0.001|
*p-value given for the comparison between matched groups; †represents a statistically significant difference (p<0.05) from the NSQIP-matched group.
Back to 2016 Annual Meeting Program