Theses and Dissertations--Epidemiology and Biostatistics

CLUSTERING HOSPITAL PERFORMANCE USING GROUP-BASED MULTI-TRAJECTORY MODELING WITH SINGULAR BAYESIAN INFORMATION CRITERION

Gaixin Du, University of Kentucky

Author ORCID Identifier

https://orcid.org/0000-0002-7929-2426

Date Available

5-10-2025

Year of Publication

2023

Degree Name

Doctor of Philosophy (PhD)

Document Type

Doctoral Dissertation

College

Public Health

Department/School/Program

Epidemiology and Biostatistics

First Advisor

Richard J. Charnigo

Abstract

Hospital performance is complex and patient-experience oriented. Currently, the Centers for Medicare & Medicaid Services (CMS) evaluates hospitals yearly with a single score of one to five ("Star Rating") using composite measures from five domains. However, a single composite score cannot fully describe hospital performance, and alternative measures should be considered. In addition, validating the effectiveness of healthcare quality improvement requires long-term data. Group-based multi-trajectory modeling (GBMTM) estimates probabilities of latent group membership based on longitudinal profiles from multiple outcomes. We use GBMTM, implemented in SAS PROC TRAJ, to identify groups of hospitals with similar performance.

We downloaded data from CMS on Medicare-eligible hospitals (N=5,111) that provided patient care between 2012 and 2021. Our results suggested that hospitals could be classified into three subgroups: 1) “small size-high patient rating” (n=2,515 hospitals, ~100 beds), with the lowest readmissions, lowest safety risk, low payment value, high patient rating, and, perhaps surprisingly, high mortality rate; 2) “large size-medium patient rating” (n=2,063 hospitals, ~240 beds), with all outcomes ranked in the middle: medium mortality rate, medium readmission rate, etc.; and 3) “large size-low patient rating” (n=533 hospitals, ~230 beds), which were more often for-profit, with the highest readmissions, safety risk, and payment value, the lowest patient rating, but the lowest mortality rate. Hospital performance trends are parallel across subgroups, with similar slopes for all outcomes.

Because the group-based trajectory model relies on iterative numerical methods, convergence issues can limit its application. We run Monte Carlo simulations that vary the sample size, number of classes, number of outcomes, number of time points, mixing probability, variance, and percentage of outliers to explore convergence issues in GBMTM under the multivariate normal distribution. We investigate whether capping and scaling, applied separately and together, reduce the risk of convergence failure. Our simulations show that the number of outcomes and the number of classes have the largest effect on the risk of convergence failure. More time points, more severely unbalanced mixing probabilities, and greater variance also increase the chance of convergence failure. Increasing the sample size, or the percentage of outliers within the range of 0 to 3%, decreases the risk of convergence issues. Capping alone does not reduce convergence issues, whereas scaling down decreases the risk. When we apply capping after scaling down, about 30% of cases have a lower convergence risk than with scaling down only.

Model selection is a critical step in hospital clustering. However, GBMTM suffers from non-identifiability when models are compared with a log-likelihood ratio. The singular BIC (sBIC) overcomes this difficulty by approximating the marginal likelihood averaged over the Bayesian model, which yields an analytical solution based on algebraic geometry. We calculate the learning coefficients and multiplicities for GBMTM and obtain two criteria, named sBIC11 and sBIC13. sBIC11, with a lighter penalty, is closer to AIC, and sBIC13, with a heavier penalty, is closer to regular BIC. sBIC13 tends to select fewer subgroups than AIC but more subgroups than BIC. In the hospital quality data, the sBICs agree with AIC and regular BIC, classifying hospitals into three subgroups. Simulation results showed that sBIC13 identified the true number of classes more often than BIC, by more than 1%. We also see that the probability of identifying the correct model is related to the sample size, the number of data points collected, and the number of outcome measures. More unbalanced mixing probabilities and larger within-group variance reduce the chance of identifying the correct model. Our results showed that the sBICs perform competitively. With their relaxed penalty, the sBICs can identify the true number of classes more often.

Digital Object Identifier (DOI)

https://doi.org/10.13023/etd.2023.192

Recommended Citation

Du, Gaixin, "CLUSTERING HOSPITAL PERFORMANCE USING GROUP-BASED MULTI-TRAJECTORY MODELING WITH SINGULAR BAYESIAN INFORMATION CRITERION" (2023). Theses and Dissertations--Epidemiology and Biostatistics. 38.
https://uknowledge.uky.edu/epb_etds/38

Table S4.1.xlsx (23 kB)
Figure S4.1 Four-way interaction on model selection.docx (3673 kB)

Available for download on Saturday, May 10, 2025

Included in

Data Science Commons, Health Services Research Commons, Longitudinal Data Analysis and Time Series Commons, Multivariate Analysis Commons, Quality Improvement Commons
