Modeling health care costs is challenging because cost data are rarely normally distributed: many observations are $0, and costs among people who do use health care are right-skewed. Attributing costs to specific disease model states adds further complexity. Zhou et al. (2023) offer a tutorial on estimating the costs associated with disease model states using generalized linear models (GLMs).
Step 1: Preparing the dataset:
- Prepare data for discrete time periods and define disease states.
- Address issues like granular state definitions and multi-state scenarios.
- Handle censored data and missing cost data with appropriate methods.
- Map time periods to decision model cycles and transform data.
- Refer to the sample dataset in the tutorial for the expected layout.
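The preparation steps above can be sketched roughly as follows. This is a minimal illustration and not the tutorial's dataset: the field names, the 365-day cycle length, the state labels, and every value are assumptions made up for the example. Dropping partially observed cycles is only one simple way to deal with censoring; other approaches exist.

```python
from collections import defaultdict

CYCLE_DAYS = 365  # assumed cycle length (annual decision model cycles)

# Hypothetical raw cost records: (patient_id, days_since_entry, cost)
cost_records = [
    ("p1", 10, 1200.0),
    ("p1", 400, 300.0),
    ("p2", 35, 0.0),
    ("p2", 50, 800.0),
]

# Hypothetical day of first disease event per patient (None = no event)
event_day = {"p1": 380, "p2": None}

# Hypothetical follow-up in days before censoring or death
follow_up_days = {"p1": 730, "p2": 365}


def state_for_cycle(patient_id, cycle):
    """Return 'no_event', 'event' (cycle containing the event) or 'post_event'."""
    day = event_day[patient_id]
    if day is None:
        return "no_event"
    event_cycle = day // CYCLE_DAYS
    if cycle < event_cycle:
        return "no_event"
    return "event" if cycle == event_cycle else "post_event"


def build_panel():
    """Build one row per patient per fully observed cycle, keeping $0 cycles."""
    totals = defaultdict(float)
    for pid, day, cost in cost_records:
        totals[(pid, day // CYCLE_DAYS)] += cost

    rows = []
    for pid, days in follow_up_days.items():
        n_cycles = days // CYCLE_DAYS  # partially observed cycles are dropped
        for cycle in range(n_cycles):
            rows.append({
                "patient_id": pid,
                "cycle": cycle,
                "state": state_for_cycle(pid, cycle),
                "cost": totals.get((pid, cycle), 0.0),
            })
    return rows


panel = build_panel()
for row in panel:
    print(row)
```

Each row of `panel` is one patient-cycle with a single state label and a total cost, which is the shape a cost regression on disease states would typically consume.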
Step 2: Model selection:
- Use a two-part model within a generalized linear model framework.
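- Relate the expected cost to the covariates nonlinearly through the GLM's link function.
- Specify the link function and the error term distribution.
- Combine the GLM with a two-part model: one part models the probability of incurring any cost, the other the level of cost among observations with positive costs.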
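For orientation, the standard textbook form of a GLM and of a two-part model is written out below. These are not copied from the tutorial: the notation (Y for cost, X for covariates, α and β for coefficients, g for the link function) is chosen here for illustration, and the logit link for the first part and the log link for the second part are common choices assumed for the example.

```latex
\begin{aligned}
g\big(\mathbb{E}[Y \mid X]\big) &= X\beta
  && \text{(GLM with link function } g\text{)} \\
\Pr(Y > 0 \mid X) &= \frac{1}{1 + \exp(-X\alpha)}
  && \text{(part 1: any cost, logit link)} \\
\mathbb{E}[Y \mid Y > 0, X] &= \exp(X\beta)
  && \text{(part 2: positive costs, log link)} \\
\mathbb{E}[Y \mid X] &= \Pr(Y > 0 \mid X)\,\mathbb{E}[Y \mid Y > 0, X]
  && \text{(two-part expected cost)}
\end{aligned}
```

The last line is the key identity: the overall expected cost is the probability of incurring any cost multiplied by the expected cost among those who incur one.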
Step 3: Selecting the final model:
- Consider which covariates are included and evaluate model fit.
- Explore covariate interactions and alternative selection techniques.
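As one way to make "evaluate model fit" concrete, the sketch below compares candidate models by mean absolute error (MAE) and root mean squared error (RMSE). These metrics are a common choice rather than necessarily the criteria used in the tutorial, and the model names, observed costs, and predictions are all hypothetical values invented for the example. In practice such metrics are often computed on held-out data.

```python
import math

# Hypothetical observed costs and predictions from two candidate models.
observed = [0.0, 120.0, 950.0, 4300.0]
predictions = {
    "gamma_log": [40.0, 150.0, 900.0, 4000.0],
    "gaussian_identity": [-200.0, 300.0, 1100.0, 3500.0],
}


def mean_absolute_error(y, y_hat):
    """Average absolute difference between observed and predicted costs."""
    return sum(abs(a - b) for a, b in zip(y, y_hat)) / len(y)


def root_mean_squared_error(y, y_hat):
    """Square root of the average squared prediction error."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y, y_hat)) / len(y))


# Compute both metrics for every candidate model.
fit = {
    name: {
        "mae": mean_absolute_error(observed, y_hat),
        "rmse": root_mean_squared_error(observed, y_hat),
    }
    for name, y_hat in predictions.items()
}

# Pick the candidate with the lowest RMSE (an assumed selection rule).
best = min(fit, key=lambda name: fit[name]["rmse"])

for name, metrics in fit.items():
    print(name, metrics)
print("lowest RMSE:", best)
```

Note that the hypothetical identity-link model predicts a negative cost for the first observation, which is one reason link functions that keep predictions positive are attractive for cost data.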
Step 4: Model prediction:
- Derive marginal effects using recycled prediction for one-part or two-part models.
- Calculate the difference in mean costs for scenarios of interest.
- Work through the tutorial's illustrative example, which models hospital costs associated with cardiovascular events in the UK.
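The prediction steps above can be sketched for a two-part model as follows. This is a minimal illustration under stated assumptions: the coefficients are hypothetical numbers, not estimates from the tutorial or from any dataset, the covariates (`age`, `event`) and the cohort are invented, and a logit link for part 1 and a log link for part 2 are assumed. Recycled prediction sets the covariate of interest to the same value for everyone, predicts, and averages; the difference between two such averages is the marginal effect.

```python
import math

# Hypothetical coefficients (NOT fitted estimates).
# Part 1: logistic model for the probability of any cost.
PART1 = {"intercept": -0.5, "age": 0.02, "event": 1.5}
# Part 2: log-link GLM for mean cost among those with positive costs.
PART2 = {"intercept": 6.0, "age": 0.01, "event": 1.0}

# Hypothetical cohort; `event` is 1 if the patient is in the event state.
cohort = [
    {"age": 55, "event": 0},
    {"age": 68, "event": 1},
    {"age": 74, "event": 0},
]


def linear_predictor(coefs, row):
    """Intercept plus the sum of coefficient * covariate value."""
    return coefs["intercept"] + sum(coefs[name] * row[name] for name in row)


def expected_cost(row):
    """Two-part prediction: Pr(cost > 0) * E[cost | cost > 0]."""
    p_any = 1.0 / (1.0 + math.exp(-linear_predictor(PART1, row)))
    mean_if_any = math.exp(linear_predictor(PART2, row))
    return p_any * mean_if_any


def recycled_mean(rows, event):
    """Mean predicted cost with `event` set to one value for every patient."""
    preds = [expected_cost({**row, "event": event}) for row in rows]
    return sum(preds) / len(preds)


mean_with_event = recycled_mean(cohort, 1)
mean_without_event = recycled_mean(cohort, 0)
incremental_cost = mean_with_event - mean_without_event

print("mean cost, everyone in event state:", round(mean_with_event, 2))
print("mean cost, no one in event state:", round(mean_without_event, 2))
print("difference in mean costs:", round(incremental_cost, 2))
```

For a one-part model the same procedure applies with `expected_cost` reduced to the single GLM prediction. Uncertainty around the difference (for example via bootstrapping) is not covered by this sketch.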
The authors also provide R code for further exploration.