Session 5: 4-State Markov Model — Chronic Kidney Disease

Early screening and ACE-inhibitor intervention vs standard care

1. The Clinical Question

Chronic kidney disease (CKD) affects approximately 17% of the Indian population, with about 6% having stage 3 or worse. Most patients in India present late — over 50% are first seen when eGFR is already below 15 ml/min (stage 5), by which point dialysis or transplantation is needed at enormous cost.

The HTA question: Is early population-based screening for CKD followed by ACE-inhibitor therapy cost-effective compared to standard care (where CKD is detected only when symptomatic or incidental)?

This is a Markov cohort model — the most common model type for chronic diseases that progress through stages over time.

2. What Is a Markov Model?

A Markov model tracks a cohort of patients through defined health states over multiple time cycles (usually years). At each cycle, patients can:

  • Stay in their current state
  • Move to a worse state (progression)
  • Move to a better state (rare in CKD, but possible with treatment)
  • Die

The movement between states is governed by transition probabilities. These probabilities are the heart of the model.

3. Our 4-State Model

Code
library(DiagrammeR)

grViz("
digraph ckd_markov {
  graph [rankdir=LR, bgcolor='transparent', fontname='Helvetica', nodesep=0.8]
  node [fontname='Helvetica', fontsize=11, style='filled,rounded', shape=box]

  E [label='Early CKD\n(Stage 1-2)\neGFR ≥ 60', fillcolor='#59a14f', fontcolor='white', width=1.8]
  M [label='Moderate CKD\n(Stage 3)\neGFR 30-59', fillcolor='#f28e2b', fontcolor='white', width=1.8]
  A [label='Advanced CKD\n/ Dialysis\n(Stage 4-5)', fillcolor='#e15759', fontcolor='white', width=1.8]
  D [label='Death', fillcolor='#76b7b2', fontcolor='white', shape=doublecircle, width=1.2]

  E -> M [label='p_12', fontsize=10]
  M -> A [label='p_23', fontsize=10]
  E -> D [label='p_1d', fontsize=10]
  M -> D [label='p_2d', fontsize=10]
  A -> D [label='p_3d', fontsize=10]

  E -> E [label='stay', style=dashed, fontsize=9]
  M -> M [label='stay', style=dashed, fontsize=9]
  A -> A [label='stay', style=dashed, fontsize=9]
}
")
Figure 1: CKD Markov model: health states and transitions

The four states are:

  1. Early CKD (Stage 1-2): eGFR ≥60, often asymptomatic. Managed with lifestyle modification and renoprotective drugs.
  2. Moderate CKD (Stage 3): eGFR 30-59. Requires specialist referral, medication optimisation.
  3. Advanced CKD / Dialysis (Stage 4-5): eGFR <30. Requires dialysis or transplantation.
  4. Death: Absorbing state — patients cannot leave.

4. Model Parameters

Code
library(knitr)

# ============================================================
# MODEL PARAMETERS — CKD Markov Model
# ============================================================

# --- Cohort ---
n_cohort <- 10000   # Cohort of 10,000 patients with early CKD
n_cycles <- 20      # 20-year time horizon
cycle_length <- 1   # Annual cycles

# --- Starting Distribution ---
# All patients start in Early CKD (detected through screening)
state_names <- c("Early CKD", "Moderate CKD", "Advanced/Dialysis", "Death")
n_states <- length(state_names)

# --- Annual Transition Probabilities: STANDARD CARE ---
# Source: Adapted from international CKD progression literature
# (Go et al. 2004 NEJM; Kerr et al. 2012; SEEK study India)
# Adjusted for Indian demographics (younger onset, higher diabetes burden)

# From Early CKD (Stage 1-2)
p_12_standard <- 0.10   # Early → Moderate (annual)
p_1d_standard <- 0.02   # Early → Death (background + CKD-related)

# From Moderate CKD (Stage 3)
p_23_standard <- 0.12   # Moderate → Advanced (annual)
p_2d_standard <- 0.04   # Moderate → Death

# From Advanced CKD / Dialysis (Stage 4-5)
p_3d_standard <- 0.15   # Advanced → Death (annual)
# Source: Indian dialysis mortality ~15-25% annually
# (Rajapurkar et al. 2012; Indian CKD Registry)

# --- Costs (₹ per year) ---
# Source: Indian hospital data; PMJAY package rates; Jeloka et al. 2007;
#         Inside CKD India projections (Lancet eClinicalMedicine 2024)

# State costs (annual)
cost_early    <- 12000    # OPD visits, basic labs, lifestyle counselling
cost_moderate <- 45000    # Specialist visits, medications, frequent labs
cost_advanced <- 350000   # Dialysis (2-3 sessions/week at public hospital rates)
cost_death    <- 0

# Intervention-specific costs
cost_screening  <- 500     # One-time population screening cost per person
cost_ace_annual <- 6000    # ACE-inhibitor + monitoring (generic enalapril/ramipril)

# --- Utility Weights (annual QALYs) ---
# Source: Adapted from Gorodetskaya et al. 2005; Tajima et al. 2010
# Indian EQ-5D data limited; disclaimer applies
utility_early    <- 0.85
utility_moderate <- 0.72
utility_advanced <- 0.55   # Dialysis significantly reduces quality of life
utility_death    <- 0.00

# --- Discount Rate ---
discount_rate <- 0.03  # 3% annual for both costs and QALYs

# --- WTP Threshold ---
wtp_india <- 170000  # ~1x GDP per capita India

Parameter summary

Code
# Display parameters as a clean table
param_table <- data.frame(
  Parameter = c("Cohort size", "Time horizon", "Discount rate",
                "p(Early→Moderate)", "p(Early→Death)",
                "p(Moderate→Advanced)", "p(Moderate→Death)",
                "p(Advanced→Death)",
                "Cost: Early CKD", "Cost: Moderate CKD",
                "Cost: Advanced/Dialysis", "Cost: Screening (one-time)",
                "Cost: ACE-inhibitor (annual)",
                "Utility: Early", "Utility: Moderate",
                "Utility: Advanced", "Utility: Death"),
  Value = c("10,000", "20 years", "3%",
            "0.10", "0.02", "0.12", "0.04", "0.15",
            "₹12,000", "₹45,000", "₹3,50,000", "₹500", "₹6,000",
            "0.85", "0.72", "0.55", "0.00"),
  Source = c("Assumption", "Standard for chronic disease", "Indian HTA guidelines",
             "Go 2004; SEEK India", "Background + CKD-related",
             "Go 2004; Kerr 2012", "CKD-related excess mortality",
             "Indian CKD Registry; Rajapurkar 2012",
             "PMJAY rates", "Specialist care estimates",
             "Public hospital dialysis", "Population screening", "Generic enalapril",
             "Gorodetskaya 2005", "Gorodetskaya 2005",
             "Tajima 2010; Indian EQ-5D limited", "Convention")
)

kable(param_table, align = "llr",
      caption = "Model parameters with sources")
Model parameters with sources
Parameter Value Source
Cohort size 10,000 Assumption
Time horizon 20 years Standard for chronic disease
Discount rate 3% Indian HTA guidelines
p(Early→Moderate) 0.10 Go 2004; SEEK India
p(Early→Death) 0.02 Background + CKD-related
p(Moderate→Advanced) 0.12 Go 2004; Kerr 2012
p(Moderate→Death) 0.04 CKD-related excess mortality
p(Advanced→Death) 0.15 Indian CKD Registry; Rajapurkar 2012
Cost: Early CKD ₹12,000 PMJAY rates
Cost: Moderate CKD ₹45,000 Specialist care estimates
Cost: Advanced/Dialysis ₹3,50,000 Public hospital dialysis
Cost: Screening (one-time) ₹500 Population screening
Cost: ACE-inhibitor (annual) ₹6,000 Generic enalapril
Utility: Early 0.85 Gorodetskaya 2005
Utility: Moderate 0.72 Gorodetskaya 2005
Utility: Advanced 0.55 Tajima 2010; Indian EQ-5D limited
Utility: Death 0.00 Convention

5. Hazard Ratio to Probability Conversion

Clinical trials report treatment effects as hazard ratios (HR). An HR of 0.66 means the ACE-inhibitor reduces the rate of progression by 34%. But our Markov model needs probabilities, not rates. How do we convert?

The key relationship between probability (\(p\)) and rate (\(r\)) over one cycle is:

\[p = 1 - e^{-r}\]

And the reverse:

\[r = -\ln(1 - p)\]

To apply a hazard ratio, we: (1) convert the baseline probability to a rate, (2) multiply the rate by HR, and (3) convert back to a probability. This simplifies to:

\[p_{\text{intervention}} = 1 - (1 - p_{\text{standard}})^{HR}\]

Code
# --- Treatment Effect: HR = 0.66 ---
treatment_effect_hr <- 0.66

# Step-by-step conversion for Early → Moderate
rate_12 <- -log(1 - p_12_standard)                     # Step 1: prob → rate
rate_12_int <- rate_12 * treatment_effect_hr            # Step 2: apply HR
p_12_intervention <- 1 - exp(-rate_12_int)              # Step 3: rate → prob

# This is mathematically equivalent to the shortcut:
p_12_shortcut <- 1 - (1 - p_12_standard)^treatment_effect_hr

# Verify they match
kable(data.frame(
  Transition = c("Early → Moderate", "Moderate → Advanced"),
  Standard = c(p_12_standard, p_23_standard),
  `Step-by-step` = c(p_12_intervention, 1 - exp(-(-log(1 - p_23_standard)) * treatment_effect_hr)),
  Shortcut = c(p_12_shortcut, 1 - (1 - p_23_standard)^treatment_effect_hr),
  check.names = FALSE
), digits = 6, caption = "HR → probability conversion: both methods agree")
HR → probability conversion: both methods agree
Transition Standard Step-by-step Shortcut
Early → Moderate 0.10 0.067175 0.067175
Moderate → Advanced 0.12 0.080909 0.080909
Code
# Use the shortcut going forward
p_12_intervention <- 1 - (1 - p_12_standard)^treatment_effect_hr
p_23_intervention <- 1 - (1 - p_23_standard)^treatment_effect_hr

# Mortality not directly affected by ACE-inhibitor
p_1d_intervention <- p_1d_standard
p_2d_intervention <- p_2d_standard
p_3d_intervention <- p_3d_standard
WarningCommon Mistake

You cannot simply multiply the probability by the HR: 0.10 × 0.66 = 0.066 is wrong. That approach underestimates the true intervention probability (correct value: 0.0672). The error grows larger with higher baseline probabilities. Always use the rate-based conversion.

6. Building the Transition Matrices

A transition matrix is a table where each row represents the current state and each column represents where patients can go. Each row must sum to 1.0 (everyone goes somewhere).

Step 1: Create the scaffold

Code
# Start with an empty 4×4 matrix of zeros
tm_standard <- matrix(0, nrow = n_states, ncol = n_states,
                      dimnames = list(state_names, state_names))

# At this point, every cell is 0:
kable(tm_standard, digits = 2,
      caption = "Step 1: Empty scaffold — all zeros")
Step 1: Empty scaffold — all zeros
Early CKD Moderate CKD Advanced/Dialysis Death
Early CKD 0 0 0 0
Moderate CKD 0 0 0 0
Advanced/Dialysis 0 0 0 0
Death 0 0 0 0

Step 2: Populate with transition probabilities

You can address cells by state name (clearer, self-documenting) or by row/column number (shorter). Both are equivalent:

Code
# --- Method A: State-name indexing (recommended for clarity) ---
# From Early CKD
tm_standard["Early CKD", "Moderate CKD"] <- p_12_standard       # progression
tm_standard["Early CKD", "Death"]        <- p_1d_standard       # mortality
tm_standard["Early CKD", "Early CKD"]    <- 1 - p_12_standard - p_1d_standard  # stay

# From Moderate CKD
tm_standard["Moderate CKD", "Advanced/Dialysis"] <- p_23_standard
tm_standard["Moderate CKD", "Death"]             <- p_2d_standard
tm_standard["Moderate CKD", "Moderate CKD"]      <- 1 - p_23_standard - p_2d_standard

# From Advanced/Dialysis
tm_standard["Advanced/Dialysis", "Death"]             <- p_3d_standard
tm_standard["Advanced/Dialysis", "Advanced/Dialysis"]  <- 1 - p_3d_standard

# Death is absorbing
tm_standard["Death", "Death"] <- 1

# --- Method B: Numeric indexing (equivalent) ---
# tm_standard[1, 2] <- p_12_standard      # row 1 = Early, col 2 = Moderate
# tm_standard[1, 4] <- p_1d_standard      # row 1 = Early, col 4 = Death
# tm_standard[1, 1] <- 1 - p_12_standard - p_1d_standard
# ... and so on

kable(tm_standard, digits = 4,
      caption = "Standard Care transition matrix — completed")
Standard Care transition matrix — completed
Early CKD Moderate CKD Advanced/Dialysis Death
Early CKD 0.88 0.10 0.00 0.02
Moderate CKD 0.00 0.84 0.12 0.04
Advanced/Dialysis 0.00 0.00 0.85 0.15
Death 0.00 0.00 0.00 1.00
TipThe “Stay” Probability

The diagonal (stay probability) is always 1 minus all outgoing probabilities. For Early CKD: 1 - 0.10 - 0.02 = 0.88. A common bug is writing 1 - p_12 and forgetting to subtract p_1d — that would make the row sum exceed 1.0.

Row-sum verification

Code
# Every row must sum to exactly 1.0
kable(data.frame(
  State = state_names,
  `Row Sum` = rowSums(tm_standard),
  check.names = FALSE
), caption = "Row-sum check: all rows must equal 1.000")
Row-sum check: all rows must equal 1.000
State Row Sum
Early CKD Early CKD 1
Moderate CKD Moderate CKD 1
Advanced/Dialysis Advanced/Dialysis 1
Death Death 1

Intervention matrix

Code
# Build intervention matrix — same structure, different progression rates
tm_intervention <- matrix(0, nrow = n_states, ncol = n_states,
                          dimnames = list(state_names, state_names))

tm_intervention["Early CKD", "Moderate CKD"] <- p_12_intervention
tm_intervention["Early CKD", "Death"]        <- p_1d_intervention
tm_intervention["Early CKD", "Early CKD"]    <- 1 - p_12_intervention - p_1d_intervention

tm_intervention["Moderate CKD", "Advanced/Dialysis"] <- p_23_intervention
tm_intervention["Moderate CKD", "Death"]             <- p_2d_intervention
tm_intervention["Moderate CKD", "Moderate CKD"]      <- 1 - p_23_intervention - p_2d_intervention

tm_intervention["Advanced/Dialysis", "Death"]             <- p_3d_intervention
tm_intervention["Advanced/Dialysis", "Advanced/Dialysis"]  <- 1 - p_3d_intervention

tm_intervention["Death", "Death"] <- 1

kable(tm_intervention, digits = 4,
      caption = "Intervention transition matrix — ACE-inhibitor slows progression")
Intervention transition matrix — ACE-inhibitor slows progression
Early CKD Moderate CKD Advanced/Dialysis Death
Early CKD 0.9128 0.0672 0.0000 0.02
Moderate CKD 0.0000 0.8791 0.0809 0.04
Advanced/Dialysis 0.0000 0.0000 0.8500 0.15
Death 0.0000 0.0000 0.0000 1.00
TipKey Insight: The Matrices

The two matrices differ only in the progression probabilities (Early→Moderate and Moderate→Advanced). The treatment slows progression but does not change mortality directly. This is a realistic representation of what ACE-inhibitors do — they buy time, keeping patients in less costly, higher-quality health states for longer.

Transition matrix heatmap

Code
library(tidyr)
library(ggplot2)
Warning: package 'ggplot2' was built under R version 4.5.2
Code
tm_df_std <- as.data.frame(as.table(tm_standard))
names(tm_df_std) <- c("From", "To", "Probability")
tm_df_std$Strategy <- "Standard Care"

tm_df_int <- as.data.frame(as.table(tm_intervention))
names(tm_df_int) <- c("From", "To", "Probability")
tm_df_int$Strategy <- "Screening + ACE-inhibitor"

tm_combined <- rbind(tm_df_std, tm_df_int)

ggplot(tm_combined, aes(x = To, y = From, fill = Probability)) +
  geom_tile(colour = "white", linewidth = 1) +
  geom_text(aes(label = round(Probability, 3)), size = 3) +
  facet_wrap(~Strategy) +
  scale_fill_gradient(low = "white", high = "#e15759") +
  labs(title = "Transition Probability Heatmaps",
       subtitle = "Darker = higher probability of transition") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1, size = 8),
        axis.text.y = element_text(size = 8),
        legend.position = "bottom")
Figure 2: Transition matrix heatmap — Standard Care vs Intervention

7. Matrix Multiplication — How the Simulation Works

The simulation engine is a single line of R: trace[cycle + 1, ] <- trace[cycle, ] %*% tm. But what does %*% actually compute? Let’s break it down from first principles.

The dimension rule: m×n · n×p → m×p

Matrix multiplication is only possible when the inner dimensions match. The result takes its rows from the first matrix and columns from the second:

\[\underbrace{A}_{m \times n} \times \underbrace{B}_{n \times p} = \underbrace{C}_{m \times p}\]

In our Markov model, the state vector (1 row × 4 columns) multiplied by the transition matrix (4 rows × 4 columns) gives a new state vector (1 row × 4 columns):

\[\underbrace{\text{trace}}_{1 \times 4} \times \underbrace{\text{TM}}_{4 \times 4} = \underbrace{\text{new trace}}_{1 \times 4}\]

The inner dimension (4 = 4) matches — that’s the number of health states. Each element in the result comes from one dot product — a row from the left meeting a column from the right.

The “handshake”: how one result cell is computed

To compute a single cell in the result, take the row from the first matrix and the column from the second matrix. Stand them face-to-face, multiply each pair of corresponding elements, and add up all the products. That sum is the cell value.

Code
# ---------------------------------------------------------------
# HANDSHAKE VISUALISATION
# We show how result[1, "Early CKD"] is computed:
#   trace row meets TM column "Early CKD" face-to-face
# Then repeat for "Moderate CKD" column to show the pattern
# ---------------------------------------------------------------

par(mfrow = c(1, 2), mar = c(1, 1, 3, 1))

# --- Helper function to draw one handshake ---
draw_handshake <- function(row_vals, col_vals, row_labels,
                           result_val, result_label, title_text) {
  plot(NULL, xlim = c(0, 10), ylim = c(-1.5, 6.5), axes = FALSE, ann = FALSE)
  title(main = title_text, cex.main = 1.0, font.main = 2)

  # Column headers
  text(2.2, 6.2, "Trace Row", cex = 0.85, font = 2, col = "#4e79a7")
  text(5.0, 6.2, expression(symbol("\264")), cex = 1.0)
  text(7.5, 6.2, "TM Column", cex = 0.85, font = 2, col = "#e15759")

  # Draw each pair face-to-face
  box_cols <- c("#59a14f", "#f28e2b", "#e15759", "#76b7b2")

  for (i in 1:4) {
    y <- 5.5 - i * 1.3

    # Left box: trace row element
    rect(0.5, y - 0.4, 3.5, y + 0.4,
         col = paste0(box_cols[i], "25"), border = box_cols[i], lwd = 1.5)
    text(2.0, y, format(row_vals[i], big.mark = ","), cex = 0.95, font = 2)

    # Multiplication sign between
    text(4.8, y, expression(symbol("\264")), cex = 1.2, col = "grey40")

    # Right box: TM column element
    rect(6.0, y - 0.4, 9.0, y + 0.4,
         col = paste0(box_cols[i], "25"), border = box_cols[i], lwd = 1.5)
    text(7.5, y, sprintf("%.2f", col_vals[i]), cex = 0.95, font = 2)

    # State label on far right
    text(9.6, y, row_labels[i], cex = 0.65, col = "grey50")

    # Product on far left
    product <- row_vals[i] * col_vals[i]
    text(0.1, y, format(round(product, 1), big.mark = ","),
         cex = 0.7, col = "grey50", adj = 1)
  }

  # Sum line and result
  segments(0.5, -0.3, 9.0, -0.3, lwd = 2, col = "black")
  text(1.0, -0.5, "SUM", cex = 0.8, font = 2, adj = 0)
  text(4.8, -0.5, "=", cex = 1.5, font = 2)

  # Result box
  rect(6.0, -1.0, 9.0, -0.0, col = "#4e79a730", border = "#4e79a7", lwd = 2)
  text(7.5, -0.5, format(round(result_val, 1), big.mark = ","),
       cex = 1.2, font = 2, col = "#4e79a7")
  text(7.5, -1.3, result_label, cex = 0.8, font = 3, col = "#4e79a7")
}

# --- Trace row (cycle 0): all 10,000 in Early CKD ---
trace_row <- c(10000, 0, 0, 0)
labels <- c("Early", "Moderate", "Advanced", "Death")

# Handshake 1: trace row × TM column "Early CKD"
tm_col_early <- tm_standard[, "Early CKD"]
draw_handshake(trace_row, tm_col_early, labels,
               sum(trace_row * tm_col_early),
               "→ Early CKD at cycle 1",
               "Result cell: Early CKD")

# Handshake 2: trace row × TM column "Moderate CKD"
tm_col_mod <- tm_standard[, "Moderate CKD"]
draw_handshake(trace_row, tm_col_mod, labels,
               sum(trace_row * tm_col_mod),
               "→ Moderate CKD at cycle 1",
               "Result cell: Moderate CKD")
Figure 3: The handshake: row from trace meets column from TM, multiply pairs, sum to get result
TipThe Pattern

To fill every cell in the result, you repeat this handshake — one for each column of the transition matrix. Since we have 4 states, that’s 4 handshakes per cycle. R’s %*% operator does all 4 at once.

Cycle 0 → 1: All four results

Code
# --- Cycle 0: Starting state ---
trace_0 <- c(10000, 0, 0, 0)
names(trace_0) <- state_names

# R computes all 4 handshakes in one line:
trace_1 <- trace_0 %*% tm_standard
trace_1 <- as.numeric(trace_1)
names(trace_1) <- state_names

# Show the element-by-element computation
handshake_table <- data.frame(
  `Result Cell` = state_names,
  `Dot Product` = c(
    "10000×0.88 + 0×0 + 0×0 + 0×0",
    "10000×0.10 + 0×0.84 + 0×0 + 0×0",
    "10000×0 + 0×0.12 + 0×0.85 + 0×0",
    "10000×0.02 + 0×0.04 + 0×0.15 + 0×1"
  ),
  `= Count` = format(trace_1, big.mark = ","),
  check.names = FALSE
)

kable(handshake_table,
      caption = "Cycle 0 → 1: each result cell is one row × column handshake")
Cycle 0 → 1: each result cell is one row × column handshake
Result Cell Dot Product = Count
Early CKD Early CKD 10000×0.88 + 0×0 + 0×0 + 0×0 8,800
Moderate CKD Moderate CKD 10000×0.10 + 0×0.84 + 0×0 + 0×0 1,000
Advanced/Dialysis Advanced/Dialysis 10000×0 + 0×0.12 + 0×0.85 + 0×0 0
Death Death 10000×0.02 + 0×0.04 + 0×0.15 + 0×1 200

Cycle 1 → 2: Now it gets interesting

At cycle 1, patients are spread across Early (8,800) and Moderate (1,000) and Death (200). The handshakes now have multiple non-zero terms:

Code
par(mfrow = c(1, 2), mar = c(1, 1, 3, 1))

# Handshake: trace_1 × TM column "Moderate CKD"
tm_col_mod <- tm_standard[, "Moderate CKD"]
draw_handshake(trace_1, tm_col_mod, labels,
               sum(trace_1 * tm_col_mod),
               "→ Moderate CKD at cycle 2",
               "Moderate CKD (cycle 2)")

# Handshake: trace_1 × TM column "Death"
tm_col_death <- tm_standard[, "Death"]
draw_handshake(trace_1, tm_col_death, labels,
               sum(trace_1 * tm_col_death),
               "→ Death at cycle 2",
               "Death (cycle 2)")
Figure 4: Cycle 1 → 2: with patients in multiple states, the dot products have more non-zero terms
NoteWhat Changed?

At cycle 0, only one row element was non-zero (10,000 in Early CKD), so each handshake was trivial. By cycle 1, patients occupy multiple states, so the dot products genuinely combine contributions from different states — that’s where the model gets its power.

First 4 cycles — the full trace

Code
# Cycle 1 → 2
trace_2 <- as.numeric(trace_1 %*% tm_standard)
names(trace_2) <- state_names

# Cycle 2 → 3
trace_3 <- as.numeric(trace_2 %*% tm_standard)
names(trace_3) <- state_names

# Show the first 4 cycles as a table
first_cycles <- data.frame(
  Cycle = 0:3,
  `Early CKD` = c(trace_0[1], trace_1[1], trace_2[1], trace_3[1]),
  `Moderate CKD` = c(trace_0[2], trace_1[2], trace_2[2], trace_3[2]),
  `Advanced/Dialysis` = c(trace_0[3], trace_1[3], trace_2[3], trace_3[3]),
  Death = c(trace_0[4], trace_1[4], trace_2[4], trace_3[4]),
  check.names = FALSE
)

kable(first_cycles, digits = 1,
      caption = "First 4 cycles: watch the cohort flow through states")
First 4 cycles: watch the cohort flow through states
Cycle Early CKD Moderate CKD Advanced/Dialysis Death
0 10000.0 0.0 0.0 0.0
1 8800.0 1000.0 0.0 200.0
2 7744.0 1720.0 120.0 416.0
3 6814.7 2219.2 308.4 657.7
NoteWhat to Notice
  • Early CKD shrinks every cycle (patients leave faster than they arrive — no one enters Early CKD)
  • Moderate CKD grows then peaks — it receives patients from Early but loses them to Advanced and Death
  • Advanced/Dialysis appears at cycle 2 — it takes 2 cycles for the first patients to progress through Early → Moderate → Advanced
  • Death accumulates — it’s absorbing, so it only grows
  • Every row sums to 10,000 — no patients are created or destroyed

8. Running the Full Simulation

Now we automate this for all 20 cycles using a loop. The run_markov() function also computes costs, QALYs, half-cycle correction, and discounting — we’ll explain each of these explicitly in the following sections.

Code
# --- Markov Cohort Simulation ---
# Step 1: Create the trace matrix (21 rows × 4 columns)
trace_std <- matrix(0, nrow = n_cycles + 1, ncol = n_states,
                    dimnames = list(paste0("Cycle_", 0:n_cycles), state_names))
trace_std[1, 1] <- n_cohort  # All start in Early CKD

trace_int <- trace_std  # Same starting point for intervention

# Step 2: Run the simulation loop
for (cycle in 1:n_cycles) {
  trace_std[cycle + 1, ] <- trace_std[cycle, ] %*% tm_standard
  trace_int[cycle + 1, ] <- trace_int[cycle, ] %*% tm_intervention
}

The Markov trace table (Standard Care)

This is the raw output of the simulation — the number of patients in each state at each cycle:

Code
# Show first 6 and last 3 cycles
trace_display <- as.data.frame(round(trace_std, 1))
trace_display$Cycle <- 0:n_cycles
trace_display$Total <- rowSums(trace_std)
trace_display <- trace_display[, c("Cycle", state_names, "Total")]

# Show selected rows
rows_to_show <- c(1:6, 11, 16, 21)  # cycles 0-5, 10, 15, 20
kable(trace_display[rows_to_show, ], digits = 1, row.names = FALSE,
      caption = "Markov trace (Standard Care): patient counts per state")
Markov trace (Standard Care): patient counts per state
Cycle Early CKD Moderate CKD Advanced/Dialysis Death Total
0 10000.0 0.0 0.0 0.0 10000
1 8800.0 1000.0 0.0 200.0 10000
2 7744.0 1720.0 120.0 416.0 10000
3 6814.7 2219.2 308.4 657.7 10000
4 5997.0 2545.6 528.4 929.0 10000
5 5277.3 2738.0 754.6 1230.0 10000
10 2785.0 2590.0 1570.7 3054.3 10000
15 1469.7 1845.7 1699.4 4985.1 10000
20 775.6 1174.3 1429.6 6620.5 10000
NoteVerify: Total column

Every row sums to exactly 10,000 — the cohort is closed (no patients enter or leave the model, they just move between states). This is a useful sanity check.

9. Half-Cycle Correction

The simulation tells us the state at the start of each cycle (cycle 0, 1, 2, …). But patients transition during the cycle, not instantaneously at the start or end. Half-cycle correction approximates the average occupancy during each cycle by averaging the start and end counts:

\[\text{HCC}_t = \frac{\text{trace}_{t-1} + \text{trace}_t}{2}\]

Code
# Half-cycle corrected trace: average start and end of each cycle
trace_hcc_std <- (trace_std[1:n_cycles, ] + trace_std[2:(n_cycles + 1), ]) / 2
trace_hcc_int <- (trace_int[1:n_cycles, ] + trace_int[2:(n_cycles + 1), ]) / 2

# Show the difference for cycle 1
kable(data.frame(
  Measure = c("Start of cycle 1 (trace row 1)",
              "End of cycle 1 (trace row 2)",
              "Half-cycle corrected average"),
  `Early CKD` = c(trace_std[1, 1], trace_std[2, 1],
                   trace_hcc_std[1, 1]),
  `Moderate CKD` = c(trace_std[1, 2], trace_std[2, 2],
                      trace_hcc_std[1, 2]),
  `Adv/Dialysis` = c(trace_std[1, 3], trace_std[2, 3],
                      trace_hcc_std[1, 3]),
  Death = c(trace_std[1, 4], trace_std[2, 4],
            trace_hcc_std[1, 4]),
  check.names = FALSE
), digits = 1, caption = "Half-cycle correction for cycle 1: averaging start and end")
Half-cycle correction for cycle 1: averaging start and end
Measure Early CKD Moderate CKD Adv/Dialysis Death
Start of cycle 1 (trace row 1) 10000 0 0 0
End of cycle 1 (trace row 2) 8800 1000 0 200
Half-cycle corrected average 9400 500 0 100
TipWhy Does This Matter?

Without half-cycle correction, we’d either assume all transitions happen at the start of the year (overestimating time in new states) or at the end (underestimating). The average is a better approximation of reality. The effect is small per cycle but accumulates over 20 years. This is a standard technical requirement in published HTA models.

10. Discounting

Society values benefits today more than benefits in the future — a concept called time preference. In HTA, we account for this by discounting future costs and QALYs. India’s standard discount rate is 3% per year.

The discount factor for cycle \(t\) (starting from 0) is:

\[d_t = \frac{1}{(1 + r)^t}\]

where \(r\) = 0.03. This means ₹1,00,000 in year 10 is worth only ₹74,409 in today’s terms.

Code
# Discount factors for each cycle
discount_factors <- 1 / (1 + discount_rate)^(0:(n_cycles - 1))

# Show the first few
kable(data.frame(
  Cycle = 1:10,
  `Discount Factor` = round(discount_factors[1:10], 4),
  `₹1,00,000 worth today` = paste0("₹", format(round(100000 * discount_factors[1:10]), big.mark = ",")),
  check.names = FALSE
), caption = "Discount factors: ₹1,00,000 in future years expressed in today's value")
Discount factors: ₹1,00,000 in future years expressed in today’s value
Cycle Discount Factor ₹1,00,000 worth today
1 1.0000 ₹100,000
2 0.9709 ₹ 97,087
3 0.9426 ₹ 94,260
4 0.9151 ₹ 91,514
5 0.8885 ₹ 88,849
6 0.8626 ₹ 86,261
7 0.8375 ₹ 83,748
8 0.8131 ₹ 81,309
9 0.7894 ₹ 78,941
10 0.7664 ₹ 76,642

11. Costs, QALYs, and ICER

Now we combine the half-cycle corrected trace with costs, utilities, and discount factors to compute total costs and QALYs for each strategy:

Code
# Cost and utility vectors
costs_standard     <- c(cost_early, cost_moderate, cost_advanced, cost_death)
costs_intervention <- c(cost_early + cost_ace_annual,   # ACE-inhibitor added
                        cost_moderate + cost_ace_annual,
                        cost_advanced,   # ACE-inhibitor usually stopped in advanced CKD
                        cost_death)
utilities <- c(utility_early, utility_moderate, utility_advanced, utility_death)

# Cycle costs = HCC trace × cost vector (matrix multiplication)
cycle_costs_std <- trace_hcc_std %*% costs_standard
cycle_costs_int <- trace_hcc_int %*% costs_intervention

# Cycle QALYs = HCC trace × utility vector
cycle_qalys_std <- trace_hcc_std %*% utilities
cycle_qalys_int <- trace_hcc_int %*% utilities

# Apply discounting
disc_costs_std <- cycle_costs_std * discount_factors
disc_costs_int <- cycle_costs_int * discount_factors
disc_qalys_std <- cycle_qalys_std * discount_factors
disc_qalys_int <- cycle_qalys_int * discount_factors

# Total across all cycles
total_cost_std <- sum(disc_costs_std)
total_cost_int <- sum(disc_costs_int) + n_cohort * cost_screening  # add one-time screening
total_qaly_std <- sum(disc_qalys_std)
total_qaly_int <- sum(disc_qalys_int)

# ICER
inc_cost <- total_cost_int - total_cost_std
inc_qaly <- total_qaly_int - total_qaly_std
icer <- inc_cost / inc_qaly

Results summary

Code
results_summary <- data.frame(
  Strategy = c("Standard Care", "Screening + ACE-inhibitor", "Incremental"),
  `Total Cost (₹)` = c(
    format(round(total_cost_std), big.mark = ","),
    format(round(total_cost_int), big.mark = ","),
    format(round(inc_cost), big.mark = ",")
  ),
  `Total QALYs` = c(
    round(total_qaly_std, 1),
    round(total_qaly_int, 1),
    round(inc_qaly, 1)
  ),
  `Per Patient Cost` = c(
    paste0("₹", format(round(total_cost_std / n_cohort), big.mark = ",")),
    paste0("₹", format(round(total_cost_int / n_cohort), big.mark = ",")),
    paste0("₹", format(round(inc_cost / n_cohort), big.mark = ","))
  ),
  `Per Patient QALYs` = c(
    round(total_qaly_std / n_cohort, 4),
    round(total_qaly_int / n_cohort, 4),
    round(inc_qaly / n_cohort, 4)
  ),
  check.names = FALSE
)

kable(results_summary, align = "lrrrr",
      caption = "Cost-effectiveness results (20-year horizon, 3% discounting)")
Cost-effectiveness results (20-year horizon, 3% discounting)
Strategy Total Cost (₹) Total QALYs Per Patient Cost Per Patient QALYs
Standard Care 8,065,724,874 84707.5 ₹806,572 8.4707
Screening + ACE-inhibitor 6,473,703,963 92356.5 ₹647,370 9.2357
Incremental -1,592,020,911 7649.1 ₹-159,202 0.7649
Code
# -------------------------------------------------------
# ICER interpretation — handle the negative ICER carefully
# -------------------------------------------------------
# A negative ICER can mean TWO very different things:
#   1. inc_cost < 0 AND inc_qaly > 0 → DOMINANT (saves money, gains health)
#   2. inc_cost > 0 AND inc_qaly < 0 → DOMINATED (costs more, loses health)
# Both give a negative ratio, but the conclusions are opposite!
# This is why many HTA guidelines recommend AGAINST relying on the ICER alone
# and instead use Net Monetary Benefit (NMB).

icer_formatted <- format(round(icer), big.mark = ",")

interpretation <- if (inc_cost < 0 & inc_qaly > 0) {
  "DOMINANT — the intervention saves costs AND improves health"
} else if (inc_cost > 0 & inc_qaly < 0) {
  "DOMINATED — the intervention costs more AND worsens health"
} else if (inc_cost > 0 & inc_qaly > 0) {
  # Standard ICER quadrant (NE): compare to WTP
  if (icer < wtp_india) {
    paste0("Cost-effective (ICER < WTP of ₹", format(wtp_india, big.mark = ","), ")")
  } else {
    "NOT cost-effective at conventional thresholds"
  }
} else {
  # inc_cost < 0, inc_qaly < 0 (SW quadrant): intervention cheaper but worse
  paste0("Trade-off: saves money but loses QALYs (ICER = ₹", icer_formatted, ")")
}

kable(data.frame(
  Metric = c("Incremental Cost", "Incremental QALYs", "ICER",
             "WTP Threshold (1× GDP/capita)", "Conclusion"),
  Value = c(paste0("₹", format(round(inc_cost / n_cohort), big.mark = ","), " per patient"),
            paste0(round(inc_qaly / n_cohort, 4), " per patient"),
            paste0("₹", icer_formatted, " per QALY gained"),
            paste0("₹", format(wtp_india, big.mark = ",")),
            interpretation),
  check.names = FALSE
), caption = "Cost-effectiveness conclusion")
Cost-effectiveness conclusion
Metric Value
Incremental Cost ₹-159,202 per patient
Incremental QALYs 0.7649 per patient
ICER ₹-208,132 per QALY gained
WTP Threshold (1× GDP/capita) ₹170,000
Conclusion DOMINANT — the intervention saves costs AND improves health
WarningWhy a Negative ICER Is Ambiguous

The ICER is a ratio: ΔCost ÷ ΔQALYs. A negative ratio appears when the numerator and denominator have opposite signs — but that happens in two situations:

ΔQALYs > 0 (better) ΔQALYs < 0 (worse)
ΔCost > 0 (costlier) NE quadrant: standard ICER SE quadrant: DOMINATED
ΔCost < 0 (cheaper) NW quadrant: DOMINANT SW quadrant: trade-off

Both DOMINANT and DOMINATED produce a negative ICER, but the conclusions are opposite. Always check the signs of ΔCost and ΔQALYs separately — never rely on the ICER sign alone. This is exactly why the Net Monetary Benefit (NMB) was developed: it gives an unambiguous single number.

Net Monetary Benefit (NMB)

The NMB converts QALYs into monetary terms using the WTP threshold, then subtracts costs. A positive incremental NMB means the intervention is cost-effective — no sign ambiguity.

\[\text{NMB} = (\text{WTP} \times \text{QALYs}) - \text{Cost}\]

\[\Delta\text{NMB} = \text{NMB}_{\text{intervention}} - \text{NMB}_{\text{standard}} = (\text{WTP} \times \Delta\text{QALYs}) - \Delta\text{Cost}\]

If ΔNMB > 0 → cost-effective. If ΔNMB < 0 → not cost-effective. No quadrant confusion.

Code
# Net Monetary Benefit per patient
nmb_std <- (wtp_india * total_qaly_std / n_cohort) - (total_cost_std / n_cohort)
nmb_int <- (wtp_india * total_qaly_int / n_cohort) - (total_cost_int / n_cohort)
inc_nmb <- nmb_int - nmb_std

# Equivalently: WTP × ΔQALYs − ΔCost
inc_nmb_check <- wtp_india * (inc_qaly / n_cohort) - (inc_cost / n_cohort)

kable(data.frame(
  Metric = c("NMB Standard Care", "NMB Intervention",
             "Incremental NMB (ΔNMB)", "Decision"),
  Value = c(
    paste0("₹", format(round(nmb_std), big.mark = ",")),
    paste0("₹", format(round(nmb_int), big.mark = ",")),
    paste0("₹", format(round(inc_nmb), big.mark = ",")),
    if (inc_nmb > 0) "ADOPT — positive ΔNMB" else "REJECT — negative ΔNMB"
  ),
  check.names = FALSE
), caption = paste0("Net Monetary Benefit analysis (WTP = ₹", format(wtp_india, big.mark = ","), "/QALY)"))
Net Monetary Benefit analysis (WTP = ₹170,000/QALY)
Metric Value
NMB Standard Care ₹633,454
NMB Intervention ₹922,691
Incremental NMB (ΔNMB) ₹289,236
Decision ADOPT — positive ΔNMB
TipWhen to Use NMB vs ICER

The ICER is great for communication — “the intervention costs ₹X per QALY gained” is intuitive. But for decision-making, NMB is safer: it’s always unambiguous, and it’s essential for probabilistic sensitivity analysis (PSA) where you need to compare across thousands of iterations — you can’t average ICERs, but you can average NMBs.

Rule of thumb: Report the ICER in your results, but use NMB for the actual decision and for sensitivity analysis.

Code
library(gridExtra)

qaly_data <- data.frame(
  Strategy = c("Standard Care", "Screening +\nACE-inhibitor"),
  Total_QALY = c(total_qaly_std / n_cohort, total_qaly_int / n_cohort),
  Total_Cost = c(total_cost_std / n_cohort, total_cost_int / n_cohort)
)

p1 <- ggplot(qaly_data, aes(x = Strategy, y = Total_QALY, fill = Strategy)) +
  geom_col(show.legend = FALSE) +
  geom_text(aes(label = round(Total_QALY, 2)), vjust = -0.5, size = 4) +
  scale_fill_manual(values = c("#e15759", "#4e79a7")) +
  labs(x = "", y = "QALYs per Patient", title = "Discounted QALYs") +
  theme_minimal()

p2 <- ggplot(qaly_data, aes(x = Strategy, y = Total_Cost / 1000, fill = Strategy)) +
  geom_col(show.legend = FALSE) +
  geom_text(aes(label = paste0("₹", format(round(Total_Cost), big.mark = ","))),
            vjust = -0.5, size = 3.5) +
  scale_fill_manual(values = c("#e15759", "#4e79a7")) +
  labs(x = "", y = "Cost per Patient (₹ thousands)", title = "Discounted Costs") +
  theme_minimal()

grid.arrange(p1, p2, ncol = 2)
Figure 5: Discounted costs and QALYs per patient by strategy

12. Visualising the Markov Trace

Code
trace_df_std <- as.data.frame(trace_std / n_cohort)
trace_df_std$Cycle <- 0:n_cycles
trace_df_std$Strategy <- "Standard Care"

trace_df_int <- as.data.frame(trace_int / n_cohort)
trace_df_int$Cycle <- 0:n_cycles
trace_df_int$Strategy <- "Screening + ACE-inhibitor"

trace_combined <- rbind(trace_df_std, trace_df_int)

trace_long <- pivot_longer(trace_combined,
                           cols = all_of(state_names),
                           names_to = "State",
                           values_to = "Proportion")

trace_long$State <- factor(trace_long$State, levels = state_names)

ggplot(trace_long, aes(x = Cycle, y = Proportion, fill = State)) +
  geom_area(alpha = 0.8) +
  facet_wrap(~Strategy) +
  scale_fill_manual(values = c("Early CKD" = "#59a14f",
                                "Moderate CKD" = "#f28e2b",
                                "Advanced/Dialysis" = "#e15759",
                                "Death" = "#76b7b2")) +
  labs(
    x = "Year", y = "Proportion of Cohort",
    title = "Markov Trace: CKD Progression Over 20 Years",
    subtitle = "Comparing standard care vs early screening + ACE-inhibitor"
  ) +
  theme_minimal() +
  theme(legend.position = "bottom")
Figure 6: Markov trace: cohort distribution over 20 years
TipReading the Markov Trace

The green area (Early CKD) shrinks over time as patients progress. The intervention keeps more patients in green for longer — that’s the visual signature of a drug that slows progression. The red area (Advanced/Dialysis) grows more slowly in the intervention arm. Since dialysis costs ₹3,50,000/year, this delay translates directly into cost savings.

13. Patients Reaching Dialysis

Code
dialysis_data <- data.frame(
  Year = rep(0:n_cycles, 2),
  Proportion = c(trace_std[, "Advanced/Dialysis"] / n_cohort,
                 trace_int[, "Advanced/Dialysis"] / n_cohort),
  Strategy = rep(c("Standard Care", "Screening + ACE-inhibitor"), each = n_cycles + 1)
)

ggplot(dialysis_data, aes(x = Year, y = Proportion * 100, colour = Strategy)) +
  geom_line(linewidth = 1.2) +
  labs(x = "Year", y = "% of Cohort on Dialysis",
       title = "Patients Reaching Advanced CKD / Dialysis",
       subtitle = "Early intervention delays progression to this costly state") +
  scale_colour_manual(values = c("Standard Care" = "#e15759",
                                  "Screening + ACE-inhibitor" = "#4e79a7")) +
  theme_minimal() +
  theme(legend.position = "bottom")
Figure 7: Proportion of cohort in Advanced CKD/Dialysis over time

14. Cost Accumulation Over Time

Code
cumcost_std <- cumsum(disc_costs_std) / n_cohort
cumcost_int <- cumsum(disc_costs_int) / n_cohort + cost_screening

cost_over_time <- data.frame(
  Year = rep(1:n_cycles, 2),
  Cumulative_Cost = c(cumcost_std, cumcost_int),
  Strategy = rep(c("Standard Care", "Screening + ACE-inhibitor"), each = n_cycles)
)

ggplot(cost_over_time, aes(x = Year, y = Cumulative_Cost / 1000, colour = Strategy)) +
  geom_line(linewidth = 1.2) +
  labs(x = "Year", y = "Cumulative Cost per Patient (₹ thousands)",
       title = "Cost Accumulation: Standard Care vs Intervention",
       subtitle = "Intervention costs more early but saves through avoided dialysis") +
  scale_colour_manual(values = c("Standard Care" = "#e15759",
                                  "Screening + ACE-inhibitor" = "#4e79a7")) +
  theme_minimal() +
  theme(legend.position = "bottom")
Figure 8: Cumulative discounted costs per patient over 20 years

15. Quick Sensitivity Check

Before declaring a result “cost-effective”, we should ask: how sensitive is the ICER to our assumptions? Here we vary the two most influential parameters — dialysis cost and treatment efficacy — one at a time.

NotePreview Only

This is a brief demonstration. Session 6 covers Probabilistic Sensitivity Analysis (PSA) in full depth, including simultaneous variation of all uncertain parameters.

Code
# Helper function to run one arm (avoids repeating the full loop)
run_arm <- function(tm, cost_vec, n_cohort, n_cycles, utilities, discount_rate) {
  trace <- matrix(0, nrow = n_cycles + 1, ncol = 4)
  trace[1, 1] <- n_cohort
  for (c in 1:n_cycles) trace[c + 1, ] <- trace[c, ] %*% tm
  hcc <- (trace[1:n_cycles, ] + trace[2:(n_cycles + 1), ]) / 2
  df <- 1 / (1 + discount_rate)^(0:(n_cycles - 1))
  list(cost = sum((hcc %*% cost_vec) * df),
       qaly = sum((hcc %*% utilities) * df))
}

# --- One-Way SA: Dialysis Cost ---
dialysis_range <- seq(100000, 600000, by = 50000)
icer_by_dialysis <- numeric(length(dialysis_range))

for (i in seq_along(dialysis_range)) {
  cs <- c(cost_early, cost_moderate, dialysis_range[i], 0)
  ci <- c(cost_early + cost_ace_annual, cost_moderate + cost_ace_annual, dialysis_range[i], 0)
  rs <- run_arm(tm_standard, cs, n_cohort, n_cycles, utilities, discount_rate)
  ri <- run_arm(tm_intervention, ci, n_cohort, n_cycles, utilities, discount_rate)
  ri$cost <- ri$cost + n_cohort * cost_screening
  icer_by_dialysis[i] <- (ri$cost - rs$cost) / (ri$qaly - rs$qaly)
}

# --- One-Way SA: Treatment Effect (HR) ---
hr_range <- seq(0.40, 0.95, by = 0.05)
icer_by_hr <- numeric(length(hr_range))

for (i in seq_along(hr_range)) {
  p12i <- 1 - (1 - p_12_standard)^hr_range[i]
  p23i <- 1 - (1 - p_23_standard)^hr_range[i]
  tm_sa <- tm_standard
  tm_sa[1, 2] <- p12i; tm_sa[1, 1] <- 1 - p12i - p_1d_standard
  tm_sa[2, 3] <- p23i; tm_sa[2, 2] <- 1 - p23i - p_2d_standard
  rs <- run_arm(tm_standard, costs_standard, n_cohort, n_cycles, utilities, discount_rate)
  ri <- run_arm(tm_sa, costs_intervention, n_cohort, n_cycles, utilities, discount_rate)
  ri$cost <- ri$cost + n_cohort * cost_screening
  icer_by_hr[i] <- (ri$cost - rs$cost) / (ri$qaly - rs$qaly)
}

# --- Plot Both ---
par(mfrow = c(1, 2), mar = c(5, 5, 3, 1))

plot(dialysis_range / 1000, icer_by_dialysis / 1000, type = "b", pch = 19,
     col = "#4e79a7", xlab = "Annual Dialysis Cost (₹ thousands)",
     ylab = "ICER (₹ thousands / QALY)", main = "ICER vs Dialysis Cost")
abline(h = wtp_india / 1000, col = "red", lty = 2)
text(max(dialysis_range) / 1000, wtp_india / 1000, "WTP (1× GDP/capita)",
     pos = 1, col = "red", cex = 0.8)

plot(hr_range, icer_by_hr / 1000, type = "b", pch = 19,
     col = "#e15759", xlab = "Treatment Effect (Hazard Ratio)",
     ylab = "ICER (₹ thousands / QALY)", main = "ICER vs Treatment Efficacy")
abline(h = wtp_india / 1000, col = "red", lty = 2)
text(max(hr_range), wtp_india / 1000, "WTP (1× GDP/capita)",
     pos = 1, col = "red", cex = 0.8)

One-way sensitivity analysis: ICER vs dialysis cost and treatment effect
TipWhat This Tells Us

The ICER is most sensitive to dialysis cost — the more expensive dialysis is, the more money the intervention saves by keeping patients out of that state. The treatment effect also matters: as the hazard ratio approaches 1.0 (no effect), the ICER rises steeply. Notice that all ICERs here remain negative (dominant) — in PSA, where ICERs can flip between quadrants across iterations, we’d use NMB instead.

16. Cross-Validation Checkpoint

A critical step in any HTA model is cross-validation — verifying your R results against an independent implementation. We provide an Excel companion model (CKD-Markov-Model.xlsx) for this purpose.

Code
# Cross-validation reference table
xval <- data.frame(
  Metric = c("Standard: Cost/patient", "Standard: QALYs/patient",
             "Intervention: Cost/patient", "Intervention: QALYs/patient",
             "Incremental cost/patient", "Incremental QALYs/patient", "ICER"),
  `R Value` = c(
    paste0("₹", format(round(total_cost_std / n_cohort), big.mark = ",")),
    round(total_qaly_std / n_cohort, 4),
    paste0("₹", format(round(total_cost_int / n_cohort), big.mark = ",")),
    round(total_qaly_int / n_cohort, 4),
    paste0("₹", format(round(inc_cost / n_cohort), big.mark = ",")),
    round(inc_qaly / n_cohort, 4),
    paste0("₹", format(round(icer), big.mark = ","))
  ),
  `Excel Value` = rep("← check", 7),
  Match = rep("?", 7),
  check.names = FALSE
)

kable(xval, caption = "Cross-validation checklist: fill in Excel values and verify match")
Cross-validation checklist: fill in Excel values and verify match
Metric R Value Excel Value Match
Standard: Cost/patient ₹806,572 ← check ?
Standard: QALYs/patient 8.4707 ← check ?
Intervention: Cost/patient ₹647,370 ← check ?
Intervention: QALYs/patient 9.2357 ← check ?
Incremental cost/patient ₹-159,202 ← check ?
Incremental QALYs/patient 0.7649 ← check ?
ICER ₹-208,132 ← check ?
ImportantWhy Cross-Validate?

If your R ICER and your Excel ICER differ by more than ₹1–2 (rounding), something is wrong. Common sources of mismatch: forgetting half-cycle correction in Excel, discount factor indexing off by one cycle, or computing the “stay” probability as 1 - p_progression instead of 1 - p_progression - p_death.

17. What You Just Did

You built a complete Markov cohort model — the most common model type in HTA worldwide — step by step:

  1. Health states — the distinct conditions patients can be in
  2. HR → probability conversion — translating clinical trial evidence into model inputs
  3. Transition matrix — scaffold, populate, verify row sums
  4. Matrix multiplication — the one-line simulation engine
  5. Markov trace — raw counts table + area chart
  6. Half-cycle correction — averaging cycle start/end for accuracy
  7. Discounting — applying time preference at 3% per year
  8. ICER — incremental cost per QALY gained, with careful sign handling
  9. NMB — Net Monetary Benefit for unambiguous decision-making
  10. One-way sensitivity analysis — testing robustness to key assumptions

In R, the core simulation is a single line: trace[cycle + 1, ] <- trace[cycle, ] %*% tm — matrix multiplication handles all transitions at once.

Next: Open ckd-exercise.qmd to explore how changing parameters affects the model, then try the Session 5 Bonus: heemod Package to see the same model built with a dedicated Markov modelling package.

Key References

  • Go AS et al. (2004). Chronic kidney disease and the risks of death and hospitalisation. NEJM.
  • Singh AK et al. (2013). Epidemiology and risk factors of CKD in India — SEEK study. BMC Nephrology.
  • Rajapurkar MM et al. (2012). CKD in India: a clarion call for change. Lancet.
  • ACC/AHA (2024). Risk of initiating ACEi/ARBs in advanced CKD.
  • Inside CKD Investigators (2024). Economic burden of CKD at the patient level. Lancet eClinicalMedicine.
  • Hou FF et al. (2006). ACE inhibitor in progressive renal disease. NEJM.
  • Indian CKD Registry. BMC Nephrology 2012.