Session 7: Using Generative AI for HTA Coding
Structured prompts + validated packages = reliable, production-ready models
What This Session Is
This is a practical session on using AI tools (such as Claude, ChatGPT, Gemini, or Copilot) to support your R-based HTA work. It is one tool among many — alongside documentation, package vignettes, Stack Overflow, and colleagues.
This session is not about AI doing your HTA for you. The clinical reasoning, model structure, parameter selection, and interpretation remain yours. AI can help with the R code that implements your thinking.
The GenAI-Assisted HTA Workflow
When AI Is Useful for HTA Work
AI tools are most helpful when you:
- Know what you want to calculate but not how to write it in R
- Have working code but need to modify it (add a parameter, change a distribution, create a new plot)
- Get an error message and need help understanding it
- Want to convert Excel-based model logic into R code
- Need to look up which R function or package does something specific
- Want to generate a Shiny dashboard from a validated model
AI tools are not reliable for:
- Choosing model structures or clinical parameters (that is your expertise)
- Validating whether a model is correct (always check the logic yourself)
- Replacing peer review or expert consultation
- Selecting appropriate survival distributions without seeing data
The Key Principle: Package-Based Prompting
You’ve seen both approaches in this workshop:
- Session 5: You built a Markov model from scratch — matrix multiplication, half-cycle correction, discounting, all by hand. This taught you the mechanics.
- Session 5 Bonus: You rebuilt the same model with heemod — the package handled half-cycle correction, discounting, DSA, and PSA automatically.
When you use AI to generate code, you want the heemod approach. A naive prompt like “write a Markov model” produces raw loops where a single indexing error goes unnoticed. Telling the AI “use the heemod package” constrains it to a validated API.
Always include this sentence in your prompt:
“Use the [package name] package. Do not write manual loops.”
Which Package for Which Model?
| Model Type | Package | Key Functions |
|---|---|---|
| Decision Tree | rdecision | DecisionNode, ChanceNode, LeafNode |
| Markov Cohort Model | heemod | define_parameters → define_state → define_transition → run_model |
| Partitioned Survival Model | flexsurv + manual wrapper | flexsurvreg() for fitting; write run_psm() for partitioning |
| DSA | Same as base case package | heemod: define_dsa() + run_dsa() / others: dampack::owsa() |
| PSA | Same as base case package | heemod: define_psa() + run_psa() / flexsurv: normboot.flexsurvreg() |
| CE Visualisation | BCEA or dampack | bcea() → CE plane, CEAC, EVPI in one call |
The Real-World HTA Workflow
In practice, HTA researchers don’t start by writing code. They start by collecting parameters from literature, trial reports, and expert opinion — and storing them in a file. The AI workflow should mirror this.
Step 1: Prepare Your Parameter File
Before touching any AI tool, create a structured Excel workbook. This is your single source of truth — the AI reads it, your model reads it, and your publication references it.
Sheet 1: Model Settings — model type, states, time horizon, cycle length, discount rate, cohort size, WTP threshold
Sheet 2: Transition Probabilities — from/to, standard care value, intervention value, source citation
Sheet 3: Costs and Utilities — state, cost per arm, utility weight, source citation
Sheet 4: SA Distributions — parameter name, distribution type (Beta/Gamma/LogNormal), shape parameters, DSA low/high values
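To make the sheet layout concrete, here is a minimal sketch of what the first three sheets look like once read into R (for example via readxl::read_excel()). All values and column names below are illustrative placeholders, not recommendations — your own file will differ.

```r
# Illustrative contents of Sheets 2 and 3 after reading into R.
# All numbers are made up for demonstration only.
transitions <- data.frame(
  from   = c("Healthy", "Healthy", "Sick"),
  to     = c("Sick",    "Dead",    "Dead"),
  p_soc  = c(0.15, 0.02, 0.10),          # standard care
  p_int  = c(0.10, 0.02, 0.10),          # intervention
  source = c("Trial X", "Life tables", "Registry Y")
)

costs_utilities <- data.frame(
  state    = c("Healthy", "Sick"),
  cost_soc = c(1000, 5000),
  cost_int = c(3000, 5000),
  utility  = c(0.85, 0.60),
  source   = c("Costing study", "EQ-5D study")
)

# Sanity checks worth running the moment the file is read,
# whether by you or by the AI
stopifnot(all(transitions$p_soc >= 0 & transitions$p_soc <= 1))
stopifnot(all(transitions$p_int >= 0 & transitions$p_int <= 1))
stopifnot(all(costs_utilities$utility >= 0 & costs_utilities$utility <= 1))
```

Structuring the sheets as tidy one-row-per-item tables like this is what lets the AI map columns to package arguments directly.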
When you hand the AI a structured file, it maps your parameters to the package API directly. This produces far better code than describing parameters in prose — and your parameter file doubles as supplementary material for your publication.
Step 2: Base Case — Get It Working First
Upload your parameter file and use a short prompt. The file carries the detail — the prompt just says what to do with it.
PROMPT — Markov Base Case
I've attached my parameter file (Excel). Build a Markov cohort
cost-effectiveness model using the heemod package in R.
Read all parameters from the attached file. Use define_parameters,
define_state with discount(), define_transition with C complement,
and run_model with method="life-table". Do not write manual loops.
One-time costs: use ifelse(model_time == 1, cost, 0) in define_state.
Report ICER with 4-quadrant interpretation and Net Monetary Benefit.
Output tables with kable(), plots with ggplot2.
PROMPT — PSM Base Case
I've attached my parameter file (Excel). Build a partitioned survival
model using flexsurv for survival curves and a clean run_psm()
wrapper for costs/QALYs.
Apply HR by scaling lambda (proportional hazards):
S_int(t) = exp(-(lambda × HR) × t^gamma). NOT the AFT form.
Ensure PFS never exceeds OS. Apply half-cycle correction and
3% discounting. Report ICER and NMB. Use kable() and ggplot2.
PROMPT — Decision Tree Base Case
I've attached my parameter file (Excel). Build a cost-effectiveness
decision tree using the rdecision package in R.
Read parameters from the file. Use DecisionNode, ChanceNode, and
LeafNode objects. Report expected cost, QALYs, ICER, and NMB
per strategy. Use kable() and ggplot2.
Step 3: Validate
Before moving to sensitivity analysis, cross-validate the base case:
- Do the total costs and QALYs match your hand calculation or Excel model?
- Does the Markov trace make sense? (cohort sums to N at every cycle?)
- Is the ICER in the right ballpark compared to published literature?
- Are discount factors applied correctly?
If using heemod, the results should match your Session 5 manual model within rounding. If they don’t, fix before proceeding.
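Several of these checks can be automated. The sketch below assumes the Markov trace is stored as a matrix with one row per cycle and one column per state; the transition matrix and cohort size are hypothetical.

```r
# Hypothetical 3-state model: rows of P must each sum to 1
N <- 1000
P <- matrix(c(0.85, 0.13, 0.02,
              0.00, 0.90, 0.10,
              0.00, 0.00, 1.00),
            nrow = 3, byrow = TRUE,
            dimnames = list(NULL, c("Healthy", "Sick", "Dead")))

trace <- matrix(0, nrow = 41, ncol = 3,
                dimnames = list(NULL, colnames(P)))
trace[1, ] <- c(N, 0, 0)
for (cycle in 2:41) trace[cycle, ] <- trace[cycle - 1, ] %*% P

# Check 1: transition matrix rows sum to 1
stopifnot(all(abs(rowSums(P) - 1) < 1e-12))
# Check 2: cohort is conserved at every cycle
stopifnot(all(abs(rowSums(trace) - N) < 1e-9))
# Check 3: the Dead state is absorbing (occupancy never decreases)
stopifnot(all(diff(trace[, "Dead"]) >= 0))
```

If any of these stopifnot() calls fails on AI-generated code, the trace is wrong and everything downstream (costs, QALYs, ICER) is wrong too.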
Step 4: DSA + PSA — Once Base Case Is Solid
Request both in a single prompt. They use the same model object, so the AI doesn’t need to rebuild anything.
PROMPT — Add DSA + PSA
The base case model works correctly. Now add sensitivity analysis
using the same model object.
DSA: Use the low/high values from Sheet 4 of my parameter file.
[heemod: use define_dsa() + run_dsa() + tornado plot]
[others: use dampack::owsa() or manual loop with NMB]
PSA: Use the distributions from Sheet 4 of my parameter file.
[heemod: use define_psa() + run_psa(N=1000) + CE plane + CEAC]
[others: use BCEA::bcea() for CE plane, CEAC, EVPI]
Report mean incremental NMB, P(cost-effective), P(dominant).
Do NOT average ICERs across iterations — use NMB instead.
Step 5: Shiny App — Once Everything Is Validated
PROMPT — Shiny App
I have a validated [Markov/PSM] model built with [heemod/flexsurv].
[Paste the working code or say "the code from our previous conversation"]
Create a Shiny app with:
- Sidebar: sliders for key parameters from my parameter file,
plus a "Run PSA" button (PSA should not run on every slider change)
- Tab 1: Base case results (table + trace/survival plot), updates reactively
- Tab 2: Tornado diagram, updates reactively
- Tab 3: PSA results (CE plane + CEAC), updates only on button click
- Format costs in ₹ with commas. Include CSV download button.
If your base case has bugs, the Shiny app will faithfully reproduce those bugs with pretty sliders. The app is for exploring a correct model, not for debugging a broken one.
Common Pitfalls with AI-Generated HTA Code
Pitfall 1: AI Ignores Your Package Instruction
The AI may still write raw loops. Check the first few lines — if you see for (cycle in 1:n_cycles) instead of define_transition(), push back: “Rewrite using heemod functions, not manual loops.”
Pitfall 2: Wrong Distribution Parameterisation
AI tools sometimes confuse the parameterisation of Beta and Gamma distributions. For example, they may use mean and SD directly rather than converting to shape parameters.
Always check: For a Beta distribution with mean μ and variance σ², the shape parameters are: \[\alpha = \mu \left(\frac{\mu(1-\mu)}{\sigma^2} - 1\right), \quad \beta = (1-\mu) \left(\frac{\mu(1-\mu)}{\sigma^2} - 1\right)\]
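The conversion above is easy to verify numerically. A minimal helper (the function name beta_params is illustrative) applies the formulas and round-trips the moments:

```r
# Method-of-moments conversion for the Beta distribution:
# given mean mu and variance sigma2, recover shape parameters
beta_params <- function(mu, sigma2) {
  # A Beta variate cannot have variance >= mu * (1 - mu)
  stopifnot(sigma2 < mu * (1 - mu))
  k <- mu * (1 - mu) / sigma2 - 1
  list(alpha = mu * k, beta = (1 - mu) * k)
}

# Example: a utility with mean 0.70 and SD 0.05
p <- beta_params(0.70, 0.05^2)

# Round-trip check: recovered shapes reproduce the target moments
mu_check  <- p$alpha / (p$alpha + p$beta)
var_check <- p$alpha * p$beta /
  ((p$alpha + p$beta)^2 * (p$alpha + p$beta + 1))
```

Pasting a check like this after any AI-generated rbeta() call catches the mean/SD-used-as-shapes mistake immediately.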
Pitfall 3: Wrong HR Application for Survival Models
A real bug we found and fixed in this workshop:
- Proportional hazards (correct for most HTA): S(t) = exp(-(λ × HR) × t^γ) — scales the hazard
- Accelerated failure time (a different model): S(t) = exp(-λ × (HR × t)^γ) — scales time
Both are valid, but they give different results when γ ≠ 1. Always specify “proportional hazards, not AFT.”
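A two-line numerical check makes the difference concrete (the parameter values here are arbitrary illustrations):

```r
# PH and AFT scalings of a Weibull survival function diverge when gamma != 1
lambda <- 0.1; gamma <- 1.5; hr <- 0.7; t <- 5

s_ph  <- exp(-(lambda * hr) * t^gamma)     # proportional hazards: scales hazard
s_aft <- exp(-lambda * (hr * t)^gamma)     # accelerated failure time: scales time
# s_ph and s_aft differ here because gamma != 1

# With gamma = 1 (exponential), the two forms coincide
s_ph_exp  <- exp(-(lambda * hr) * t)
s_aft_exp <- exp(-lambda * (hr * t))
```

Running a check like this against AI-generated survival code is the quickest way to confirm which form it actually implemented.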
Pitfall 4: Averaging ICERs in PSA
AI-generated PSA code often reports “mean ICER” across iterations. This is unreliable — a few iterations with near-zero ΔQALYs produce extreme values. Always insist on NMB-based reporting.
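The instability is easy to demonstrate with simulated draws (the means, SDs, and WTP below are arbitrary illustrations, not recommendations):

```r
# Simulated PSA draws showing why mean ICER misleads and NMB does not
set.seed(42)
n      <- 1000
d_cost <- rnorm(n, mean = 200000, sd = 50000)  # incremental cost
d_qaly <- rnorm(n, mean = 0.50,   sd = 0.40)   # incremental QALYs (can cross 0)
wtp    <- 500000

# Per-iteration ICERs blow up whenever d_qaly is near zero
icer <- d_cost / d_qaly

# Stable alternative: Net Monetary Benefit per iteration
nmb          <- wtp * d_qaly - d_cost
mean_inc_nmb <- mean(nmb)
p_ce         <- mean(nmb > 0)                    # P(cost-effective at WTP)
p_dominant   <- mean(d_cost < 0 & d_qaly > 0)    # P(cheaper AND better)
```

NMB is linear in the draws, so its mean is well defined; the ratio is not.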
Pitfall 5: Forgetting Half-Cycle Correction
If using heemod, specify method = "life-table". In manual code, check that each cycle's costs and QALYs are computed from the average of adjacent cycle occupancies, (trace[t] + trace[t+1]) / 2, rather than from start-of-cycle occupancy alone.
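In base R the correction is a one-liner on the trace (occupancy numbers below are hypothetical):

```r
# Half-cycle (life-table) correction: average adjacent cycle occupancies
trace_healthy <- c(1000, 850, 725, 620)   # hypothetical state occupancy
corrected <- (head(trace_healthy, -1) + tail(trace_healthy, -1)) / 2
# corrected is c(925, 787.5, 672.5)
```

If AI-generated manual code multiplies costs by the raw trace instead, every cycle's cost is attributed to its start, which systematically biases the totals.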
Pitfall 6: Hallucinated Functions
AI tools sometimes invent R functions or package names that do not exist. If you see a function you do not recognise, verify it exists by typing ?function_name in the R console.
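Beyond ?function_name, you can check programmatically. The function name run_markov_magic below is deliberately fake, to stand in for a hallucinated suggestion:

```r
# Quick existence checks before trusting an AI-suggested function
exists("mean")              # TRUE: a real base R function
exists("run_markov_magic")  # FALSE: a hallucinated name

# For package functions, first confirm the package itself is installed
ok <- requireNamespace("stats", quietly = TRUE)  # TRUE: stats ships with R
```

A FALSE from exists() after loading the package the AI named is a strong signal the function was invented.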
A Sensible Workflow — Summary
- Prepare your parameter file in Excel — this is your single source of truth
- Prompt for base case — short prompt, attach the file, name the package
- Validate — cross-check against hand calculations, Excel, or published ICERs
- Prompt for DSA + PSA — once the base case is solid
- Prompt for Shiny app — once everything is validated
- Use AI for debugging — paste error messages and get explanations
- Save working code — once something works, it becomes your template for next time
Key Takeaway
AI tools can dramatically speed up your R workflow for HTA, but they are an accelerator, not an autopilot. Your HTA expertise determines the model structure, the parameter choices, and the interpretation. AI helps translate that expertise into working R code more quickly.
Your parameter file + validated package + AI coding = production-ready model
Think of AI as a very fast but occasionally unreliable research assistant who is good at R syntax but knows nothing about your specific clinical question.
→ Proceed to Day 3 sessions.