Processes raw STEPS survey data: renames columns, coerces types, derives standard indicators, handles missing values, and applies plausibility checks.
clean_steps_data(
  data,
  cols,
  age_min = 18,
  age_max = 69,
  bp_sbp_threshold = 140,
  bp_dbp_threshold = 90,
  bmi_overweight = 25,
  bmi_obese = 30,
  glucose_threshold = 7,
  glucose_impaired_threshold = 6.1,
  chol_threshold = 5
)
data: A data frame (typically from import_steps_data()).
cols: A named list of column names, as returned by detect_steps_columns().
age_min: Minimum age for inclusion (default 18).
age_max: Maximum age for inclusion (default 69).
bp_sbp_threshold: Systolic blood pressure (SBP) threshold for raised BP in mmHg (default 140; Mongolia uses 130).
bp_dbp_threshold: Diastolic blood pressure (DBP) threshold for raised BP in mmHg (default 90; Mongolia uses 80).
bmi_overweight: BMI threshold for overweight in kg/m² (default 25.0).
bmi_obese: BMI threshold for obesity in kg/m² (default 30.0).
glucose_threshold: Fasting glucose threshold for raised glucose / diabetes in mmol/L (default 7.0).
glucose_impaired_threshold: Fasting glucose threshold for impaired fasting glucose in mmol/L (default 6.1).
chol_threshold: Total cholesterol threshold for raised cholesterol in mmol/L (default 5.0).
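The effect of the threshold arguments can be illustrated with a small base R sketch. The helpers flag_raised_bp() and classify_glucose() below are hypothetical and not part of the package. The inclusive (>=) comparisons are an assumption about how the thresholds are applied, and the sketch ignores other criteria, such as current medication, that a full indicator definition may include.

```r
# Hypothetical helper (not a package function): flags raised blood pressure
# from mean SBP/DBP. Assumes inclusive comparisons (>= threshold).
flag_raised_bp <- function(sbp, dbp, sbp_threshold = 140, dbp_threshold = 90) {
  sbp >= sbp_threshold | dbp >= dbp_threshold
}

# Hypothetical helper: three-level fasting glucose category in mmol/L.
# right = FALSE makes each interval closed on the left, e.g. [6.1, 7).
classify_glucose <- function(glucose, impaired = 6.1, raised = 7) {
  cut(glucose,
      breaks = c(-Inf, impaired, raised, Inf),
      labels = c("Normal", "Impaired", "Raised"),
      right = FALSE)
}

sbp <- c(118, 134, 142, NA)
dbp <- c(76, 84, 88, 70)

# Default thresholds (140/90) versus the lower thresholds (130/80)
default_flags <- flag_raised_bp(sbp, dbp)
lower_flags   <- flag_raised_bp(sbp, dbp, sbp_threshold = 130, dbp_threshold = 80)

glucose_cat <- classify_glucose(c(5.4, 6.1, 6.9, 7.0))
```

With these toy values, the second record (134/84) is flagged only under the lower thresholds, and a missing SBP yields NA rather than FALSE when the DBP is below its threshold.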
Returns a data frame with standardised and derived variables, ready for survey design setup.
The function performs the following transformations:
Renames columns to standard names (age, sex, wt_final, etc.)
Converts numeric values stored as strings to numeric types
Restricts age to [age_min, age_max]
Creates WHO standard age groups (18-24, 25-34, etc.)
Harmonises sex coding to Male/Female
Derives body mass index (BMI) and categories
Averages blood pressure readings (last 2 of 3)
Recodes yes/no variables to logical
Creates derived risk indicators (raised BP, diabetes, etc.)
Applies plausibility checks to measurements
Drops records with missing age or sex
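Several of the transformations listed above can be sketched in base R on toy data. This is a minimal illustration, not the function's actual implementation. All column names other than age and sex, the accepted sex codes, the plausibility bounds for weight, and the age group breaks beyond 18-24 and 25-34 are assumptions made for the example. The order of steps may also differ from the function's.

```r
# Toy input; column names other than age and sex are hypothetical.
raw <- data.frame(
  age    = c("17", "23", "41", "68", NA),
  sex    = c("1", "2", "male", "F", "1"),
  height = c(170, 165, 180, 158, 172),   # cm
  weight = c(65, 80, 95, 400, 70),       # kg; 400 is implausible
  sbp1   = c(120, 150, 138, 128, 118),
  sbp2   = c(118, 144, 136, 126, 116),
  sbp3   = c(116, 140, 134, 124, 114),
  smoker = c("yes", "no", "Yes", "no", "no"),
  stringsAsFactors = FALSE
)

df <- raw

# Convert numeric strings to numeric
df$age <- suppressWarnings(as.numeric(df$age))

# Harmonise sex coding to Male/Female (input codes are assumed)
sex_key <- tolower(trimws(df$sex))
df$sex <- ifelse(sex_key %in% c("1", "m", "male"), "Male",
          ifelse(sex_key %in% c("2", "f", "female"), "Female", NA_character_))

# Plausibility check: implausible weights become NA (bounds are illustrative)
implausible <- !is.na(df$weight) & (df$weight < 20 | df$weight > 350)
df$weight[implausible] <- NA

# Derive BMI (kg / m^2) and categories using the default thresholds 25 and 30
df$bmi <- df$weight / (df$height / 100)^2
df$bmi_cat <- cut(df$bmi, breaks = c(-Inf, 25, 30, Inf),
                  labels = c("Not overweight", "Overweight", "Obese"),
                  right = FALSE)

# Average the last two of three blood pressure readings
df$sbp_mean <- rowMeans(df[, c("sbp2", "sbp3")], na.rm = TRUE)

# Recode yes/no to logical
df$smoker <- tolower(df$smoker) == "yes"

# Drop records with missing age or sex, then restrict to [age_min, age_max]
age_min <- 18
age_max <- 69
keep <- !is.na(df$age) & !is.na(df$sex) & df$age >= age_min & df$age <= age_max
df <- df[keep, ]

# Age groups; breaks beyond the first two groups are assumed
df$age_group <- cut(df$age, breaks = c(18, 25, 35, 45, 55, 65, 70),
                    labels = c("18-24", "25-34", "35-44", "45-54", "55-64", "65-69"),
                    right = FALSE)
```

In this toy example, three of the five records remain: the 17-year-old falls below age_min and the record with missing age is dropped. The implausible weight of 400 kg is set to NA, so that record is kept but has a missing BMI.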