Variables
Every feature in a machine learning dataset is a variable. But not all variables are created equal: the type of variable determines what operations are meaningful, what statistics are interpretable, and what models you can use. Fitting a linear regression to a blood type column or computing the mean of a severity rating produces numbers, but those numbers are hard or impossible to interpret. Understanding variable types is what prevents that mistake.
The Anchor Dataset
Throughout this post, examples draw on a realistic ML model evaluation setup. The primary running example is six cross-validation accuracy scores:
accuracy = [0.82, 0.79, 0.91, 0.85, 0.78, 0.88]
And a small feature table for context:
| Sample | accuracy (fold) | model_type | severity_rating | n_errors |
|---|---|---|---|---|
| fold_1 | 0.82 | CNN | 3 | 12 |
| fold_2 | 0.79 | CNN | 4 | 18 |
| fold_3 | 0.91 | ResNet | 2 | 7 |
What Is a Variable?
A variable is a characteristic that can take different values across observations. The accuracy score changes fold to fold. The model type changes sample to sample. Both are variables.
A constant does not change across observations. If every fold uses the same learning rate and you never vary it, the learning rate is a constant, not a variable, in that dataset.
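One rough way to spot constants in a table is to count distinct values per column. The sketch below assumes a hypothetical `learning_rate` column fixed at 0.001 (that value is made up for illustration) alongside the six accuracy scores:

```python
import pandas as pd

# Hypothetical table: learning_rate is never varied, so it is a constant here.
df = pd.DataFrame({
    "accuracy": [0.82, 0.79, 0.91, 0.85, 0.78, 0.88],
    "learning_rate": [0.001] * 6,  # assumed value, for illustration only
})

# Count distinct values in each column.
n_unique = df.nunique()

# A column with a single distinct value does not vary across observations.
constants = n_unique[n_unique == 1].index.tolist()
variables = n_unique[n_unique > 1].index.tolist()

print("constants:", constants)
print("variables:", variables)
```

Whether something is a constant depends on the dataset: the same learning rate becomes a variable as soon as a sweep changes it.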
Everything in statistics is built on understanding variables. The type of variable determines what statistical methods make sense, how you visualize it, and what you can conclude.
Qualitative vs Quantitative: The First Split
Variables divide into two broad families:
Qualitative (categorical) variables describe qualities or group memberships that cannot be meaningfully expressed with numbers. Model architecture (CNN vs ResNet vs Transformer) is qualitative. You can assign CNN = 1, ResNet = 2, but those numbers carry no meaning — there is no sense in which ResNet is "twice" CNN.
Quantitative (numerical) variables represent quantities that can be measured with numbers. Accuracy, loss, F1 score, number of parameters — all quantitative.
Qualitative Variables: Nominal vs Ordinal
Nominal variables have categories with no natural ordering:
- Model architecture: CNN, ResNet, Transformer
- Predicted class: cat, dog, bird
- Prediction outcome: false positive, false negative, true positive, true negative
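For nominal variables like these, counting and the mode are the natural summaries, and one-hot encoding is a common way to pass them to a model without implying an order. A minimal sketch using the three `model_type` values from the feature table:

```python
import pandas as pd

# model_type values from the small feature table (fold_1 to fold_3).
model_type = pd.Series(["CNN", "CNN", "ResNet"], name="model_type")

# Counting and the mode are meaningful for nominal data.
counts = model_type.value_counts()
most_common = model_type.mode()[0]

# One-hot encoding gives each category its own indicator column,
# so no artificial ordering or spacing between categories is implied.
one_hot = pd.get_dummies(model_type, dtype=int)

print(counts)
print("mode:", most_common)
print(one_hot)
```

Contrast this with the CNN = 1, ResNet = 2 encoding, which invites arithmetic that has no interpretation.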
Ordinal variables have meaningful order, but the gaps between categories are not guaranteed to be equal:
- Severity rating (1 < 2 < 3 < 4 < 5): the ordering is real, but "4" is not twice as severe as "2"
- Model size: small < medium < large
- Data quality tier: bronze < silver < gold
With ordinal data, you can say "severity 4 is worse than severity 3," but you cannot say the difference between 3 and 4 equals the difference between 1 and 2. This matters: computing the mean of ordinal data produces a number, but that number may not be meaningful.
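In pandas, one way to encode this is an ordered categorical, which allows comparisons and rank-based summaries. The sketch below uses a hypothetical sample of model sizes; the five values are invented for illustration:

```python
import pandas as pd

# Model size as an ordered categorical: small < medium < large.
sizes = pd.Categorical(
    ["medium", "small", "large", "medium", "large"],  # hypothetical sample
    categories=["small", "medium", "large"],
    ordered=True,
)
s = pd.Series(sizes)

# Order-based operations are supported.
print(s.min(), s.max())
print((s > "small").tolist())

# The median can be read off the integer rank codes (0, 1, 2).
# With an odd number of observations it always lands on a real category.
median_code = int(s.cat.codes.median())
print(s.cat.categories[median_code])
```

The rank codes are used here only to locate the middle observation, not as quantities to average.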
Quantitative Variables: Discrete vs Continuous
Discrete variables take specific, countable values — usually integers:
- `n_errors`: 12, 18, 7 (you cannot have 12.5 errors)
- Number of epochs until convergence
- Number of classes in a classification problem
Continuous variables can take any value within a range:
- `accuracy`: can be 0.82, 0.823, 0.8234... (measured at whatever precision you choose)
- Loss value, F1 score, AUC-ROC
The test: between any two values, can you always find another value? For accuracy, yes — between 0.82 and 0.83 sits 0.825. For n_errors, no — between 12 and 13 there is nothing.
The Measurement Scale Hierarchy
This framework connects variable types to what mathematical operations are meaningful:
Nominal → Ordinal → Interval → Ratio
| Level | Categories | Ordered | Equal Intervals | True Zero | Example |
|---|---|---|---|---|---|
| Nominal | Yes | No | No | No | model architecture |
| Ordinal | Yes | Yes | No | No | severity rating |
| Interval | Yes | Yes | Yes | No | temperature in Celsius |
| Ratio | Yes | Yes | Yes | Yes | accuracy, loss, n_errors |
Interval scales have equal gaps but no true zero. Temperature in Celsius: the difference between 20°C and 30°C equals the difference between 30°C and 40°C. But 0°C does not mean "no temperature." You cannot say 40°C is "twice as hot" as 20°C.
Ratio scales have everything, including a true zero. Accuracy = 0 means the model got nothing right. Loss = 0 means perfect. You can say "Model A has twice the error rate of Model B."
Most ML metrics (accuracy, precision, recall, F1, AUC, loss) are ratio-scale. Most survey-style inputs (satisfaction ratings, severity scores) are ordinal.
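A quick way to check the interval versus ratio distinction is to change units and see whether a ratio survives. The sketch below reuses the 40°C and 20°C example; the helper function names are just illustrative:

```python
def celsius_to_kelvin(c):
    return c + 273.15

def celsius_to_fahrenheit(c):
    return c * 9 / 5 + 32

a_c, b_c = 40.0, 20.0

# On an interval scale the ratio depends on where the arbitrary zero sits,
# so it changes with every choice of unit.
ratio_celsius = a_c / b_c
ratio_fahrenheit = celsius_to_fahrenheit(a_c) / celsius_to_fahrenheit(b_c)
ratio_kelvin = celsius_to_kelvin(a_c) / celsius_to_kelvin(b_c)

# Differences are meaningful: Celsius and Kelvin agree on the gap.
diff_celsius = a_c - b_c
diff_kelvin = celsius_to_kelvin(a_c) - celsius_to_kelvin(b_c)

print(ratio_celsius, ratio_fahrenheit, ratio_kelvin)
print(diff_celsius, diff_kelvin)
```

The three ratios disagree, which is why "twice as hot" is not a valid statement in Celsius. Only the Kelvin ratio, measured from a true zero, has a physical interpretation.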
Independent vs Dependent Variables
In ML training and experimentation:
- Independent variable: what you vary or control — learning rate, batch size, model architecture
- Dependent variable: what you measure as the outcome — validation loss, accuracy
When you run a hyperparameter sweep, the learning rate is the independent variable. The validation accuracy is the dependent variable. Calling it "dependent" carries a causal implication: changes in learning rate cause changes in accuracy. In purely observational data, this causality is harder to establish.
Fuzzy Boundaries Worth Knowing
Is accuracy discrete or continuous? In practice, it depends on the dataset size. With 100 test examples, accuracy can only take 101 values (0/100, 1/100, ..., 100/100). With 10,000 examples, it effectively behaves as continuous. Most analyses treat accuracy as continuous regardless.
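The granularity argument can be made concrete by enumerating the reachable values. This is a small sketch; `possible_accuracies` is a helper name introduced here for illustration:

```python
def possible_accuracies(n_test):
    """All accuracy values reachable with n_test examples: k / n_test for k = 0..n_test."""
    return [k / n_test for k in range(n_test + 1)]

small = possible_accuracies(100)
large = possible_accuracies(10_000)

# Number of distinct values and the step between neighbouring values.
print(len(small), small[1] - small[0])
print(len(large), large[1] - large[0])
```

With 100 examples the smallest possible change in accuracy is a full percentage point, which matters when comparing models whose scores differ by less than that.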
Are severity ratings ordinal or interval? They are ordinal in principle, but many practitioners treat them as interval for convenience — computing means and standard deviations of Likert-scale data. This is a simplification that sometimes works and sometimes misleads. Know when you are making it.
Python Example
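```python
import pandas as pd

# Six cross-validation folds, one column per variable type
data = {
    'fold': [1, 2, 3, 4, 5, 6],
    'accuracy': [0.82, 0.79, 0.91, 0.85, 0.78, 0.88],
    'model_type': ['CNN', 'CNN', 'ResNet', 'ResNet', 'CNN', 'ResNet'],
    'severity_rating': [3, 4, 2, 3, 4, 2],
    'n_errors': [12, 18, 7, 11, 19, 9]
}
df = pd.DataFrame(data)

# dtypes describe storage, not measurement scale:
# severity_rating (ordinal) and n_errors (ratio) are both int64
print(df.dtypes)
print()

# Mean, std, and quantiles are meaningful for ratio-scale columns
print("Quantitative summary:")
print(df[['accuracy', 'n_errors']].describe())
print()

# For a nominal column, counts per category are the meaningful summary
print("Nominal summary:")
print(df['model_type'].value_counts())
```

Output (exact formatting varies slightly by pandas version):

```
fold                 int64
accuracy           float64
model_type          object
severity_rating      int64
n_errors             int64
dtype: object

Quantitative summary:
       accuracy   n_errors
count  6.000000   6.000000
mean   0.838333  12.666667
std    0.051153   4.844241
min    0.780000   7.000000
25%    0.797500   9.500000
50%    0.835000  11.500000
75%    0.872500  16.500000
max    0.910000  19.000000

Nominal summary:
model_type
CNN       3
ResNet    3
Name: count, dtype: int64
```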
Calculation Trace
| Variable | Type | What is meaningful | What is not meaningful |
|---|---|---|---|
| accuracy | Ratio, continuous | Mean, std dev, ratio | None |
| model_type | Nominal | Count, mode | Mean, order |
| severity_rating | Ordinal | Median, rank | Mean (debatable), ratio |
| n_errors | Ratio, discrete | Mean, sum, ratio | Values between integers |
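The table can be turned into a small dispatch on measurement scale. The sketch below is one possible arrangement, not a standard API: the `scales` mapping and the `summarize` helper are hypothetical, and the scale of each column has to be declared by hand because pandas dtypes cannot distinguish an ordinal code from a count.

```python
import pandas as pd

df = pd.DataFrame({
    "accuracy": [0.82, 0.79, 0.91, 0.85, 0.78, 0.88],
    "model_type": ["CNN", "CNN", "ResNet", "ResNet", "CNN", "ResNet"],
    "severity_rating": [3, 4, 2, 3, 4, 2],
    "n_errors": [12, 18, 7, 11, 19, 9],
})

# Hypothetical annotation: measurement scale declared per column.
scales = {
    "accuracy": "ratio",
    "model_type": "nominal",
    "severity_rating": "ordinal",
    "n_errors": "ratio",
}

def summarize(series, scale):
    """Return only the summaries that are meaningful for the given scale."""
    if scale == "nominal":
        return {"mode": series.mode().tolist()}
    if scale == "ordinal":
        return {"median": float(series.median())}
    # interval and ratio scales support mean and standard deviation
    return {"mean": float(series.mean()), "std": float(series.std())}

summary = {col: summarize(df[col], scale) for col, scale in scales.items()}
for col, stats in summary.items():
    print(col, stats)
```

Note that `model_type` has two modes here, since CNN and ResNet each appear three times.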
Related Concepts
The previous posts computed the mean, median, variance, and standard deviation of accuracy. Those operations are valid because accuracy is ratio-scale and continuous. The next post extends this to random variables — the formal probability-theoretic objects that model quantities like accuracy as uncertain values before they are observed. From there, histograms show how the distribution of a continuous variable looks, and percentiles/quartiles give you a way to locate specific values within that distribution.
When This Framework Breaks Down
The variable type taxonomy is a guide, not a hard rule. In practice, the right choice depends on context and what you want to do with the data. Computing the mean of ordinal severity ratings is technically questionable but widely done and sometimes gives useful results — just interpret with caution. More seriously, confusing ordinal with ratio leads to mistakes like: "Model A's severity rating went from 2 to 4, so its errors became twice as severe." That reasoning requires ratio-scale data. Ordinal data only supports rank comparisons.
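One way to see why ordinal data only supports rank comparisons: relabel the levels in any order-preserving way and check which conclusions survive. The ratings and the relabeling below are invented for illustration:

```python
from statistics import mean, median

# Hypothetical severity ratings for two models on a 1-5 scale.
model_a = [1, 1, 5]
model_b = [2, 2, 2]

# An order-preserving relabeling of the five levels. For ordinal data
# this is as valid a numeric coding as 1..5, since only the order matters.
relabel = {1: 1, 2: 4, 3: 5, 4: 6, 5: 7}
a_relabeled = [relabel[x] for x in model_a]
b_relabeled = [relabel[x] for x in model_b]

# The comparison of means flips under relabeling.
mean_a_worse_before = mean(model_a) > mean(model_b)
mean_a_worse_after = mean(a_relabeled) > mean(b_relabeled)

# The comparison of medians does not.
median_a_worse_before = median(model_a) > median(model_b)
median_a_worse_after = median(a_relabeled) > median(b_relabeled)

print(mean_a_worse_before, mean_a_worse_after)
print(median_a_worse_before, median_a_worse_after)
```

A conclusion that changes with an arbitrary choice of labels is not a property of the data. The median comparison is stable because it depends only on order.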
Test Your Understanding
- You have a dataset with columns: `optimizer` (Adam/SGD/RMSprop), `learning_rate` (0.001 to 0.1), `epochs_to_convergence` (integer), `final_accuracy` (float). Classify each as nominal, ordinal, discrete, or continuous. Which columns support computing the mean?
- A researcher reports the "average model architecture" in their study as 1.7 (encoding CNN=1, ResNet=2, Transformer=3). What is wrong with this calculation, and what should they report instead?
- For `accuracy = [0.82, 0.79, 0.91, 0.85, 0.78, 0.88]`, can you compute a meaningful ratio, for example "fold 3 has about 1.15 times the accuracy of fold 2"? What property of the variable type makes this valid?
- Temperature in Celsius is interval-scale. Temperature in Kelvin is ratio-scale. Does a statement like "20°C is twice as cold as 40°C" make sense in Celsius? Would the analogous ratio make sense in Kelvin? What does this tell you about when ratios are valid?
Ready to learn about Random Variables — variables whose values are determined by random processes?