
MAPE and SMAPE

Jul 3, 2026•7 min read•By Mohammed Vasim


MAE tells you the average absolute error in the units of the target. For house prices, an MAE of 20,000 might be acceptable — a 5% error on a 400,000 house. For shoe prices, an MAE of 20,000 is catastrophic — it exceeds the entire price. The same numeric error means different things depending on scale.

Percentage error solves this: express errors relative to the true value, so the metric is comparable across domains and across orders of magnitude. A supply chain manager asking "are we within 5% of demand?" can read the answer directly from MAPE. A finance team comparing forecasting models across different stock prices can use MAPE regardless of the price range.
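To make the scale argument concrete, here is a minimal sketch in plain Python. The house and shoe prices below are made-up illustrative numbers, not data from this post; every prediction is set exactly 5% above its true value, so MAE differs wildly between the two domains while MAPE does not.

```python
# Illustrative numbers only: every prediction is 5% above the true value.
houses_true = [300000.0, 450000.0]
houses_pred = [315000.0, 472500.0]
shoes_true = [60.0, 90.0]
shoes_pred = [63.0, 94.5]

def mae(y_true, y_pred):
    """Mean absolute error, in the units of the target."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def mape(y_true, y_pred):
    """Mean absolute percentage error; assumes no true value is zero."""
    return 100 * sum(abs(t - p) / abs(t) for t, p in zip(y_true, y_pred)) / len(y_true)

for name, yt, yp in [("houses", houses_true, houses_pred),
                     ("shoes", shoes_true, shoes_pred)]:
    print(f"{name}: MAE = {mae(yt, yp):.2f}, MAPE = {mape(yt, yp):.2f}%")
```

The two MAE values are not comparable with each other, while the two MAPE values are.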

Anchor: 5 house price predictions.

python

y_true = [300000, 180000, 450000, 120000, 350000]
y_pred = [320000, 165000, 480000, 135000, 340000]

MAPE

MAPE = (100/n) Σ |yᵢ − ŷᵢ| / |yᵢ|

For each sample, compute the absolute error divided by the true value — the relative error. Then average across all samples and multiply by 100 to express as a percentage.

MAPE Trace Table:

Sample	y_true	y_pred	|error|	|error|/y_true	% error
1	300000	320000	20000	20000/300000	6.6667%
2	180000	165000	15000	15000/180000	8.3333%
3	450000	480000	30000	30000/450000	6.6667%
4	120000	135000	15000	15000/120000	12.5000%
5	350000	340000	10000	10000/350000	2.8571%

MAPE = (6.6667 + 8.3333 + 6.6667 + 12.5000 + 2.8571) / 5 = 37.0238 / 5 = 7.4048%

The model is on average about 7.4% wrong relative to the true price. Sample 4 (120,000 house with 135,000 prediction) contributes the largest error, 12.5%, because its denominator is the smallest in the set: sample 2 has the same 15,000 absolute error but only 8.3% relative error.

The Zero Problem

MAPE has a fatal flaw: when y_true = 0, the denominator is zero. When y_true is very small, the percentage error explodes for even tiny absolute errors.

Mini-anchor: y_true = [1000, 0, 500], y_pred = [1100, 50, 600]

Sample 1: |1100−1000|/1000 = 10% — fine
Sample 2: |50−0|/0 = undefined (division by zero)
Sample 3: |600−500|/500 = 20% — fine

MAPE is undefined the moment any true value is zero. In practice, some implementations replace 0 with a small ε (like 0.001), but then the error for sample 2 becomes 50/0.001 = 50,000 as a ratio, or 5,000,000% as a percentage. A single near-zero true value can dominate the entire metric.
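A minimal sketch of the ε workaround on the mini-anchor, in plain Python. The helper name `mape_clamped` and its `eps` parameter are hypothetical, chosen for illustration; real libraries pick their own ε and conventions.

```python
def mape_clamped(y_true, y_pred, eps=0.001):
    """MAPE in percent, with |y_true| clamped to at least eps (hypothetical helper).

    Returns the overall value and the per-sample percentage errors.
    """
    terms = [abs(t - p) / max(abs(t), eps) for t, p in zip(y_true, y_pred)]
    return 100 * sum(terms) / len(terms), [100 * x for x in terms]

y_true = [1000.0, 0.0, 500.0]
y_pred = [1100.0, 50.0, 600.0]

overall, per_sample = mape_clamped(y_true, y_pred)
for i, pct in enumerate(per_sample, start=1):
    print(f"Sample {i}: {pct:,.2f}%")
print(f"MAPE with eps=0.001: {overall:,.2f}%")

# Dropping the zero-valued sample shows what the other two look like alone.
rest, _ = mape_clamped([1000.0, 500.0], [1100.0, 600.0])
print(f"MAPE without sample 2: {rest:.2f}%")
```

The result depends almost entirely on the arbitrary choice of `eps`: making it ten times smaller makes the reported MAPE roughly ten times larger.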

SMAPE

SMAPE (Symmetric Mean Absolute Percentage Error) replaces the denominator with the average of the absolute true and predicted values:

SMAPE = (100/n) Σ |yᵢ − ŷᵢ| / ((|yᵢ| + |ŷᵢ|) / 2)

When y_true = 0 but y_pred ≠ 0: denominator = (0 + |y_pred|)/2 = |y_pred|/2 — no longer undefined (unless y_pred is also 0).

SMAPE Trace Table:

Sample	y_true	y_pred	|error|	denom	SMAPE%
1	300000	320000	20000	(300000+320000)/2=310000	20000/310000×100=6.4516%
2	180000	165000	15000	172500	15000/172500×100=8.6957%
3	450000	480000	30000	465000	30000/465000×100=6.4516%
4	120000	135000	15000	127500	15000/127500×100=11.7647%
5	350000	340000	10000	345000	10000/345000×100=2.8986%

SMAPE = (6.4516 + 8.6957 + 6.4516 + 11.7647 + 2.8986) / 5 = 36.2622 / 5 = 7.2524%

MAPE vs SMAPE comparison: MAPE = 7.4048%, SMAPE = 7.2524%. They are close on this anchor because every prediction is within 12.5% of its true value, so the two denominators, |yᵢ| and (|yᵢ| + |ŷᵢ|)/2, are nearly equal. The difference grows when predictions and truth diverge significantly.
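The widening gap can be sketched by holding one true value fixed and moving the prediction away from it. The values below are illustrative, and the per-sample helpers `mape_one` and `smape_one` are hypothetical names for the two formulas above applied to a single sample.

```python
def mape_one(y_true, y_pred):
    """Percentage error for one sample; assumes y_true != 0."""
    return 100 * abs(y_true - y_pred) / abs(y_true)

def smape_one(y_true, y_pred):
    """Symmetric percentage error for one sample; assumes not both zero."""
    return 100 * abs(y_true - y_pred) / ((abs(y_true) + abs(y_pred)) / 2)

y_true = 100.0
for y_pred in [105.0, 110.0, 150.0, 300.0]:
    m, s = mape_one(y_true, y_pred), smape_one(y_true, y_pred)
    print(f"y_pred={y_pred:>5.0f}: MAPE={m:6.2f}%  SMAPE={s:6.2f}%  gap={m - s:6.2f}")
```

For small misses the two numbers are nearly interchangeable; for large over-predictions SMAPE reports a much smaller figure than MAPE.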

SMAPE Is Still Imperfect

Despite the name, SMAPE is not truly symmetric and has its own edge cases:

Double-zero case: y_true = 0 AND y_pred = 0 → SMAPE = 0/0, undefined.
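One way to handle the double-zero case is to define that sample's contribution as 0, on the grounds that predicting 0 for a true 0 is exact. This is an assumed convention, not part of the formula above, and the helper name `smape_safe` is hypothetical.

```python
def smape_safe(y_true, y_pred):
    """SMAPE in percent, treating a 0/0 sample as zero error (an assumed convention)."""
    terms = []
    for t, p in zip(y_true, y_pred):
        denom = (abs(t) + abs(p)) / 2
        # denom is 0 only when both values are 0, i.e. the prediction is exact.
        terms.append(0.0 if denom == 0 else abs(t - p) / denom)
    return 100 * sum(terms) / len(terms)

print(f"{smape_safe([0.0, 100.0], [0.0, 110.0]):.4f}%")
```

Other conventions exist (for example, dropping such samples from the average), and they give different numbers, so the choice should be stated whenever SMAPE is reported.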

Bounded but unintuitive at extremes:

y_true=100, y_pred=0: |100−0|/((100+0)/2) = 100/50 = 2.0 → 200%
y_true=0, y_pred=100: |0−100|/((0+100)/2) = 100/50 = 2.0 → 200%

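SMAPE is bounded above at 200%, which can be misleading. A SMAPE of 200% reads like a large but measured error, yet it is the worst possible score: every sample where exactly one of y_true and y_pred is zero hits it, no matter how large the miss is.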
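The ceiling can be checked with a short sketch in plain Python. The helper name `smape_one` is hypothetical and the inputs are illustrative values.

```python
def smape_one(y_true, y_pred):
    """Per-sample SMAPE in percent; assumes the two values are not both zero."""
    return 100 * abs(y_true - y_pred) / ((abs(y_true) + abs(y_pred)) / 2)

# Whenever exactly one of the two values is zero, the result is 200%,
# whether the miss is 0.5 or a million.
for y_true, y_pred in [(100.0, 0.0), (0.0, 100.0), (0.0, 0.5), (1e6, 0.0)]:
    print(f"y_true={y_true:g}, y_pred={y_pred:g}: {smape_one(y_true, y_pred):.1f}%")

# Non-zero values of the same sign stay strictly below 200%.
print(f"y_true=1, y_pred=1000: {smape_one(1.0, 1e3):.4f}%")
```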

Asymmetry example:

Swapping y_true and y_pred leaves SMAPE unchanged, which is the sense in which it is symmetric:

y_true=100, y_pred=150: |−50|/((100+150)/2) = 50/125 = 40%
y_true=150, y_pred=100: |50|/((150+100)/2) = 50/125 = 40%

The asymmetry is between over- and under-prediction of the same true value by the same absolute amount:

y_true=100, y_pred=150 (over by 50): 50/125 = 40%
y_true=100, y_pred=50 (under by 50): 50/75 = 66.7%

Under-predicting shrinks the denominator, so SMAPE penalizes it more heavily than an equally large over-prediction, despite the "symmetric" label.
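The over/under asymmetry can be checked numerically. This is a minimal sketch in plain Python; `smape_one` is a hypothetical per-sample helper for the SMAPE formula above.

```python
def smape_one(y_true, y_pred):
    """Per-sample SMAPE in percent; assumes the two values are not both zero."""
    return 100 * abs(y_true - y_pred) / ((abs(y_true) + abs(y_pred)) / 2)

y_true = 100.0
over = smape_one(y_true, 150.0)   # over-prediction by 50
under = smape_one(y_true, 50.0)   # under-prediction by 50
print(f"over by 50:  {over:.2f}%")
print(f"under by 50: {under:.2f}%")

# Swapping the roles of truth and prediction leaves the value unchanged.
print(smape_one(100.0, 150.0) == smape_one(150.0, 100.0))
```

The same absolute miss of 50 costs more when it is an under-prediction, because the denominator includes the prediction itself.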

When to Use Each

Metric	Use when	Avoid when
MAPE	Data always positive, never near zero (house prices, product sales)	Any zero true values; values near zero
SMAPE	True values can occasionally be zero or small; comparison across scales	Both y and ŷ near zero; when intuitive percentage interpretation matters
MAE/RMSE	Scale-specific error is interpretable; data crosses zero (temperatures, returns, financial deltas)	Comparing models across different magnitude data

Code

python

import numpy as np

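def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent. Undefined if any y_true is 0."""
    return np.mean(np.abs((y_true - y_pred) / y_true)) * 100

def smape(y_true, y_pred):
    """Symmetric MAPE, in percent. Undefined where y_true and y_pred are both 0."""
    denom = (np.abs(y_true) + np.abs(y_pred)) / 2
    return np.mean(np.abs(y_true - y_pred) / denom) * 100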

y_true = np.array([300000, 180000, 450000, 120000, 350000], dtype=float)
y_pred = np.array([320000, 165000, 480000, 135000, 340000], dtype=float)

print("MAPE per sample:")
pct_errors = np.abs((y_true - y_pred) / y_true) * 100
for i, (yt, yp, pe) in enumerate(zip(y_true, y_pred, pct_errors)):
    print(f"  Sample {i+1}: |{yt:.0f}-{yp:.0f}|/{yt:.0f} = {pe:.4f}%")
print(f"  MAPE = {mape(y_true, y_pred):.4f}%")

print("\nSMAPE per sample:")
denoms = (np.abs(y_true) + np.abs(y_pred)) / 2
smape_per = np.abs(y_true - y_pred) / denoms * 100
for i, (yt, yp, d, s) in enumerate(zip(y_true, y_pred, denoms, smape_per)):
    print(f"  Sample {i+1}: denom={d:.0f}, SMAPE={s:.4f}%")
print(f"  SMAPE = {smape(y_true, y_pred):.4f}%")

# Near-zero failure case: 0.001 stands in for a true value of 0.
# NumPy float division by an exact zero returns inf with a RuntimeWarning
# instead of raising, so a try/except around mape() would never fire.
print("\nZero value failure (MAPE):")
y_true_z = np.array([1000., 0.001, 500.])
y_pred_z = np.array([1100., 50., 600.])
print(f"  MAPE = {mape(y_true_z, y_pred_z):.2f}%  (explodes for near-zero)")

text

MAPE per sample:
  Sample 1: |300000-320000|/300000 = 6.6667%
  Sample 2: |180000-165000|/180000 = 8.3333%
  Sample 3: |450000-480000|/450000 = 6.6667%
  Sample 4: |120000-135000|/120000 = 12.5000%
  Sample 5: |350000-340000|/350000 = 2.8571%
  MAPE = 7.4048%

SMAPE per sample:
  Sample 1: denom=310000, SMAPE=6.4516%
  Sample 2: denom=172500, SMAPE=8.6957%
  Sample 3: denom=465000, SMAPE=6.4516%
  Sample 4: denom=127500, SMAPE=11.7647%
  Sample 5: denom=345000, SMAPE=2.8986%
  SMAPE = 7.2524%

Zero value failure (MAPE):
  MAPE = 1666643.33%  (explodes for near-zero)

The near-zero value (0.001 in place of 0) makes MAPE report about 1,666,643%, almost entirely driven by that one sample, whose relative error alone is about 4,999,900%. The other two samples would give only (10% + 20%)/2 = 15% on their own.

Related Concepts

MAPE and SMAPE are percentage-error variants of MAE (02-regression-losses.md). The M4 forecasting competition, one of the largest academic benchmarks for time-series models, used sMAPE as one of its headline accuracy measures (combined with MASE), which is part of why it appears so often in forecasting literature. The M5 competition, whose retail sales series contain many zeros, instead used a scaled error (WRMSSE). For zero-crossing data like stock returns, temperature anomalies, or financial deltas, MAE or RMSE (02-regression-losses.md) remain the appropriate choices, since percentage error is undefined or meaningless when the denominator can be zero.

Honest Limitations

MAPE has an inherent asymmetry that is rarely discussed: it penalizes overpredictions more heavily than underpredictions. For a single moderate miss the two directions look identical: if y_true=100 and you predict 150 (over by 50), the MAPE contribution is 50/100 = 50%, and if you predict 50 (under by 50), it is also 50/100 = 50%. The difference is in the range. With non-negative predictions, an underprediction can never cost more than 100% (predicting 0), while an overprediction has no upper limit (predicting 1,000 costs 900%). Across a distribution of true values, the small ones also carry the largest relative weight. Together these effects mean that a model selected or tuned on MAPE is subtly biased toward underprediction.
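One way to see where MAPE's asymmetry comes from is to compare the range of possible errors in each direction for a single true value. The sketch below uses plain Python and illustrative numbers; `ape` is a hypothetical helper for one sample's absolute percentage error, and predictions are assumed to be non-negative.

```python
def ape(y_true, y_pred):
    """Absolute percentage error for one sample; assumes y_true != 0."""
    return 100 * abs(y_true - y_pred) / abs(y_true)

y_true = 100.0

# Under-predictions: with non-negative predictions the worst case is y_pred = 0.
for y_pred in [50.0, 10.0, 0.0]:
    print(f"under, y_pred={y_pred:>6.0f}: {ape(y_true, y_pred):.0f}%")

# Over-predictions: the error keeps growing with the prediction.
for y_pred in [150.0, 300.0, 1000.0]:
    print(f"over,  y_pred={y_pred:>6.0f}: {ape(y_true, y_pred):.0f}%")
```

Because the under-prediction side is capped at 100% and the over-prediction side is not, a model tuned to minimize MAPE tends to be rewarded for guessing low.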

SMAPE is bounded at 200%, but this bound is unintuitive. If a model predicts zero for every sample and the true values are non-zero, its SMAPE is exactly 200%: the score looks like a bounded, quantified error rather than a catastrophic failure. Practitioners unfamiliar with this ceiling can be misled into thinking 200% SMAPE represents a reasonably quantified error.

Neither metric works when the target crosses zero. Predicting monthly temperature changes (which range from −10 to +10 degrees) with MAPE or SMAPE will produce meaningless results near zero. For any regression target that is not strictly positive and bounded away from zero, default to RMSE or MAE.
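A small sketch of the zero-crossing problem, in plain Python. The temperature changes below are made-up illustrative values, not measurements; the absolute errors are all 1 degree or less, yet the two samples near zero dominate MAPE.

```python
# Illustrative temperature changes in degrees (made-up values that cross zero).
y_true = [-10.0, -0.1, 0.2, 10.0]
y_pred = [-9.0, 0.4, -0.3, 9.0]

abs_err = [abs(t - p) for t, p in zip(y_true, y_pred)]
pct_err = [100 * e / abs(t) for e, t in zip(abs_err, y_true)]

for t, p, e, pe in zip(y_true, y_pred, abs_err, pct_err):
    print(f"y_true={t:>6.1f}  y_pred={p:>5.1f}  |error|={e:.1f}  APE={pe:7.1f}%")

mae = sum(abs_err) / len(abs_err)
mape = sum(pct_err) / len(pct_err)
print(f"MAE  = {mae:.2f} degrees")
print(f"MAPE = {mape:.1f}%")
```

MAE summarizes the errors in degrees and stays interpretable, while the MAPE figure mostly reflects how close two of the true values happen to be to zero.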

Test Your Understanding

Compute MAPE for a single prediction where y_true=500 and y_pred=400. Now compute it where y_true=50 and y_pred=40. The absolute errors differ by a factor of ten, yet the MAPE is the same. What does this reveal about the metric?
In the anchor, sample 4 (y_true=120000) contributes the highest MAPE at 12.5%. Why does the smallest true value in the dataset drive the highest percentage error even with the same absolute error as sample 2?
Compute SMAPE for y_true=0 and y_pred=80. Is it defined? What is the value? What does a SMAPE of 200% mean in this context?
You are evaluating two models on a financial forecasting task where values range from −500 to +500. Why would MAPE be an inappropriate choice even if no value is exactly zero?
The M4 forecasting competition used sMAPE as one of its primary metrics. Suppose that optimizing SMAPE directly (using it as the training loss) leads to worse performance than training with MAE and then evaluating SMAPE. Propose a reason why SMAPE is difficult to optimize directly using gradient descent.
