Distance of a Point from a Plane
The distance from a point to a hyperplane is the geometric quantity behind Support Vector Machines, the margin concept, and the intuition for linear classifiers in general. Before reaching SVMs, you need this formula cold in 2D, 3D, and $p$ dimensions, and you need to understand what the sign of that distance tells you.
Distance from a Point to a Line (2D)
A line in 2D is written in general form as $a x_1 + b x_2 + c = 0$. The distance from a point $(x_1, x_2)$ to this line is the length of the perpendicular from the point to the line.
Why the perpendicular? Any other path from the point to the line is longer; the perpendicular is the shortest. The perpendicular meets the line at the point where the line's normal direction $(a, b)$ points from the line to $(x_1, x_2)$.
Working through the projection, the signed distance is:
$$d = \frac{a x_1 + b x_2 + c}{\sqrt{a^2 + b^2}}$$
The absolute value gives the distance; the sign tells you which side of the line the point is on.
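The projection step can be written out explicitly. This is a sketch under notation that is assumed here rather than taken from the text: $P = (x_1, x_2)$ is the point being measured, $Q = (q_1, q_2)$ is any point on the line (so $a q_1 + b q_2 + c = 0$), and $n = (a, b)$ is the line's normal vector.

$$
\begin{aligned}
d &= \frac{n \cdot (P - Q)}{\|n\|}
   = \frac{a (x_1 - q_1) + b (x_2 - q_2)}{\sqrt{a^2 + b^2}} \\
  &= \frac{a x_1 + b x_2 - (a q_1 + b q_2)}{\sqrt{a^2 + b^2}}
   = \frac{a x_1 + b x_2 + c}{\sqrt{a^2 + b^2}}
\end{aligned}
$$

The last step uses $a q_1 + b q_2 = -c$, which is why the choice of $Q$ drops out of the result.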
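Anchor setup: Predict loan default from income ($x_1$, in \$k) and debt ratio ($x_2$). The decision line is $0.5 x_1 - 3 x_2 - 20 = 0$, so $a = 0.5$, $b = -3$, $c = -20$, and the normal vector has length $\sqrt{0.5^2 + 3^2} = \sqrt{0.25 + 9} = \sqrt{9.25} \approx 3.04$.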
Computing distances for three anchor points:
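Point $(45, 0.35)$, income \$45k, debt ratio 0.35:
$$d = \frac{0.5(45) - 3(0.35) - 20}{3.04} = \frac{1.45}{3.04} \approx 0.477$$
Point $(31, 0.62)$, income \$31k, debt ratio 0.62:
$$d = \frac{0.5(31) - 3(0.62) - 20}{3.04} = \frac{-6.36}{3.04} \approx -2.09$$
Point $(95, 0.12)$, income \$95k, debt ratio 0.12:
$$d = \frac{0.5(95) - 3(0.12) - 20}{3.04} = \frac{27.14}{3.04} \approx 8.93$$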
| Point | Raw: $0.5 x_1 - 3 x_2 - 20$ | $\lvert \text{raw} \rvert$ | $\sqrt{a^2 + b^2}$ | Distance |
|---|---|---|---|---|
| (45, 0.35) | +1.45 | 1.45 | 3.04 | 0.477 |
| (31, 0.62) | −6.36 | 6.36 | 3.04 | 2.09 |
| (95, 0.12) | +27.14 | 27.14 | 3.04 | 8.93 |
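The table can be reproduced with a few lines of Python. This is a minimal sketch; the function name `signed_distance_2d` and the variable names are illustrative choices, not something defined in the text.

```python
import math

def signed_distance_2d(a, b, c, x1, x2):
    """Signed distance from (x1, x2) to the line a*x1 + b*x2 + c = 0."""
    return (a * x1 + b * x2 + c) / math.hypot(a, b)

# Anchor line from the text: 0.5*x1 - 3*x2 - 20 = 0
A, B, C = 0.5, -3.0, -20.0
anchor_points = [(45, 0.35), (31, 0.62), (95, 0.12)]

for x1, x2 in anchor_points:
    raw = A * x1 + B * x2 + C                    # numerator only
    d = signed_distance_2d(A, B, C, x1, x2)      # numerator / normal length
    print(f"({x1}, {x2}): raw = {raw:+.2f}, signed distance = {d:+.3f}")
```

Because the sketch divides by the unrounded $\sqrt{9.25}$ rather than 3.04, the last distance comes out near 8.92 instead of the table's 8.93.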
[Figure: the decision line 0.5x₁ − 3x₂ − 20 = 0 on axes of income ($k) versus debt ratio, with anchor points (45, 0.35) default, (31, 0.62) default, and (95, 0.12) no-default, and their perpendicular distances d = 0.48, d = 2.09, and d = 8.93.]
Signed Distance — Which Side of the Line?
The sign of $d$ directly identifies which side of the decision boundary a point is on:
- Positive ($d \approx +0.477$ for point (45, 0.35)): on the side where $0.5 x_1 - 3 x_2 - 20 > 0$. Classifier predicts class +1.
- Negative ($d \approx -2.09$ for point (31, 0.62)): on the opposite side. Classifier predicts class −1.
This signed quantity is exactly what SVMs use. The sign is the prediction; the magnitude is the confidence. A point 8.93 units from the decision boundary is much more confidently classified than one 0.48 units away.
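Reading the sign as the prediction and the magnitude as the confidence might look like this in code. It is a sketch only: the helper name `classify` is hypothetical, and sending points that lie exactly on the line to class −1 is an arbitrary assumption, not a rule from the text.

```python
import math

def classify(a, b, c, x1, x2):
    """Return (predicted class, confidence) for the line a*x1 + b*x2 + c = 0."""
    d = (a * x1 + b * x2 + c) / math.hypot(a, b)
    # Assumption: a point exactly on the line (d == 0) is assigned to class -1.
    label = 1 if d > 0 else -1
    return label, abs(d)

for point in [(45, 0.35), (31, 0.62), (95, 0.12)]:
    label, confidence = classify(0.5, -3.0, -20.0, *point)
    print(point, label, round(confidence, 3))
```

The tie-breaking rule matters only for points with zero distance, where the confidence is zero as well.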
Normalizing the Line Equation
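Dividing $a$, $b$, and $c$ by $\sqrt{a^2 + b^2}$ creates a normalized form where the denominator equals 1, simplifying the distance formula to just the numerator. For the anchor line, dividing by 3.04 gives:
$$0.1645\, x_1 - 0.9868\, x_2 - 6.579 = 0$$
Check for point $(45, 0.35)$:
$$d = 0.1645(45) - 0.9868(0.35) - 6.579 \approx 0.478$$
The small difference from 0.477 is rounding. Normalizing is why SVM theory often assumes $\|w\| = 1$: it makes the distance formula clean.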
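A short numerical check of the normalization idea, as a sketch: the helper name `normalize_line` is an illustrative assumption, and the code uses the unrounded norm $\sqrt{9.25}$ rather than 3.04, so its coefficients differ from hand-rounded ones in the last digit or two.

```python
import math

def normalize_line(a, b, c):
    """Scale (a, b, c) so that the normal vector (a, b) has length 1."""
    norm = math.hypot(a, b)
    return a / norm, b / norm, c / norm

a_n, b_n, c_n = normalize_line(0.5, -3.0, -20.0)

# With a unit normal, the signed distance is just the numerator.
d = a_n * 45 + b_n * 0.35 + c_n

print("normalized coefficients:", round(a_n, 4), round(b_n, 4), round(c_n, 4))
print("length of normal:", math.hypot(a_n, b_n))
print("signed distance of (45, 0.35):", round(d, 3))
```

The normalized line describes the same set of points as the original; only the scale of the coefficients changes.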
Extension to 3D: Distance from a Point to a Plane
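In 3D, the plane equation is $a x_1 + b x_2 + c x_3 + d = 0$, where $d$ is now the constant term rather than the distance. The distance formula for a point $(x_1, x_2, x_3)$ extends naturally:
$$\text{distance} = \frac{\lvert a x_1 + b x_2 + c x_3 + d \rvert}{\sqrt{a^2 + b^2 + c^2}}$$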
For the plane and point :
The geometry is identical to the 2D case: the denominator is the length of the normal vector to the plane, and dividing by it projects the point's displacement onto the unit normal.
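The 3D case can be sketched in code the same way. The plane and point below are hypothetical values picked only because the arithmetic is easy to follow by hand; they are not taken from the text. The function name `distance_to_plane` and the parameter name `d0` for the constant term are likewise illustrative.

```python
import math

def distance_to_plane(a, b, c, d0, point):
    """Unsigned distance from point = (x1, x2, x3) to a*x1 + b*x2 + c*x3 + d0 = 0."""
    x1, x2, x3 = point
    numerator = abs(a * x1 + b * x2 + c * x3 + d0)
    normal_length = math.sqrt(a * a + b * b + c * c)
    return numerator / normal_length

# Hypothetical example: plane 2*x1 + 2*x2 + x3 - 9 = 0, point (1, 2, 12).
# Numerator: |2 + 4 + 12 - 9| = 9; normal length: sqrt(4 + 4 + 1) = 3.
print(distance_to_plane(2, 2, 1, -9, (1, 2, 12)))  # 3.0
```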
Extension to p Dimensions: Distance from a Point to a Hyperplane
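In $p$ dimensions the hyperplane is $w^\top x + b = 0$ (here $b$ is the intercept, playing the role that $c$ played in the 2D form), and the distance from a point $x$ to it is:
$$d = \frac{\lvert w^\top x + b \rvert}{\|w\|}$$
This is the margin formula in SVMs. The margin between two classes is defined as $2 / \|w\|$: twice the distance from the decision hyperplane to the closest training point, once $w$ and $b$ are scaled so that those closest points satisfy $\lvert w^\top x + b \rvert = 1$. Maximizing the margin means minimizing $\|w\|$, which is the SVM optimization objective.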
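In code, the general case is a dot product divided by a norm. The sketch below uses NumPy; the function names `hyperplane_distance` and `margin_width` are illustrative, and `margin_width` assumes the canonical scaling in which the closest training points satisfy $\lvert w^\top x + b \rvert = 1$.

```python
import numpy as np

def hyperplane_distance(w, b, x):
    """Signed distance from x (one point or a batch of rows) to w @ x + b = 0."""
    w = np.asarray(w, dtype=float)
    x = np.asarray(x, dtype=float)
    return (x @ w + b) / np.linalg.norm(w)

def margin_width(w):
    """Margin 2 / ||w||, valid only under the canonical scaling assumption."""
    return 2.0 / np.linalg.norm(np.asarray(w, dtype=float))

# The 2D anchor line written in (w, b) form: w = (0.5, -3), b = -20.
w, b = [0.5, -3.0], -20.0
print(hyperplane_distance(w, b, [45, 0.35]))
print(hyperplane_distance(w, b, [[45, 0.35], [31, 0.62], [95, 0.12]]))
print(margin_width(w))
```

The same function works for any number of features, since nothing in it depends on the length of `w`.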
Distance Formula Reference
| Setting | Formula | Denominator |
|---|---|---|
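| Point to line (2D) | $\dfrac{\lvert a x_1 + b x_2 + c \rvert}{\sqrt{a^2 + b^2}}$ | Normal vector length $\sqrt{a^2 + b^2}$ |
| Point to plane (3D) | $\dfrac{\lvert a x_1 + b x_2 + c x_3 + d \rvert}{\sqrt{a^2 + b^2 + c^2}}$ | Normal vector length $\sqrt{a^2 + b^2 + c^2}$ |
| Point to hyperplane ($p$D) | $\dfrac{\lvert w^\top x + b \rvert}{\lVert w \rVert}$ | Weight vector norm $\lVert w \rVert$ |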
Related Concepts and Honest Limitations
This distance formula is the mathematical backbone of SVMs (margin maximization), the signed output in logistic regression (the log-odds is $w^\top x + b$), and the geometric interpretation of regularization (constraining $\|w\|$ constrains the margin). The formula requires the hyperplane to be parameterized in the form $w^\top x + b = 0$; not all representations make the normal vector explicit.
The limitation here is dimensionality. In high dimensions, points that look geometrically distant can actually cluster near the surface of a hypersphere (the curse of dimensionality). Distance-based reasoning becomes unreliable when $p$ is large, which is part of why SVMs use kernel tricks to work in feature spaces rather than raw input spaces.
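The clustering effect can be seen in a small simulation. This is an illustrative sketch, not a benchmark: the uniform sampling on $[-1, 1]^p$, the sample size, the seed, and the "relative contrast" measure (spread of distances from the origin relative to the smallest one) are all assumptions made for the demonstration.

```python
import numpy as np

def relative_contrast(n_points, dim, seed=0):
    """(max - min) / min of distances from the origin for uniform random points."""
    rng = np.random.default_rng(seed)
    points = rng.uniform(-1.0, 1.0, size=(n_points, dim))
    dists = np.linalg.norm(points, axis=1)
    return (dists.max() - dists.min()) / dists.min()

# As the dimension grows, the distances bunch together and the contrast shrinks.
for dim in (2, 10, 100, 1000):
    print(dim, round(relative_contrast(500, dim), 3))
```

A shrinking contrast means the nearest and farthest points sit at almost the same distance, which is what makes raw distances less informative.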
Test Your Understanding
- For the decision line , compute the distance for the point using the formula. Which class does the sign predict?
- If you multiply both sides of the line equation by $-1$ (so $-0.5 x_1 + 3 x_2 + 20 = 0$), does the distance change? Does the signed distance change?
- In SVM, the margin is $2 / \|w\|$. If $\|w\| = 3.04$ as in our anchor, what is the margin? To double the margin, what would you need to do to $\|w\|$?
- The normalization step changes $(a, b, c)$ to $(a, b, c) / \sqrt{a^2 + b^2}$. If you use the normalized coefficients in the distance formula, why does the denominator disappear?
- A data point lies exactly on the decision hyperplane. What is its signed distance? What prediction does an SVM make for it, and why is this a problem in practice?