
Distance of a Point from a Plane

Jun 25, 2026•6 min read•By Mohammed Vasim

Machine Learning, AI, Data Science

The distance from a point to a hyperplane is the single geometric quantity that underpins Support Vector Machines, the margin concept, and the intuition behind every linear classifier. Before reaching SVMs, you need this formula cold — in 2D, 3D, and $p$ dimensions — and you need to understand what the sign of that distance tells you.

Distance from a Point to a Line (2D)

A line in 2D is written in general form as $a x + b y + c = 0$ . The distance from a point $(x_{0}, y_{0})$ to this line is the length of the perpendicular from the point to the line.

Why the perpendicular? Any other path from the point to the line is longer; the perpendicular is the shortest. It runs along the line's normal vector $n = (a, b)$ , meeting the line at the one point from which $n$ (or $- n$ ) points straight at $(x_{0}, y_{0})$ .

Working through the projection, the signed distance is:

$d = \frac{a x_{0} + b y_{0} + c}{\sqrt{a^{2} + b^{2}}}$

The absolute value $∣ d ∣$ gives the distance; the sign tells you which side of the line the point is on.
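
The projection step can be sketched as follows. Here $(x_{1}, y_{1})$ is a symbol introduced only for this derivation: it stands for any point on the line, so that $a x_{1} + b y_{1} + c = 0$ . Projecting the displacement from that point to $(x_{0}, y_{0})$ onto the unit normal $n / ∥ n ∥$ gives:

$d = \frac{n \cdot ((x_{0}, y_{0}) - (x_{1}, y_{1}))}{∥ n ∥} = \frac{a (x_{0} - x_{1}) + b (y_{0} - y_{1})}{\sqrt{a^{2} + b^{2}}} = \frac{a x_{0} + b y_{0} - (a x_{1} + b y_{1})}{\sqrt{a^{2} + b^{2}}} = \frac{a x_{0} + b y_{0} + c}{\sqrt{a^{2} + b^{2}}}$

The last step uses $a x_{1} + b y_{1} = - c$ , which is why the result does not depend on which point of the line was chosen.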

Anchor setup: Predict loan default from income ( $x_{1}$ , in \$k) and debt ratio ( $x_{2}$ ). A candidate decision line is $0.5 x_{1} - 3 x_{2} - 20 = 0$ , so $a = 0.5$ , $b = -3$ , $c = -20$ . The denominator is $\sqrt{0.5^2 + 3^2} = \sqrt{0.25 + 9} = \sqrt{9.25} \approx 3.04$ .

Computing distances for three anchor points:

Point $(45, 0.35)$ (income \$45k, debt ratio 0.35):

$d = \frac{0.5 \times 45 + ( - 3 ) \times 0.35 + ( - 20 )}{3.04} = \frac{22.5 - 1.05 - 20}{3.04} = \frac{1.45}{3.04} \approx 0.477$

Point $(31, 0.62)$ (income \$31k, debt ratio 0.62):

$d = \frac{0.5 \times 31 + ( - 3 ) \times 0.62 + ( - 20 )}{3.04} = \frac{15.5 - 1.86 - 20}{3.04} = \frac{- 6.36}{3.04} \approx - 2.09$

Point $(95, 0.12)$ (income \$95k, debt ratio 0.12):

$d = \frac{0.5 \times 95 + ( - 3 ) \times 0.12 + ( - 20 )}{3.04} = \frac{47.5 - 0.36 - 20}{3.04} = \frac{27.14}{3.04} \approx 8.93$

Point	Raw: $a x_{0} + b y_{0} + c$	$∣ raw ∣$	$\sqrt{a^{2} + b^{2}}$	Distance $∣ d ∣$
(45, 0.35)	+1.45	1.45	3.04	0.477
(31, 0.62)	−6.36	6.36	3.04	2.09
(95, 0.12)	+27.14	27.14	3.04	8.93
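
The three hand computations can be checked with a short script. This is a minimal sketch, not a library routine: the function name `signed_distance` and the variable names are chosen here for illustration.

```python
import math

def signed_distance(a, b, c, x0, y0):
    """Signed perpendicular distance from (x0, y0) to the line a*x + b*y + c = 0."""
    return (a * x0 + b * y0 + c) / math.hypot(a, b)

# Decision line from the anchor setup: 0.5*x1 - 3*x2 - 20 = 0
A, B, C = 0.5, -3.0, -20.0
anchor_points = [(45, 0.35), (31, 0.62), (95, 0.12)]

distances = {p: signed_distance(A, B, C, *p) for p in anchor_points}
for point, d in distances.items():
    print(f"{point}: signed d = {d:+.3f}, |d| = {abs(d):.3f}")
```

Because the script divides by the unrounded $\sqrt{9.25}$ rather than 3.04, the last distance comes out near 8.92 instead of 8.93; the other two agree with the hand calculation to the digits shown.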

[Figure: scatter plot of income (\$k) against debt ratio, showing the decision line $0.5 x_{1} - 3 x_{2} - 20 = 0$ and the perpendicular distances of the three anchor points: (45, 0.35), labeled default, d = 0.48; (31, 0.62), labeled default, d = 2.09; (95, 0.12), labeled no-default, d = 8.93.]

Signed Distance — Which Side of the Line?

The sign of $a x_{0} + b y_{0} + c$ directly identifies which side of the decision boundary a point is on:

Positive ( $+ 1.45$ for point (45, 0.35)): on the side where $0.5 x_{1} - 3 x_{2} - 20 > 0$ . Classifier predicts class +1.
Negative ( $- 6.36$ for point (31, 0.62)): on the opposite side. Classifier predicts class −1.

This signed quantity is exactly what SVMs use. The sign is the prediction; the magnitude is the confidence. A point 8.93 units from the decision boundary is much more confidently classified than one 0.48 units away.
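
The sign-as-prediction, magnitude-as-confidence reading can be sketched in a few lines. The function `classify` is hypothetical, and mapping a score of exactly zero to +1 is an arbitrary tie-break assumed here, not something the formula dictates.

```python
import math

def classify(w, w0, x):
    """Return (label, confidence) for point x against the boundary w.x + w0 = 0.

    label is +1 or -1 from the sign of the raw score (a score of 0 is mapped
    to +1, an arbitrary tie-break); confidence is the unsigned distance.
    """
    score = sum(wi * xi for wi, xi in zip(w, x)) + w0
    distance = abs(score) / math.sqrt(sum(wi * wi for wi in w))
    label = 1 if score >= 0 else -1
    return label, distance

w, w0 = (0.5, -3.0), -20.0
results = {p: classify(w, w0, p) for p in [(45, 0.35), (31, 0.62), (95, 0.12)]}
for point, (label, confidence) in results.items():
    print(point, label, round(confidence, 2))
```

Only the numerator decides the label; the division by the norm rescales the score into a distance so that confidences are comparable across differently scaled weight vectors.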

Normalizing the Line Equation

Dividing $a$ , $b$ , $c$ by $\sqrt{a^{2} + b^{2}}$ creates a normalized form where the denominator equals 1, simplifying the distance formula to just the numerator:

$a^{'} = \frac{0.5}{3.04} = 0.164, b^{'} = \frac{- 3}{3.04} = - 0.987, c^{'} = \frac{- 20}{3.04} = - 6.58$

Check for point $(45, 0.35)$ :

$∣0.164 \times 45 + (- 0.987) \times 0.35 - 6.58∣ = ∣7.38 - 0.345 - 6.58∣ = ∣0.455∣ = 0.455$

The small difference from 0.477 comes from rounding the normalized coefficients to three significant figures; carrying more digits recovers 0.477. Normalizing is why some presentations of SVM theory assume $∥ w ∥ = 1$ : it makes the distance formula clean.
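
A minimal sketch of the normalization step, carried out without intermediate rounding. The helper name `normalize_line` is an assumption made for this example.

```python
import math

def normalize_line(a, b, c):
    """Scale (a, b, c) so that the normal vector (a, b) has length 1."""
    norm = math.hypot(a, b)
    return a / norm, b / norm, c / norm

a_n, b_n, c_n = normalize_line(0.5, -3.0, -20.0)

# With a unit normal, the signed distance is just the left-hand side
# of the line equation evaluated at the point.
d = a_n * 45 + b_n * 0.35 + c_n

print(round(a_n, 3), round(b_n, 3), round(c_n, 3))
print(round(math.hypot(a_n, b_n), 6), round(d, 3))
```

Keeping full floating-point precision in the coefficients avoids the rounding drift seen in the hand check, so the result matches the distance computed from the unnormalized line.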

Extension to 3D: Distance from a Point to a Plane

In 3D, the plane equation is $a x + b y + cz + d = 0$ . The distance formula extends naturally:

$dist = \frac{∣ a x_{0} + b y_{0} + c z_{0} + d ∣}{\sqrt{a^{2} + b^{2} + c^{2}}}$

For the plane $x + 2 y - 3 z + 6 = 0$ and point $P = (1, 2, 3)$ :

$Numerator: ∣1 \times 1 + 2 \times 2 + (- 3) \times 3 + 6∣ = ∣1 + 4 - 9 + 6∣ = ∣2∣ = 2$

$Denominator: \sqrt{1^{2} + 2^{2} + (- 3)^{2}} = \sqrt{14} \approx 3.742$

$dist = \frac{2}{3.742} \approx 0.535$

The geometry is identical to the 2D case: the denominator is the length of the normal vector to the plane, and dividing by it projects the point's displacement onto the unit normal.
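
The 3D worked example translates directly into code. This is an illustrative sketch; `point_plane_distance` is a name chosen here, not a function from any library.

```python
import math

def point_plane_distance(a, b, c, d, point):
    """Unsigned distance from (x0, y0, z0) to the plane a*x + b*y + c*z + d = 0."""
    x0, y0, z0 = point
    numerator = abs(a * x0 + b * y0 + c * z0 + d)
    denominator = math.sqrt(a * a + b * b + c * c)  # length of the normal (a, b, c)
    return numerator / denominator

dist = point_plane_distance(1, 2, -3, 6, (1, 2, 3))
print(f"{dist:.3f}")  # prints 0.535
```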

Extension to p Dimensions: Distance from a Point to a Hyperplane

In $p$ dimensions the hyperplane is $w \cdot x + w_{0} = 0$ , and the distance from a point $x_{0}$ to it is:

$d = \frac{∣ w \cdot x _{0} + w _{0} ∣}{∥ w ∥}$

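This is the formula behind the SVM margin. When $w$ and $w_{0}$ are scaled so that the closest training points satisfy $∣ w \cdot x + w_{0} ∣ = 1$ , each of those points lies at distance $\frac{1}{∥ w ∥}$ from the hyperplane, so the margin between the two classes is $\frac{2}{∥ w ∥}$ : twice the distance from the closest training point to the decision hyperplane. Maximizing the margin therefore means minimizing $∥ w ∥$ , which is the SVM optimization objective.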
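
A dimension-agnostic version of the formula can be sketched with plain Python lists. The weight vector and point below are hypothetical values picked so that $∥ w ∥ = 5$ and the arithmetic is easy to follow. `margin_width` assumes the usual scaling in which the closest training points satisfy $∣ w \cdot x + w_{0} ∣ = 1$ .

```python
import math

def hyperplane_distance(w, w0, x):
    """Unsigned distance from x to the hyperplane w.x + w0 = 0 in any dimension."""
    if len(w) != len(x):
        raise ValueError("w and x must have the same number of components")
    norm = math.sqrt(sum(wi * wi for wi in w))
    return abs(sum(wi * xi for wi, xi in zip(w, x)) + w0) / norm

def margin_width(w):
    """Margin 2/||w||, assuming |w.x + w0| = 1 at the closest training points."""
    return 2.0 / math.sqrt(sum(wi * wi for wi in w))

# Hypothetical 4-dimensional example with ||w|| = sqrt(9 + 16) = 5.
w = [3.0, 0.0, 4.0, 0.0]
w0 = -5.0
x = [1.0, 7.0, 3.0, -2.0]

print(hyperplane_distance(w, w0, x))  # |3 + 12 - 5| / 5
print(margin_width(w))
```

Components of $x$ that pair with zero weights (here the second and fourth) have no effect on the distance, which is one way to see that only the direction of $w$ matters for the geometry.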

Distance Formula Reference

Setting	Formula	Denominator
Point to line (2D)	$∣ a x_{0} + b y_{0} + c ∣ / \sqrt{a^{2} + b^{2}}$	Normal vector length
Point to plane (3D)	$∣ a x_{0} + b y_{0} + c z_{0} + d ∣ / \sqrt{a^{2} + b^{2} + c^{2}}$	Normal vector length
Point to hyperplane ( $p$ D)	$∣ w \cdot x_{0} + w_{0} ∣ / ∥ w ∥$	Weight vector norm

Related Concepts and Honest Limitations

This distance formula is the mathematical backbone of SVMs (margin maximization), of the signed output in logistic regression (the log-odds is $w \cdot x + w_{0}$ ), and of the geometric interpretation of regularization (constraining $∥ w ∥$ constrains the margin). The formula requires the hyperplane to be parameterized in the $w \cdot x + w_{0} = 0$ form; not all representations make the normal vector explicit.

The limitation here is dimensionality. In high dimensions, distances concentrate: randomly scattered points end up at nearly the same distance from one another, as if they all sat near the surface of a hypersphere (one face of the curse of dimensionality). Distance-based reasoning becomes unreliable when $p ≫ n$ , which is part of why SVMs use kernel tricks to work in feature spaces rather than raw input spaces.
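
Distance concentration is easy to see in a small simulation. The sketch below is an illustration under assumed settings (uniform random points in the unit cube, 200 points, dimensions 2 and 1000), not an experiment from this post. It measures the relative contrast (max - min) / min of the distances from one query point to the others; the measure and the name `relative_contrast` are choices made for this example.

```python
import math
import random

def relative_contrast(n_points, dim, rng):
    """(max - min) / min of distances from a random query to n random points in [0, 1]^dim."""
    query = [rng.random() for _ in range(dim)]
    dists = []
    for _ in range(n_points):
        point = [rng.random() for _ in range(dim)]
        dists.append(math.dist(query, point))
    return (max(dists) - min(dists)) / min(dists)

rng = random.Random(0)  # fixed seed so the run is repeatable
contrast_low = relative_contrast(200, 2, rng)
contrast_high = relative_contrast(200, 1000, rng)
print(f"p = 2:    {contrast_low:.2f}")
print(f"p = 1000: {contrast_high:.2f}")
```

In two dimensions the farthest point is typically many times farther away than the nearest one, while in a thousand dimensions the nearest and farthest points sit at almost the same distance, so "near" and "far" carry much less information.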

Test Your Understanding

For the decision line $0.5 x_{1} - 3 x_{2} - 20 = 0$ , compute the distance for the point $(67, 0.28)$ using the formula. Which class does the sign predict?
If you multiply both sides of the line equation by $- 1$ (so $- 0.5 x_{1} + 3 x_{2} + 20 = 0$ ), does the distance change? Does the signed distance change?
In SVM, the margin is $2/∥ w ∥$ . If $w = [0.5, - 3]$ as in our anchor, what is the margin? To double the margin, what would you need to do to $w$ ?
The normalization step changes $a = 0.5$ to $a^{'} = 0.164$ . If you use the normalized coefficients in the distance formula, why does the denominator disappear?
A data point lies exactly on the decision hyperplane. What is its signed distance? What prediction does an SVM make for it, and why is this a problem in practice?
