Perpendicular distance

Having flirted with array languages in the past, I've always held a passive curiosity about linear algebra but never arrived at a practical use for it. I've recently been playing with polygon simplification via the Ramer–Douglas–Peucker algorithm and found myself bumping up against vectors and dot products, so I finally had to hit the books, if ever so gently. Casual notes follow. I think some kids learn this stuff in high school, but I clearly wasn't paying attention back then.

Curve flattening

I won't bore you with the full recursive process here. We start with a given subsection of a polygon: a curve with a length of n.

curve 1

We then draw a line from curve₀ to curve_n, representing a hypothetical flattening of this part of the polygon into a straight line. This line (or vector) is called CD.

curve 2

We then have to find a point on the curve that is the furthest perpendicular distance from CD. We can visually confirm that this is curve₇, but this is where I had to study some new math.

curve 3

In order to calculate this distance, we draw another vector AB to a given point, and then calculate the length of its projection upon CD to determine the perpendicular distance of a point.

curve 4

If the distance is within an acceptable compression error boundary ε, then this entire section of the curve can be replaced by CD. If not, the algorithm will recurse on two sub-curves split at this furthest point.

It's visually simple, but getting that perpendicular distance isn't trivial.

Basic vectors

Here's where I freestyle, theoretical rigor be damned. Two Cartesian points comprise CD: (0,0) and (10, 10). That's four pieces of data. We can instead encode this as only the x-offset from the originating point, and its equivalent y-offset: (10, 10). This is a 2D vector.

CD = (curve_xn - curve_x0 , curve_yn - curve_y0)
CD = (10 , 10)

In a sense, vectors are both compression and generalization. We've gone from four points to two offsets, and also uprooted them from any particular location on the plane. Vectors can be compared against one another irrespective of their origins.

Every vector has direction (where its pointing) and magnitude (how far its going). These two offsets encode both.

Magnitude

Magnitude (represented by |CD|) is perhaps more obvious; it's just the length of the slope CD in the graph, and can be arrived at with good old Pythagoras.

|CD| = √(CD_x² + CD_y²)
|CD| = 14.1421…

A vector's magnitude can be scaled by a single value, unsurprisingly known as a scalar. An arithmetic operation ⊙ between a scalar S and a A is just S⊙ applied to all vector members.

S ⊙ A = (S ⊙ A_x , S ⊙ A_y)
2 × CD = (2 × 10 , 2 × 10)
2 × CD = (20 , 20)

Pretty straightforward.

Direction

Direction can likewise be isolated from a vector. (10, 10) gives x and y offsets, which is certainly a direction, but these offsets are scaled by their magnitude. To arrive at "pure direction", we must derive a unit vector, which encodes the pure ratio of x and y direction for every unit of distance. Unit vectors are usually written with a little accent circonflexe on top.

|CD| = 14.1421…
ĈD = CD ÷ |CD|
ĈD = (10 ÷ 14.1421 , 10 ÷ 14.1421)
ĈD = (0.70710678 , 0.70710678)

This is the vector's unscaled offset within one unit of distance, hence the name. It has additional curious properties that will come up later.

Sometimes it helps to think of an offset as a scalar multiplied against this unit, so that each of these values may be manipulated in isolation.

Dot product

This little tuple already encodes a lot of useful information, but even more data reveals itself when combined with other vectors. Something cool happens when we reduce two vectors into a scalar value with a dot product.

A ∙ B = (A_x × B_x) + (A_y × B_y)

Believe it or not, this scalar value encodes the angle θ between A and B—not an immediate value in degrees, though we could certainly calculate that with enough effort. I'll demonstrate visually in a moment, but it could maybe be thought of as:

Scaling one x offset against the other.
Scaling one y offset against the other.
Combining them to see how they relate.

That's still vague, I know.

If the product is negative, then θ is obtuse.
If the product is positive, then θ is acute.
If the product is 0, then θ = 90° and A is perpendicular to B.

That third point should be explored further. What else is zero when its θ input is 90°? The cosine.

cos(90°) = 0°

So it might be reasonable to say that

A ∙ B = cos(θ)

Is this true? Let's plot out two vectors with a zero dot product and see if they draw a right angle.

A = (3, 4)
B = (8, -6)
A ∙ B = (3 × 8) + (4 × -6)
A ∙ B = 24 - 24
A ∙ B = 0

We should expect θ between A and B to be 90°.

curve 5

The image checks out, but the equation is missing something that the graph displays: that B is twice the magnitude of A. Recall the wishy washy language of "scaling one offset against the other"—here's some more: the directional relation between A and B is "smeared" by their respective magnitudes. If the dot product relation were revised to take magnitudes into play, it might read as

A ∙ B = |A| |B| cos(θ)

You can see how even when the pollution of magnitudes is represented, the dot product gleans the same angular relation from the two vectors.

A = (3, 4)
B = (8, -6)
|A| = 5
|B| = 10
A ∙ B = 0
A ∙ B = |A| |B| cos(θ)
cos(θ) = (A ∙ B) ÷ |A| |B|
cos(θ) = 0 ÷ (5 × 10)
cos(θ) = 0
θ = arccos(0°)
θ = 90°

So any scaling is corrected in the dot vector process, but is there a way to represent A and B "unstretched" in the first place? Recall that unit vectors encode "pure direction." Could that be the key? There is no circonflexe over B, so I'll write its unit vector as Ḃ.

Â = A ÷ |A|
Â = (0.6, 0.8)
Ḃ = B ÷ |B|
Ḃ = (0.8, -0.6)

Plotted.

curve 6

It's just as accurate, but simplifies some future algebra. Why? The magnitude of a unit vector is always 1.

|Â| = √(0.6² + 0.8²)
|Â| = 1
|Ḃ| = √(0.8² + -0.6²)
|Ḃ| = 1

This makes sense because there is no scale associated with these offsets. Therefore

Â ∙ Ḃ = |Â| |Ḃ| cos(θ)
Â ∙ Ḃ = 1 × 1 × cos(θ)
Â ∙ Ḃ = cos(θ)

Remember this.

Vector projection

After that lengthy digression, we're actually close to the practical implementation: projecting A onto B in order to calculate perpendicular distance. Projection is often thought of as the "shadow cast" by A upon B: a process in which θ plays a crucial role.

curve 7

Consider how

|A| = 5
|B| = 8

But due to the angle between them, the shadow cast by A upon B does not take up 5÷8 or 62.5% of B. Knowing what percentage of B is actually encompassed by A's shadow would let us draw yet another vector between A and B's intersection point; its magnitude will be our perpendicular distance.

Look at the shadow again though; it forms a right triangle, with

A as the hypotenuse
A's projected shadow (we'll call it υ) as the adjacent
The dotted perpendicular vector as the opposite.

Recall the definition of the cosine: the measure of the adjacent (|υ|) over the measure of the hypotenuse (|A|). So

cos(θ) = |υ| ÷ |A|

But remember our earlier definition using the dot product.

cos(θ) = (A ∙ B) ÷ (|A| |B|)

We can set them equal to one another then.

|υ| ÷ |A| = (A ∙ B) ÷ (|A| |B|)
|υ| = (A ∙ B) ÷ |B|

Here is the magnitude of the projection then! But we need to know where the projection hits B. If we were to apply this calculated magnitude |υ| to the direction of B, we'd wind up with a full projection vector of A upon B. Recall that the "pure direction" of B is its unit vector Ḃ.

Proj_AB = ((Α ∙ B) ÷ |B|) × Ḃ

Or if we were to break Ḃ into its own definition

Proj_AB = ((Α ∙ B) ÷ |B|) × (B ÷ |B|)
Proj_AB = ((Α ∙ B) ÷ |B|²) × B

This is the full projection formula, but we don't actually need it all. You can see that it's a scalar value—the dot product divided by B's square magnitude—multiplied into B to scale it down to the end of the shadow, but this scalar itself is the "percentage of B" we seek.

A = (3,4)
B = (8,0)
|A| = 5
|B| = 8
A ∙ B = 24
Proj_AB = (24 ÷ 8²) × B
Proj_AB = 0.375 × B
υ = (3, 0)

We can visually confirm that A's shadow stretches from (0,0) to (3,0), and that it's more or less 37.5% of B.

We can draw a new vector (subtracting) between the end of A (3,4) and this intersection point (3,0) to get a vector (0, 4) representing the opposite side of the shadow's right triangle. Its magnitude is our perpendicular distance: 4. If this is within a user-defined ε, we could collapse the curve to a straight line.

As code

No need to use wanky math libraries here. Just declare each value separately.

/* curve is an array of floating point tuples with no intersections  */

curve = [(0.0, 0.0), ...]
n = length(curve)

/* start and end of the curve */
x1 = curve[0].x
x2 = curve[n-1].x
y1 = curve[0].y
y2 = curve[n-1].y

/* x,y coords of point of measurement */
i = /* current index to measure */
px = curve[i].x
py = curve[i].y

/* vector AB from curve origin to point in question */
a = px - x1
b = py - y1

/* vector CD representing hypothetical collapse of entire curve */
c = x2 - x1
d = y2 - y1

/* AB ∙ CD: angle between the two vectors */
dot = (a * c) + (b * d)

/* |CD|²: Squaring a magnitude means undoing its square root operation */
cd2 = (c * c) + (d * d)

/* (AB ∙ CD) ÷ |CD|²: the scaling factor of the projection of AB onto CD */
/* In other words, the percentage of CD where a perpendicular
   line drawn from AB would intersect with CD */
/* Might have to guard against divide by zero here */
scalingParam = dot / cd2

/* Depending upon the cleanliness of your input curve, you might have to "snap"
   this scalingParam to a proper percentage between [0.0, 1.0], but for legal
   curves this should not be needed. */
/* Now multiply the origin point by scalingParam to find the projection point
   upon CD */
ix = x1 + param * c
iy = y1 + param * d

/* Draw a new vector representing the distance from the focus point to its 
   projection intersection with CD */
dx = px - ix
dy = py - iy

/* |(dx, dy)| will give the perfect perpendicular distance to CD, but
   we can avoid an expensive square root if the epsilon we're measuring
   against is itself squared. */
ε² = /* user input */²
d² = (dx * dx) + (dy * dy)

/* collapse entire curve to CD if within the error boundary */
if (d² < ε²) {
  [(x1, y1), (x2, y2)]
} else {
  /* Recurse with two sub-curve arrays split and copied at curve[i] */
}

All those numbers would've made no fucking sense to me without playing around first.