Surveyor’s formula for area of a polygon

If you know the vertices of a polygon, how do you compute its area? This seems like it could be complicated, with special cases for whether the polygon is convex or maybe other considerations. But as long as the polygon is “simple,” i.e. the sides meet at vertices but otherwise do not intersect each other, then there is a general formula for the area.

The formula

List the vertices starting anywhere and moving counterclockwise around the polygon: (x₁, y₁), (x₂, y₂), …, (x_n, y_n). Then the area is given by the formula below.

$A = \frac{1}{2} \begin{vmatrix} x_1 & x_2 & \ldots & x_n & x_1\\ y_1 & y_2 & \ldots & y_n & y_1 \end{vmatrix}$

But what does that mean? The notation is meant to be suggestive of a determinant. It’s not literally a determinant because the matrix isn’t square. But you evaluate it in a way analogous to 2 by 2 determinants: add the terms going down and to the right, and subtract the terms going up and to the right. That is,

x₁ y₂ + x₂ y₃ + … + x_n y₁ – y₁ x₂ – y₂ x₃ – … – y_n x₁
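As a quick check of the pattern, here is a small worked example in Python for the triangle with vertices (0, 0), (3, 0), (0, 3), listed counterclockwise. The variable names are illustrative only; the two sums correspond to the terms that are added and subtracted above.

```python
# Vertices of a right triangle, listed counterclockwise,
# with the first vertex repeated at the end as in the formula.
xs = [0.0, 3.0, 0.0, 0.0]
ys = [0.0, 0.0, 3.0, 0.0]

# Products going down and to the right: x_i * y_{i+1}
down = sum(xs[i] * ys[i + 1] for i in range(3))
# Products going up and to the right: y_i * x_{i+1}
up = sum(ys[i] * xs[i + 1] for i in range(3))

triangle_area = 0.5 * (down - up)
print(triangle_area)  # 4.5, half of base 3 times height 3
```

Listing the same vertices clockwise swaps the roles of the two sums and gives -4.5, so the sign of the result reflects the orientation.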

This formula is sometimes called the shoelace formula because the pattern of multiplications resembles lacing a shoe. It’s also called the surveyor’s formula because it’s obviously useful in surveying.

Update: Here is an analogous formula for centroid.

Numerical implementation

As someone pointed out in the comments, in practice you might want to subtract the minimum x value from all the x coordinates and the minimum y value from all the y coordinates before using the formula. Why’s that?

If you add a constant amount to each vertex, you move your polygon but you don’t change the area. So in theory it makes no difference whether you translate the polygon before computing its area. But in floating point, it can make a difference.

The cardinal rule of floating point arithmetic is to avoid subtracting nearly equal numbers. If you subtract two numbers that agree to k significant figures, you can lose up to k figures of precision. We’ll illustrate this by taking a right triangle with base 2 and height π. The area should remain π as we translate the triangle away from the origin, but we lose precision the further out we translate it.

Here’s a Python implementation of the shoelace formula.

    def area(x, y):
        n = len(x)
        s = 0.0
        for i in range(-1, n-1):
            s += x[i]*y[i+1] - y[i]*x[i+1]
        return 0.5*s

If you’re not familiar with Python, a couple of things require explanation. First, range(-1, n-1) produces the integers starting at -1 and stopping before n-1. Second, the -1 index returns the last element of the array, so the first pass through the loop pairs the last vertex with the first.
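A small illustration of the indexing, separate from the area code. The list v is made up for the example.

```python
v = ["a", "b", "c", "d"]
n = len(v)

# range(-1, n-1) yields -1, 0, 1, ..., n-2
indices = list(range(-1, n - 1))
print(indices)  # [-1, 0, 1, 2]

# Index -1 refers to the last element, so the pairs (v[i], v[i+1])
# start with (last, first) and then walk through consecutive neighbors.
pairs = [(v[i], v[i + 1]) for i in indices]
print(pairs)  # [('d', 'a'), ('a', 'b'), ('b', 'c'), ('c', 'd')]
```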

Now, watch how the precision of the area degrades as we shift the triangle by powers of 10.

    import numpy as np

    x = np.array([0.0, 0.0, 2.0])
    y = np.array([np.pi, 0.0, 0.0])

    for n in range(0, 10):
        t = 10**n
        print( area(x+t, y+t) )

This produces

    3.141592653589793
    3.1415926535897825
    3.1415926535901235
    3.1415926535846666
    3.141592651605606
    3.1415929794311523
    3.1416015625
    3.140625
    3.0
    0.0

Shifting by small amounts doesn’t make a noticeable difference, but we lose between one and two significant figures each time we increase t by a factor of 10. We only have between 15 and 16 significant figures to start with in a standard floating point number, and so eventually we completely run out of significant figures.

When implementing the shoelace formula, we want to do the opposite of this example: instead of shifting coordinates so that they’re similar in size, we want to shift them toward the origin so that they’re not similar in size.
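Here is a minimal sketch of that idea, assuming plain Python lists of coordinates. The helper name area_shifted is hypothetical, and subtracting the minimum is just one choice of offset.

```python
import math

def area(x, y):
    """Shoelace formula for vertices listed counterclockwise."""
    n = len(x)
    s = 0.0
    for i in range(-1, n - 1):
        s += x[i] * y[i + 1] - y[i] * x[i + 1]
    return 0.5 * s

def area_shifted(x, y):
    """Translate the polygon toward the origin, then apply the formula.

    area_shifted is a hypothetical helper, not part of any library.
    """
    x0, y0 = min(x), min(y)
    return area([xi - x0 for xi in x], [yi - y0 for yi in y])

# The same right triangle, translated far from the origin.
t = 1.0e9
x = [0.0 + t, 0.0 + t, 2.0 + t]
y = [math.pi + t, 0.0 + t, 0.0 + t]

print(area(x, y))          # degraded by cancellation
print(area_shifted(x, y))  # close to pi
```

The shifted result is only accurate to about seven decimal places here. Adding t already rounded π to the precision available near 10⁹, and shifting back cannot restore digits that the coordinates no longer contain. What the shift avoids is the further loss inside the formula itself.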

15 thoughts on “How to compute the area of a polygon”

MikeD

26 September 2018 at 16:18

If you draw it out, it’s just the trapezoidal rule that adds up the area under the curve and then subtracts it when the polygon “goes backwards” along x.

You can also get the higher moments of the polygon this way too. I use it all the time to compute the moment of inertia.

And when dealing with geospatial polygons that have large floating point values, be sure to subtract off the min x and min y so that you’re not adding and subtracting lots of large values and losing precision.

Mike Anderson

26 September 2018 at 21:00

OK, that’s a formula that’s dead simple to learn–just remember the name–but no one teaches. I’m beginning to think my public school math teachers were all imposters….

Yakov Shklarov

26 September 2018 at 22:38

In the limit, this becomes the formula for the area enclosed by a smooth simple closed curve, (1/2) int(xdy – ydx).
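One way to see the limit numerically, as a sketch only: apply the shoelace formula to regular polygons inscribed in the unit circle and watch the area approach π as the number of vertices grows. The vertex counts are arbitrary.

```python
import math

def area(x, y):
    """Shoelace formula for vertices listed counterclockwise."""
    n = len(x)
    s = 0.0
    for i in range(-1, n - 1):
        s += x[i] * y[i + 1] - y[i] * x[i + 1]
    return 0.5 * s

for n in (10, 100, 1000):
    # n equally spaced points on the unit circle, counterclockwise
    angles = [2 * math.pi * k / n for k in range(n)]
    x = [math.cos(a) for a in angles]
    y = [math.sin(a) for a in angles]
    print(n, area(x, y))
```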

Carlos Luna

27 September 2018 at 10:21

One of my favorite things about python’s indexing is that you can do something like:

for i in range(-1, n-1):
    s += x[i]*y[i+1] - y[i]*x[i+1]

Instead of:

for i in range(n-1):
    s += x[i]*y[i+1] - y[i]*x[i+1]
s += x[-1]*y[0] - y[-1]*x[0]

John

27 September 2018 at 10:36

Carlos: That’s clever. I’ll use that.

MikeD

27 September 2018 at 10:50

I don’t use python, but from what I’ve seen with numpy, you can probably vectorize all of this and stay out of the interpreter. Here’s how I do it in IDL.

x1=x-x[0] ;subtract the offset rather than searching the array for the minimum, I just use the first vertex.
y1=y-y[0]
x2=shift(x1,-1) ;shift the array one unit in a circular buffer
y2=shift(y1,-1)
D=x1*y2-x2*y1 ;D is now an array of all of the differences
S=total(D) ;sum it

I bet this can be done in numpy too. I can process millions of polygons really fast this way.
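A possible NumPy translation of the IDL snippet, offered as a sketch. The function name area_vectorized is hypothetical, np.roll plays the role of the circular shift, and the factor of 1/2 is added so that the function returns the area rather than the raw sum.

```python
import numpy as np

def area_vectorized(x, y):
    """Vectorized shoelace formula; the function name is hypothetical."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Subtract the first vertex as an offset, as in the IDL code.
    x1 = x - x[0]
    y1 = y - y[0]
    # np.roll(a, -1) shifts circularly so element i holds the old element i+1.
    x2 = np.roll(x1, -1)
    y2 = np.roll(y1, -1)
    return 0.5 * np.sum(x1 * y2 - x2 * y1)

print(area_vectorized([0.0, 3.0, 0.0], [0.0, 0.0, 3.0]))  # 4.5
```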

Randy A MacDonald

27 September 2018 at 15:59

⍝ Dyalog APL

fArea←{÷∘2⊢-/{+⌿×/⍵}¨(0 1)(1 0)⊖¨⊂⍵}
fArea↑⌽(0 3)(3 0)(0 0)
4.5

Svyatoslav Pidgorny

28 September 2018 at 02:43

# Julia

function area(shape)
shoe = vcat(shape,shape[1,:]')
area = ( sum(shoe[:,1] .* circshift(shoe[:,2],-1)) - sum(shoe[:,2] .* circshift(shoe[:,1],-1)) ) /2
end

shape = [0 0; 1 -1; 2 -1; 1 0; 2 1; 1 1]
area(shape)

# Now, the assumption is that the points are labeled sequentially. What’s the algorithm for labelling a collection of point coordinates sequentially, so that they encircle an area?

MikeD

28 September 2018 at 13:09

Svyatoslav – finding the convex hull of a collection of vertices isn’t very hard (https://en.wikipedia.org/wiki/Graham_scan) but if you want to include all of the vertices, there’s no unique solution. Order matters when you store them.

Sjoerd Visscher

11 October 2018 at 09:37

Why would you subtract the minimum value? I guess the best would be the average value but even just the first seems better than the minimum.

John

11 October 2018 at 10:14

I doubt it matters what value you subtract. Presumably in application all the vertices are going to be roughly the same magnitude. The biggest potential problem is a shift that is large relative to differences between vertices.

MikeD

11 October 2018 at 11:09

Yes, in production I subtract the first because it saves me another pass through the vertices. Average would be more expensive to compute – another pass through the data and a sum – and be less accurate because you have the same addition of large values.

Anonymous

21 October 2018 at 14:05

The Geopoly extension to SQLite uses (x0-x1)(y0+y1) instead of (x0y1-y0x1). It is easy to see that when the one that SQLite uses is expanded you will have (x0y0+x0y1-x1y0-x1y1), and the extra terms (+x0y0) and (-x1y1) are being canceled out, so mathematically it is the same.
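The equivalence of the two forms can be checked numerically. The sketch below compares the per-edge term (x0-x1)(y0+y1) described in this comment with the shoelace term, using the hexagon from the Julia comment as test data. The function names are hypothetical, and nothing here is taken from SQLite's source.

```python
def area_shoelace(x, y):
    """Sum of x_i*y_{i+1} - y_i*x_{i+1} over all edges, halved."""
    n = len(x)
    return 0.5 * sum(x[i] * y[i + 1] - y[i] * x[i + 1]
                     for i in range(-1, n - 1))

def area_trapezoid(x, y):
    """Sum of (x_i - x_{i+1})*(y_i + y_{i+1}) over all edges, halved."""
    n = len(x)
    return 0.5 * sum((x[i] - x[i + 1]) * (y[i] + y[i + 1])
                     for i in range(-1, n - 1))

# Hexagon with vertices (0,0), (1,-1), (2,-1), (1,0), (2,1), (1,1)
x = [0.0, 1.0, 2.0, 1.0, 2.0, 1.0]
y = [0.0, -1.0, -1.0, 0.0, 1.0, 1.0]

print(area_shoelace(x, y), area_trapezoid(x, y))  # 2.0 2.0
```

The extra x_i*y_i terms cancel only when summed around the whole polygon, so the individual per-edge terms of the two forms differ even though the totals agree.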

Para Parasolian

5 April 2020 at 10:14

Very teachable blog post. Can you say something similar about computing the centroid of a polygon using a similar formula?

John

5 April 2020 at 14:41

Thanks, Para. I just wrote the post you suggested.
https://www.johndcook.com/blog/2020/04/05/center-of-mass-and-vectorization/
