Calculate a convex hull - The QuickHull algorithm
In need of a fast and efficient way of calculating the convex hull of a point set, I stumbled across the QuickHull algorithm, after doing some research. This algorithm is a divide and conquer approach to solve the given problem, which is quite efficient in the average case. The following article gives an introduction into convex hulls and the QuickHull algorithm itself. Furthermore a implementation of it in PHP is presented for download.
Convex Hull
The Wikipedia article about the convex hull draws a quite intuitive picture of this rather mathematical construct:
For planar objects, i.e., lying in the plane, the convex hull may be
easily visualized by imagining an elastic band stretched open to encompass
the given object; when released, it will assume the shape of the required
convex hull.As always a picture says more than thousand words. Therefore the following picture should make the intuition a lot clearer.
A set of points enclosed by a rubber band. (Image created by wikimedia commons user Maksim and released under public domain)As you can see the convex hull in two dimensional space is the minimal polygon, which encloses all of the given points.
QuickHull
Instead of choosing the naive solution to do a simple linear comparison of all points to find the ones which are lying at the outer most edge, QuickHull uses a divide and conquer approach similar to the QuickSort algorithm. Hence its name.
There are quite a few other efficient algorithms to calculate a complex hull, like Graham-Scan or Jarvis-March. I decided to use QuickHull because benchmarks showed it is quite fast in most average cases, as well as its recursive nature allows a fast and yet clean implementation.
QuickHull performs its calculations in a recursive manner on intelligently chosen subsets of the initial set of points.
Initial input
Initial set of pointsThe initial input to the algorithm is an arbitrary set of points. Under average circumstances the algorithm is quite fast. There are however cases, like highly symmetric point sets (points lying on a circumference for example), which make processing a lot slower.
First two points on the convex hull
Min/Max horizontal pointsStarting with the given set of points the first operation done is the calculation of the two maximal points on the horizontal axis. These two points are guaranteed to be part of the convex hull polygon.
Recursively divide
Divide the point set into left and rightNext the line formed by these two points is used to divide the set into two different parts. Everything left from this line is considered one part, everything right of it is considered another one. Both of these parts are processed recursively.
To show the further steps of the algorithm only the left point set is used as an example. The right point set is handled exactly the same way.
Max distance search
Point with maximal distance to the lineTo determine the next point on the convex hull a search for the point with the greatest distance from the dividing line is done. This point, together with the line start and end point forms a triangle.
Point exclusion
Points inside the triangle are ignoredAll points inside this triangle can not be part of the convex hull polygon, as they are obviously lying in the convex hull of the three selected points. Therefore these points can be ignored for every further processing step.
Recursively divide
Divide the point set into left and right againHaving this in mind the recursive processing can take place again. Everything right of the triangle is used as one subset, everything left of it as another one.
Abort condition
Final result, the convex hull polygonAt some point the recursively processed point subset does only contain the start and end point of the dividing line. If this is case this line has to be a segment of the searched hull polygon and the recursion can come to an end.
Practical use
Everything seen so far is quite theoretical. You might ask which real world application does need something like QuickHull. I want to demonstrate the practical use of convex hulls using an example quite similar the what I am currently doing with them.
The example will cover Bézier curves. Bézier curves are used in computer graphics to mathematically define and draw curve segments. It is not that important to understand a Bézier curve in detail. What is however relevant is the fact that such a curve is defined using an arbitrary amount of control points. These control points affect the way the curve is shaped. You can imagine these points to be small centers of gravity which pull the curve into their direction.
Bézier curve and control pointsA simple example of such a curve is illustrated in the picture to the left. It consists of 4 control points. The first and last control point are always the start and end points of the curve. Every control point in between does not need to be on the resulting curve. They only affect the curve in some way.
The application I wrote the QuickHull implementation for needed to do some sort of visibility checking for these curves. I needed to know which curves weren't visible and therefore would not need to be drawn.
Convex hull of control pointsAt this point convex hulls come into account. Bézier curves have a nice feature. They are mathematically proven always surrounded by the convex hull of their control points. Therefore, the fact that the convex hull is not visible implies the curve is not visible. This check can be done a lot faster than calculating intersections with the real curve.
There are other real world applications of convex hulls like for example linear programming, as well.
Implementation in PHP
As I mentioned before, I implemented this algorithm in PHP. Apart from some vector math, the implementation is quite straight forward. It is a single PHP file containing a well documented class, encapsulating all the needed functionality. An array of points needs to be provided as input and the convex hull polygon points, in clockwise order, are outputted after the calculation has been done.
Jason Judge on Sun, 30 Jan 2011 02:16:21 +0100
Thanks! Just what I was looking for to create polygons around UK postcodes. I have up to 1.7 million points to process at any time, so will be looking for a way to apply this algorithm to a series of database queries rather than trying to pull all points into memory in one go.
Link to commentThe diagrams are great in helping to understand how it all works.
Barry Cornelius on Sun, 30 Jan 2011 19:31:52 +0100
I wanted KML files giving the outline of UK postcode areas (e.g., DH). Surprisingly, I couldn't find these on the web. However, you can get from the Ordnance Survey/Royal Mail, the coordinates of the centre of each postcode (e.g., DH1 3LE). I used your code to generate some points giving an outline of a postcode area and produced some KML from these points. I specifically needed it for a web site I have for displaying future running races in North East England, as shown at http://www.northeastraces.com/running/races/explain/pcaYO?arg=pcaYO. However, I thought I might as well do the rest of the UK at the same time as shown at http://www.northeastraces.com/runningstatic/pcas/. Thanks.
Link to commentTrivial Implementation on Fri, 06 May 2011 23:04:46 +0200
In compliance with GPL3, here's my published code:
Link to comment<code>
<?php
include( 'convex_hull.php' );
$cn = mysql_connect('localhost','xxx','yyy');
mysql_select_db('zzz', $cn);
$regions = array();
$result = mysql_query("SELECT * FROM regions WHERE polygon IS NULL");
while ($region = mysql_fetch_assoc($result)) {
$regions[] = $region;
}
mysql_free_result($result);
$max = 0;
$max_name = '';
foreach($regions as &$region) {
$points = array();
$result = mysql_query("SELECT latitude, longitude FROM postal_codes WHERE region_id = " . $region['id']);
while ($point = mysql_fetch_row($result)) {
$points[] = $point;
}
mysql_free_result($result);
echo "Calculating shape of " . $region['name'] . "...\n";
$hull = new ConvexHull($points);
$region['polygon'] = $hull->getHullPoints();
echo "Updating " . $region['name'] . "...\n";
$polygon = mysql_real_escape_string(serialize($region['polygon']) );
mysql_query('UPDATE regions ' .
'SET polygon = \'' . $polygon . '\' ' .
'WHERE id = ' . $region['id']
);
echo 'Polygon is ' . strlen($polygon) . ' characters.';
if (strlen($polygon) > $max) {
$max = strlen($polygon);
$max_name = $region['name'];
}
}
mysql_close($cn);
echo "Largest polygon is " . $max_name . " at " . $max . " characters.\n";
exit;
</code>
Andrea Grimaldi on Tue, 14 Aug 2012 19:07:49 +0200
hi... thanks a lot... i tried your script... but the output seems not to be correct... i provided 10 point and got 1 repeted twice as an output...
Link to commentPeter Koehler on Thu, 18 Oct 2012 07:00:45 +0200
Hello Mr. Westerhoff,
Link to commentthanks for the very useful and quite well working script. i'm using it for calculation of area and perimeter of so called territories (which is represented by a set of points) in a process to determine their compactness. so far i haven't experienced ANY problems as described before. so, WELL DONE!
now i'm asking for your help :) i need to divide a territory (set of points) into so called 'basic areas'. For this, i need to calculate the Voronoi Partition of a set of points. So far, i didn't find a PHP Script for this purpose.
maybe you can help me out with a link?
thanks a lot and thanks for your script! :)
peter koehler
Peter Koehler on Thu, 18 Oct 2012 10:53:19 +0200
Here is the link, where QuickHull is used to demonstrate how compactness is measured:
Link to commenthttp://www.diedreifragezeichen.de/bachelorarbeit/konvexe_huelle.php?figur=sechseck
Jean-François Tremblay on Fri, 30 Nov 2012 14:04:21 +0100
Yesterday I tried your code in PHP4 and realized I wasn't able to get it to work. So I changed the structure of the class so it can be compliant to version 4.
Link to comment<?
class ConvexHull
{
/**
* Set of points provided as input for the calculation.
*
* @var array( array( float, float ) )
*/
var $inputPoints;
/**
* The points of the convex hull after the quickhull algorithm has been
* executed.
*
* @var array( array( float, float ) )
*/
var $hullPoints;
/**
* Construct a new ConvexHull object using the given points as input.
*
* @param array $points
*/
function ConvexHull ( $pofloats )
{
$this->inputPoints = $pofloats;
$this->hullPoints = null;
}
/**
* Return the pofloats of the convex hull.
*
* The pofloats will be ordered to form a clockwise defined polygon path
* around the convex hull.
*
* @return array( array( float, float ) );
*/
function getHullPoints()
{
if ( $this->hullPoints === null )
{
// Initial run with max and min x value points.
// These points are guaranteed to be points of the convex hull
// Initially the points on both sides of the line are processed.
$maxX = $this->getMaxXPoint();
$minX = $this->getMinXPoint();
$this->hullPoints = array_merge(
$this->quickHull( $this->inputPoints, $minX, $maxX ),
$this->quickHull( $this->inputPoints, $maxX, $minX )
);
}
return $this->hullPoints;
}
/**
* Return the points provided as input point set.
*
* @return array( array( float, float ) )
*/
function getInputPoints()
{
return $this->inputPoints;
}
/**
* Find and return the point with the maximal X value.
*
* @return array( float, float )
*/
function getMaxXPoint()
{
$max = $this->inputPoints[0];
foreach( $this->inputPoints as $p )
{
if ( $p[0] > $max[0] )
{
$max = $p;
}
}
return $max;
}
/**
* Find and return the point with the minimal X value.
*
* @return array( float, float )
*/
function getMinXPoint()
{
$min = $this->inputPoints[0];
foreach( $this->inputPoints as $p )
{
if ( $p[0] < $min[0] )
{
$min = $p;
}
}
return $min;
}
/**
* Calculate a distance indicator between the line defined by $start and
* $end and an arbitrary $point.
*
* The value returned is not the correct distance value, but is sufficient
* to determine the point with the maximal distance from the line. The
* returned distance indicator is therefore directly relative to the real
* distance of the point.
*
* The returned distance value may be positive or negative. Positive values
* indicate the point is left of the specified vector, negative values
* indicate it is right of it. Furthermore if the value is zero the point
* is colinear to the line.
*
* @param float $start
* @param float $end
* @param float $point
* @return float
*/
function calculateDistanceIndicator( $start, $end, $point )
{
/*
* The real distance value could be calculated as follows:
*
* Calculate the 2D Pseudo crossproduct of the line vector ($start
* to $end) and the $start to $point vector.
* ((y2*x1) - (x2*y1))
* The result of this is the area of the parallelogram created by the
* two given vectors. The Area formula can be written as follows:
* A = |$start->$end| * h
* Therefore the distance or height is the Area divided by the length
* of the first vector. This division is not done here for performance
* reasons. The length of the line does not change for each of the
* comparison cycles, therefore the resulting value can be used to
* finde the point with the maximal distance without performing the
* division.
*
* Because the result is not returned as an absolute value its
* algebraic sign indicates of the point is right or left of the given
* line.
*/
$vLine = array(
$end[0] - $start[0],
$end[1] - $start[1]
);
$vPoint = array(
$point[0] - $start[0],
$point[1] - $start[1]
);
return ( ( $vPoint[1] * $vLine[0] ) - ( $vPoint[0] * $vLine[1] ) );
}
/**
* Calculate the distance indicator for each given point and return an
* array containing the point and the distance indicator.
*
* Only points left of the line will be returned. Every point right of the
* line or colinear to the line will be deleted.
*
* @param array $start
* @param array $end
* @param array $points
* @return array( array( point, distance ) )
*/
function getPointDistanceIndicators( $start, $end, $points )
{
$resultSet = array();
foreach( $points as $p )
{
if ( ( $distance = $this->calculateDistanceIndicator( $start, $end, $p ) ) > 0 )
{
$resultSet[] = array(
'point' => $p,
'distance' => $distance
);
}
else
{
continue;
}
}
return $resultSet;
}
/**
* Get the point which has the maximum distance from a given line.
*
* @param array $pointDistanceSet
* @return array( float, float )
*/
function getPointWithMaximumDistanceFromLine( $pointDistanceSet )
{
$maxDistance = 0;
$maxPoint = null;
foreach( $pointDistanceSet as $p )
{
if ( $p['distance'] > $maxDistance )
{
$maxDistance = $p['distance'];
$maxPoint = $p['point'];
}
}
return $maxPoint;
}
/**
* Extract the points from a point distance set.
*
* @param array $pointDistanceSet
* @return array
*/
function getPointsFromPointDistanceSet( $pointDistanceSet )
{
$points = array();
foreach( $pointDistanceSet as $p )
{
$points[] = $p['point'];
}
return $points;
}
/**
* Execute a QuickHull run on the given set of points, using the provided
* line as delimiter of the search space.
*
* Only points left of the given line will be analyzed.
*
* @param array $points
* @param array $start
* @param array $end
* @return array
*/
function quickHull( $points, $start, $end )
{
$pointsLeftOfLine = $this->getPointDistanceIndicators( $start, $end, $points );
$newMaximalPoint = $this->getPointWithMaximumDistanceFromLine( $pointsLeftOfLine );
if ( $newMaximalPoint === null )
{
// The current delimiter line is the only one left and therefore a
// segment of the convex hull. Only the end of the line is returned
// to not have points multiple times in the result set.
return array( $end );
}
// The new maximal point creates a triangle together with $start and
// $end, Everything inside this trianlge can be ignored. Everything
// else needs to handled recursively. Because the quickHull invocation
// only handles points left of the line we can simply call it for the
// different line segements to process the right kind of points.
$newPoints = $this->getPointsFromPointDistanceSet( $pointsLeftOfLine );
return array_merge(
$this->quickHull( $newPoints, $start, $newMaximalPoint ),
$this->quickHull( $newPoints, $newMaximalPoint, $end )
);
}
}
?>
Barry Cornelius on Fri, 15 Mar 2013 11:47:41 +0100
On 30th January 2011, I added a comment to this page explaining how I was using your code on my http://www.northeastraces.com web page. I'm now using it for showing the outlines of council areas on my web page about public rights of way in the UK: http://www.rowmaps.com/posters/ukmap.php
Link to commentSo, thank you again.