You are here

MySQL spatial functionality - points of interest around me

This week I was preparing the exercises for our MySQL/MariaDB for Beginners training. One of the exercises of the training is about MySQL spatial (GIS) features. I always tell customers: "With these features you can answer questions like: Give me all points of interest around me!"

Now I wanted to try out how it really works and if it is that easy at all...

To get myself an idea of what I want to do I did a little sketch first:

poi.png

   My position
   Shops
   Restaurants
   Cafes

To do this I needed a table and some data:

CREATE TABLE poi (
  id INT UNSIGNED NOT NULL AUTO_INCREMENT
, name VARCHAR(40)
, type VARCHAR(20)
, sub_type VARCHAR(20)
, pt POINT NOT NULL
, PRIMARY KEY (id)
, SPATIAL INDEX(pt)
) ENGINE=InnoDB;

INSERT INTO poi (name, type, sub_type, pt) VALUES
  ('Shop 1', 'Shop', 'Cloth', Point(2,2))
, ('Cafe 1', 'Cafe', '',  Point(11,2))
, ('Shop 2', 'Shop', 'Cloth',  Point(5,4))
, ('Restaurant 1', 'Restaurant', 'Portugies',  Point(8,7))
, ('Cafe 2', 'Cafe', '',  Point(3,9))
, ('Shop 3', 'Shop', 'Hardware',  Point(11,9))
;

This looks as follows:

SELECT id, CONCAT(ST_X(pt), '/', ST_Y(pt)) AS "X/Y", name, type, sub_type
  FROM poi;
+----+-----------+--------------+------------+-----------+
| id | X/Y       | name         | type       | sub_type  |
+----+-----------+--------------+------------+-----------+
|  1 | 2/2       | Shop 1       | Shop       | Cloth     |
|  2 | 11/2      | Cafe 1       | Cafe       |           |
|  3 | 5/4       | Shop 2       | Shop       | Cloth     |
|  4 | 8/7       | Restaurant 1 | Restaurant | Portugies |
|  5 | 3/9       | Cafe 2       | Cafe       |           |
|  6 | 11/9      | Shop 3       | Shop       | Hardware  |
+----+-----------+--------------+------------+-----------+

Now the question: "Give me all shops in a distance of 4.5 units around me":

SET @hereami = POINT(9,4);

SELECT id, ST_AsText(pt) AS point, name, ROUND(ST_Distance(@hereami, pt), 2) AS distance
  FROM poi
 WHERE ST_Distance(@hereami, pt) < 4.5
   AND type = 'Shop'
 ORDER BY distance ASC
;
+----+------------+--------+----------+
| id | point      | name   | distance |
+----+------------+--------+----------+
|  3 | POINT(5 4) | Shop 2 |     4.00 |
+----+------------+--------+----------+
1 row in set (0.37 sec)

The query execution plan looks like this:

           id: 1
  select_type: SIMPLE
        table: poi
   partitions: NULL
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 650361
     filtered: 10.00
        Extra: Using where; Using filesort

So no use of the spatial index yet. :-(

Reading the MySQL documentation Using Spatial Indexes provides some more information:

The optimizer investigates whether available spatial indexes can be involved in the search for queries that use a function such as MBRContains() or MBRWithin() in the WHERE clause.

So it looks like the optimizer CAN evaluate function covered fields in this specific case. But not with the function ST_Distance I have chosen.

So my WHERE clause must look like: "Give me all points within a polygon spanned 4.5 units around my position..."

I did not find any such function in the short run. So I created a hexagon which is not too far from a circle...

With this hexagon I tried again:

SET @hereami = POINT(9,4);
SET @hexagon = 'POLYGON((9 8.5, 12.897 6.25, 12.897 1.75, 9 -0.5, 5.103 1.75, 5.103 6.25, 9 8.5))';

SELECT id, ST_AsText(pt) AS point, name, ROUND(ST_Distance(@hereami, pt), 2) AS distance
  FROM poi
 WHERE MBRContains(ST_GeomFromText(@hexagon), pt)
   AND ST_Distance(@hereami, pt) < 4.5
   AND type = 'Shop'
 ORDER BY distance ASC
;
Empty set (0.03 sec)

And tadaaah: Damned fast, but the result is not the same! :-( When you look at the graph above it is obvious why: The missing shop is 0.103 units outside of our hexagon search range but within our circle range. So an octagon would have been the better approach...

At least the index is considered now! :-)

           id: 1
  select_type: SIMPLE
        table: poi
   partitions: NULL
         type: range
possible_keys: pt
          key: pt
      key_len: 34
          ref: NULL
         rows: 31356
     filtered: 10.00
        Extra: Using where; Using filesort

For specifying a an "outer" hexagon I was too lazy. So I was specifying a square:

SET @hereami = POINT(9,4);
SET @square = 'POLYGON((4.5 8.5, 13.5 8.5, 13.5 -0.5, 4.5 -0.5, 4.5 8.5))';

SELECT id, ST_AsText(pt) AS point, name, ROUND(ST_Distance(@hereami, pt), 2) AS distance
  FROM poi
 WHERE MBRContains(ST_GeomFromText(@square), pt)
   AND ST_Distance(@hereami, pt) < 4.5
   AND type = 'Shop'
 ORDER BY distance ASC
;
+----+------------+--------+----------+
| id | point      | name   | distance |
+----+------------+--------+----------+
|  3 | POINT(5 4) | Shop 2 |     4.00 |
+----+------------+--------+----------+
1 row in set (0.02 sec)

So, my shop is in the result again now. And even a bit faster!

Now I wanted to find out if this results are any faster than the conventional method with an index on (x) and (y) or (x, y):

SELECT id, ST_AsText(pt) AS point, name, ROUND(ST_Distance(@hereami, pt), 2) AS distance
  FROM poi
 WHERE x >=  4.5 AND x <= 13.5
   AND y >= -0.5 AND y <=  8.5
   AND ST_Distance(@hereami, pt) < 4.5
   AND type = 'Shop'
 ORDER BY distance ASC
;
1 row in set (0.15 sec)

Here the optimizer chooses the index on x. But I think he could do better. So I forced to optimizer to use the index on (x, y):

SELECT id, ST_AsText(pt) AS point, name, ROUND(ST_Distance(@hereami, pt), 2) AS distance
  FROM poi FORCE INDEX (xy)
 WHERE x >=  4.5 AND x <= 13.5
   AND y >= -0.5 AND y <=  8.5
   AND ST_Distance(@hereami, pt) < 4.5
   AND type = 'Shop'
 ORDER BY distance ASC
;
1 row in set (0.03 sec)

           id: 1
  select_type: SIMPLE
        table: poi
   partitions: NULL
         type: range
possible_keys: xy
          key: xy
      key_len: 10
          ref: NULL
         rows: 115592
     filtered: 1.11
        Extra: Using index condition; Using where; Using filesort

Same performance than with the spatial index. So it looks like for this simple task with my data distribution conventional methods do well enough.

No I wanted to try a polygon which comes as close as possible to a circle. This I solved with a MySQL stored function which returns a polygon:/p>

DROP FUNCTION polygon_circle;

delimiter //

CREATE FUNCTION polygon_circle(pX DOUBLE, pY DOUBLE, pDiameter DOUBLE, pPoints SMALLINT UNSIGNED)
-- RETURNS VARCHAR(4096) DETERMINISTIC
RETURNS POLYGON DETERMINISTIC
BEGIN

  DECLARE i SMALLINT UNSIGNED DEFAULT 0;
  DECLARE vSteps SMALLINT UNSIGNED;
  DECLARE vPolygon VARCHAR(4096) DEFAULT '';

  -- Input validation

  IF pPoints < 3 THEN
    RETURN NULL;
  END IF;
  IF pPoints > 360 THEN
    RETURN NULL;
  END IF;
  IF pPoints > 90 THEN
    RETURN NULL;
  END IF;
  if (360 % pPoints) != 0 THEN
    RETURN NULL;
  END IF;

  -- Start

  SET vSteps = 360 / pPoints;

  WHILE i < 360 DO
    SET vPolygon = CONCAT(vPolygon, (pX + (SIN(i * 2 * PI() / 360) * pDiameter)), ' ', (pY + (COS(i * 2 * PI() / 360) * pDiameter)), ', ');
    SET i = i + vSteps;
  END WHILE;

  -- Add first point again
  SET vPolygon = CONCAT("POLYGON((", vPolygon, (pX + (SIN(0 * 2 * PI() / 360) * pDiameter)), " ",  (pY + (COS(0 * 2 * PI() / 360) * pDiameter)), "))");

  -- RETURN vPolygon;
  RETURN ST_GeomFromText(vPolygon);
END;
//

delimiter ;

SELECT ST_AsText(polygon_circle(9, 4, 4.5, 6));
-- SELECT polygon_circle(9, 4, 4.5, 8);

Then calling the query in the same way:

SET @hereami = POINT(9,4);
SELECT id, ST_AsText(pt) AS point, name, ROUND(ST_Distance(@hereami, pt), 2) AS distance
  FROM poi
 WHERE MBRContains(polygon_circle(9, 4, 4.5, 90), pt)
   AND ST_Distance(@hereami, pt) < 4.5
   AND type = 'Shop'
 ORDER BY distance ASC
;
+----+------------+--------+----------+
| id | point      | name   | distance |
+----+------------+--------+----------+
|  3 | POINT(5 4) | Shop 2 |     4.00 |
+----+------------+--------+----------+
1 row in set (0.03 sec)

This seems not to have any significant negative impact on performance.

Results

Test#rowsoperationlatency
Total655360FTS1300 ms
Spatial exact Circle4128FTS520 ms
Spatial inner Hexagon3916range (pt)20 ms
Spatial outer Square4128range (pt)30 ms
Conventional outer Square on (x)4128range (x) or (y)150 ms
Conventional outer Square on (xy)4128range (x,y)30 ms
Spatial good Polygon4128range (pt)30 ms
Taxonomy upgrade extras: