Jorge Sanz ace759590e
Parallel capability to layer functions (#728)
This PR allows queries to be parallelized on recent versions of Postgres. The `PARALLEL SAFE` modifier has been added to the layer functions and a PLPGSQL function to convert strings into number has been replaced.

`PARALLEL SAFE` is a modifier for `CREATE FUNCTION` available since Postgres 9.6, so this change does not break current OpenMapTiles supported database version. More details about this topic [here](https://www.postgresql.org/docs/current/parallel-safety.html) and at the reference documentation for [`CREATE FUNCTION`](https://www.postgresql.org/docs/current/sql-createfunction.html).

### Testing procedure

The procedure to test this was:

* Imported `spain.pbf` in a clean environment
* Dumped the OpenMapTiles database from the Postgres Docker image
* Created a clean Postgres 12 database using the default Docker image
* Installed `postgis` 3 from the default Debian package and `osml10n` 2.5.8 from the repository (`make`, etc.)
* Restored the dump
* Lowered the postgres planner parameters for triggering parallel plans:
```sql
set parallel_setup_cost = 5;
set parallel_tuple_cost = 0.005;
```
* Manually added the `PARALLEL SAFE` modifier to each function involved in layer queries (not on updates or inserting functions).
* For each layer, run a testing query to confirm parallel workers were created, something like this:
```sql
explain analyze 
select * from layer_aerodrome_label(tilebbox(8,128,95),10,null)
union all
select * from layer_aerodrome_label(tilebbox(8,128,97),10,null);
```
* After all the layers were processed and confirmed to start parallel executions, a more complete example was run. This example just retrieves the geometries for all the layers from the same tile but without using any MVT related function.

<details><summary>Testing query</summary>

```sql
-- Using the function layer_landuse
explain analyze 
select geometry from layer_water(tilebbox(14,8020,6178),14)
union all
select geometry from layer_waterway(tilebbox(14,8020,6178),14)
union all
select geometry from layer_landcover(tilebbox(14,8020,6178),14)
union all
select geometry from layer_landuse(tilebbox(14,8020,6178),14)
union all
select geometry from layer_mountain_peak(tilebbox(14,8020,6178),14)
union all
select geometry from layer_park(tilebbox(14,8020,6178),14)
union all
select geometry from layer_boundary(tilebbox(14,8020,6178),14)
union all
select geometry from layer_aeroway(tilebbox(14,8020,6178),14)
union all
select geometry from layer_transportation(tilebbox(14,8020,6178),14)
union all
select geometry from layer_building(tilebbox(14,8020,6178),14)
union all
select geometry from layer_water_name(tilebbox(14,8020,6178),14)
union all
select geometry from layer_transportation_name(tilebbox(14,8020,6178),14)
union all
select geometry from layer_place(tilebbox(14,8020,6178),14)
union all
select geometry from layer_housenumber(tilebbox(14,8020,6178),14)
union all
select geometry from layer_poi(tilebbox(14,8020,6178),14)
union all
select geometry from layer_aerodrome_label(tilebbox(14,8020,6178),14);
```
</details>

You can inspect the execution plan and results on [this page](https://explain.dalibo.com/plan/3z). Also [attaching](https://github.com/openmaptiles/openmaptiles/files/3951822/explain-tile-simple.tar.gz) the query and JSON output for future reference. The website gives a ton of details, but you may want to search for nodes mentioning `workers` or `parallel` like in this area referring to `osm_border` or `osm_aeroway_linestring` entities

![image](https://user-images.githubusercontent.com/188264/70647153-9cac9300-1c48-11ea-96ea-ac7a1e2f4a79.png)

### Next steps

Since the execution plan is not showing a parallel append at the top level, meaning it's not running each layer individually, I want to continue experimenting with parameters and queries to see if it's possible to even parallelize more the request.

I will post my finding here, even no change in the code should happen.


cc. @nyurik

Co-authored-by: Yuri Astrakhan <yuriastrakhan@gmail.com>
2020-01-31 19:36:02 -05:00

52 lines
2.1 KiB
PL/PgSQL

CREATE OR REPLACE FUNCTION brunnel(is_bridge BOOL, is_tunnel BOOL, is_ford BOOL) RETURNS TEXT AS $$
SELECT CASE
WHEN is_bridge THEN 'bridge'
WHEN is_tunnel THEN 'tunnel'
WHEN is_ford THEN 'ford'
END;
$$
LANGUAGE SQL
IMMUTABLE STRICT PARALLEL SAFE;
-- The classes for highways are derived from the classes used in ClearTables
-- https://github.com/ClearTables/ClearTables/blob/master/transportation.lua
CREATE OR REPLACE FUNCTION highway_class(highway TEXT, public_transport TEXT, construction TEXT) RETURNS TEXT AS $$
SELECT CASE
%%FIELD_MAPPING: class %%
END;
$$
LANGUAGE SQL
IMMUTABLE PARALLEL SAFE;
-- The classes for railways are derived from the classes used in ClearTables
-- https://github.com/ClearTables/ClearTables/blob/master/transportation.lua
CREATE OR REPLACE FUNCTION railway_class(railway TEXT) RETURNS TEXT AS $$
SELECT CASE
WHEN railway IN ('rail', 'narrow_gauge', 'preserved', 'funicular') THEN 'rail'
WHEN railway IN ('subway', 'light_rail', 'monorail', 'tram') THEN 'transit'
END;
$$
LANGUAGE SQL
IMMUTABLE STRICT PARALLEL SAFE;
-- Limit service to only the most important values to ensure
-- we always know the values of service
CREATE OR REPLACE FUNCTION service_value(service TEXT) RETURNS TEXT AS $$
SELECT CASE
WHEN service IN ('spur', 'yard', 'siding', 'crossover', 'driveway', 'alley', 'parking_aisle') THEN service
END;
$$
LANGUAGE SQL
IMMUTABLE STRICT PARALLEL SAFE;
-- Limit surface to only the most important values to ensure
-- we always know the values of surface
CREATE OR REPLACE FUNCTION surface_value(surface TEXT) RETURNS TEXT AS $$
SELECT CASE
WHEN surface IN ('paved', 'asphalt', 'cobblestone', 'concrete', 'concrete:lanes', 'concrete:plates', 'metal', 'paving_stones', 'sett', 'unhewn_cobblestone', 'wood') THEN 'paved'
WHEN surface IN ('unpaved', 'compacted', 'dirt', 'earth', 'fine_gravel', 'grass', 'grass_paver', 'gravel', 'gravel_turf', 'ground', 'ice', 'mud', 'pebblestone', 'salt', 'sand', 'snow', 'woodchips') THEN 'unpaved'
END;
$$
LANGUAGE SQL
IMMUTABLE STRICT PARALLEL SAFE;