[gdal-dev] Check validity of geometries before writing them into vector tiles?

jratike80

Hi,

 

I admit that my test was somewhat lunatic, but I took a random dataset from Finland with 68101 polygons and converted it into MVT with default settings, which means that minzoom was 0. As a result, 12196 of the source polygons were written into the zoom level 0 protobuf tile (in the EPSG:3067 gridset), and none of them is valid. Most polygons have too few points, and those that have enough points have self-intersections.

 

Perhaps there should be some sort of geometry validator in the writer chain?

 

-Jukka Rahkonen-


_______________________________________________
gdal-dev mailing list
[hidden email]
https://lists.osgeo.org/mailman/listinfo/gdal-dev

Re: Check validity of geometries before writing them into vector tiles?

Even Rouault-2

On Wednesday 14 February 2018 15:10:57 CET, Rahkonen Jukka (MML) wrote:
> Hi,
>
> I admit that my test was somewhat lunatic but I took some random dataset
> from Finland with 68101 polygons and converted data into MVT with default
> settings which means that minzoom was 0. As a result 12196 of the source
> polygons were written into the 0-level protobuf tile (in EPSG:3067 gridset)
> and none of the polygons is valid. Most polygons have too few points and
> those which have enough points have self-intersections.
>
> Perhaps there should be some sort of geometry validator in the writer chain?

 

There is some geometry validation, but it is not black and white. In my tests with the ne_10m_admin_1_states_provinces dataset, when I initially implemented very strict geometry validation, a number of polygons were dropped completely, so I ended up implementing a more relaxed logic. For an outer ring: if, after conversion to integer coordinates, it is not valid (tested with GEOSIsValid()), then do buffer(tolerance) followed by buffer(-tolerance) followed by simplifypreservetopology(tolerance), where tolerance is 2 * tile_dim_in_crs_unit / EXTENT, followed by a new round of conversion to integer coordinates. If it is still not valid, keep it.

For an inner ring: drop it if the polygon that results from including it in the outer ring is not valid.
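
As a rough worked example of the tolerance formula above, the following Python sketch evaluates 2 * tile_dim_in_crs_unit / EXTENT at a few zoom levels. The zoom level 0 tile dimension of 2097152 CRS units is an assumption (a plausible value for an EPSG:3067 gridset, not stated in this thread), as is the halving of the tile dimension at each zoom level. The function name is hypothetical and is not part of the GDAL API.

```python
def repair_tolerance(tile_dim_zoom0: float, zoom: int, extent: int = 4096) -> float:
    """Tolerance = 2 * tile_dim_in_crs_unit / EXTENT, as described in the message."""
    # Assumption: each zoom level halves the tile dimension of the previous one.
    tile_dim_in_crs_unit = tile_dim_zoom0 / (2 ** zoom)
    return 2 * tile_dim_in_crs_unit / extent


# Assumed zoom level 0 tile dimension (CRS units); not taken from the thread.
ASSUMED_TILE_DIM_ZOOM0 = 2097152.0

for zoom in (0, 5, 10):
    print(zoom, repair_tolerance(ASSUMED_TILE_DIM_ZOOM0, zoom))
```

Under these assumptions the tolerance at zoom level 0 is 1024 CRS units, so the buffer and simplification steps can remove any detail smaller than roughly a kilometre.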

 

Perhaps you could play with the SIMPLIFICATION and SIMPLIFICATION_MAX_ZOOM options?

 

Perhaps you should also use an already simplified layer for the lowest zoom levels (see the CONF option).
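
A minimal sketch of how these options might be combined on an ogr2ogr command line, built here as a Python argument list. Everything specific in it is hypothetical: the file names, the layer names ("parcels" and a pre-simplified "parcels_generalized"), the zoom ranges and the simplification values. The CONF keys used ("target_name", "minzoom", "maxzoom") should be checked against the MVT driver documentation for your GDAL version.

```python
import json


def build_mvt_command(src: str, out_dir: str) -> list:
    """Build an ogr2ogr argument list that writes MVT tiles (sketch only)."""
    # Hypothetical layers: the generalized copy feeds the low zoom levels,
    # the full-resolution layer feeds the higher ones. Both are merged into
    # one target layer named "parcels" in the tiles.
    conf = {
        "parcels_generalized": {"target_name": "parcels", "minzoom": 0, "maxzoom": 6},
        "parcels": {"target_name": "parcels", "minzoom": 7, "maxzoom": 12},
    }
    return [
        "ogr2ogr", "-f", "MVT", out_dir, src,
        "-dsco", "MINZOOM=0",
        "-dsco", "MAXZOOM=12",
        # Placeholder values; tune them for the dataset at hand.
        "-dsco", "SIMPLIFICATION=2.0",
        "-dsco", "SIMPLIFICATION_MAX_ZOOM=0.5",
        "-dsco", "CONF=" + json.dumps(conf),
    ]


cmd = build_mvt_command("input.gpkg", "tiles_out")
print(" ".join(cmd))
```

The list can be passed to subprocess.run() as is, which avoids shell quoting problems with the JSON value.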

 

Are you sure you get polygons with fewer than 4 points? Normally they should be discarded.

 

Even

 

--

Spatialys - Geospatial professional services

http://www.spatialys.com



Re: Check validity of geometries before writing them into vector tiles?

jratike80
Even Rouault wrote:

> Perhaps you could play with the SIMPLIFICATION and SIMPLIFICATION_MAX_ZOOM options ?
Sure. As I wrote, I admit that my test did not make much sense, but trying things before reading the manual sometimes reveals something interesting.

> Perhaps you should also use an already simplified layer for the lowest zoom level (see the CONF option)
Yes, for sure: a coordinate space of 4096x4096 is far too small for this dataset.

> Are you sure you get polygons with less than 4 points ? Normally they should be discarded.
Quite sure, yes, judging by what ST_IsValidReason from the SQLite dialect prints:
ST_IsValidReason(geometry) (String) = Invalid: Toxic Geometry ... too few points

There are other variants of invalid geometries, and actually a few valid ones as well. Here is one example with an invalid component:

  MULTIPOLYGON (((516384 6815744,516384 6815744,532768 6815744,516384 6815744)),
((516384 6815744,516384 6815744)),((516384 6815744,516384 6815744)))
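
A quick structural check of the rings in the example above, as a Python sketch. It only tests the basic ring requirements (at least 4 points, closed, at least 3 distinct vertices); it is not a replacement for a full validity test such as ST_IsValidReason, and it does not detect self-intersections. The function name and the wording of the reasons are made up for illustration.

```python
def ring_problem(ring):
    """Return a short reason if the ring cannot form a polygon ring, else None."""
    if len(ring) < 4:
        return "too few points"
    if ring[0] != ring[-1]:
        return "not closed"
    if len(set(ring)) < 3:
        return "fewer than 3 distinct vertices"
    return None


# The three rings of the MULTIPOLYGON quoted above, one ring per polygon.
rings = [
    [(516384, 6815744), (516384, 6815744), (532768, 6815744), (516384, 6815744)],
    [(516384, 6815744), (516384, 6815744)],
    [(516384, 6815744), (516384, 6815744)],
]

for index, ring in enumerate(rings):
    print(index, ring_problem(ring))
```

The first ring has the required 4 points but only 2 distinct vertices, so it encloses no area; the other two rings have only 2 points each.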

My zoom level 0 tile is here: http://www.latuviitta.org/downloads/0.pbf

-Jukka Rahkonen-

Re: Check validity of geometries before writing them into vector tiles?

flippmoke
All,

Geometry validity with polygons is really hard. The only solution we found at Mapbox to fix this properly for the vector tiles we create was to modify an existing algorithm so that it solves all the potential problems of geometry validity. The result is https://github.com/mapbox/wagyu/, which we now use in our vector tile creation scripts. Unfortunately, it does not perform well on polygons with extremely high numbers of points. However, for vector tiles this should typically not be an issue if simplification is done prior to tile creation. If support is needed around this problem, I am more than willing to lend my expertise.

Thanks,

Blake Thompson


Re: Check validity of geometries before writing them into vector tiles?

Even Rouault-2

On Wednesday 14 February 2018 15:53:38 CET, Rahkonen Jukka (MML) wrote:
> Even Rouault wrote:
> > Perhaps you could play with the SIMPLIFICATION and SIMPLIFICATION_MAX_ZOOM
> > options ?
> Sure, as I wrote I admit that my test did not make much sense, but trying
> things before reading the manual sometimes reveals something interesting.
> > Perhaps you should also use an already simplified layer for the lowest
> > zoom level (see the CONF option)
> For sure yes, coordinate space of 4096x4096 is far too small for this
> dataset.
> > Are you sure you get polygons with less than 4 points ? Normally they
> > should be discarded.
> Quite sure yes by looking at what the ST_IsValidReason from the SQLite
> dialect prints  ST_IsValidReason(geometry) (String) = Invalid: Toxic
> Geometry ... too few points
>
> There are other variants of invalid geometries and actually a few valid as
> well. Here is one example with an invalid component
>
> MULTIPOLYGON (((516384 6815744,516384 6815744,532768 6815744,516384
> 6815744)), ((516384 6815744,516384 6815744)),((516384 6815744,516384
> 6815744)))
>
> My zero tile is here http://www.latuviitta.org/downloads/0.pbf

 

OK, I reproduced the issue with the input dataset Jukka provided. The issue was that for zoom level 0 the maximum tile size limit of 500 KB was reached, and the algorithm that reduces coordinate precision in the hope of decreasing the tile size did not validate geometries as well as the normal code path does. This is now fixed for most cases. The remaining invalid cases are validity issues specific to multipolygons: each polygon should be valid, but duplicate rings or intersecting polygons might still appear. This could probably be fixed by unioning the polygons and re-encoding the resulting polygon.
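
To illustrate why reducing coordinate precision makes small polygons degenerate, here is a Python sketch that snaps a ring to the tile grid at several grid resolutions and drops consecutive duplicate points. It is not the GDAL implementation. The tile origin and dimension are assumed values for an EPSG:3067 zoom level 0 tile, the 2 km parcel is invented, and the reduced extents (1024, 128) are arbitrary examples; the thread does not say how far GDAL reduces precision.

```python
def snap_ring(ring, origin_x, origin_y, tile_dim, extent):
    """Quantize a ring to integer tile coordinates, dropping consecutive duplicates."""
    step = tile_dim / extent  # size of one tile coordinate unit in CRS units
    snapped = []
    for x, y in ring:
        # Tile y axis points down from the upper-left origin.
        point = (round((x - origin_x) / step), round((origin_y - y) / step))
        if not snapped or snapped[-1] != point:
            snapped.append(point)
    return snapped


# Assumed zoom level 0 tile: upper-left corner and dimension in metres.
ORIGIN_X, ORIGIN_Y, TILE_DIM = -548576.0, 8388608.0, 2097152.0

# Hypothetical square parcel, 2 km on each side.
parcel = [(520000, 6816100), (522000, 6816100), (522000, 6818100),
          (520000, 6818100), (520000, 6816100)]

for extent in (4096, 1024, 128):
    ring = snap_ring(parcel, ORIGIN_X, ORIGIN_Y, TILE_DIM, extent)
    print(extent, TILE_DIM / extent, len(ring))
```

With these assumed values the parcel keeps its 5 points at extents 4096 and 1024, but at extent 128 (a grid step of 16384 m) every vertex lands on the same grid point, so the ring collapses to a single point and has to be discarded or repaired before it is written.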

 

Even

 
