Possible speed improvement for overlay operations

classic Classic list List threaded Threaded
30 messages Options
12
Reply | Threaded
Open this post in threaded view
|

Possible speed improvement for overlay operations

Paul van der Linden
I'm currently getting familiar with the GEOS code, especially to see if I can improve the performance, and I saw a potential improvement from a O(n^2) piece to a O(n*m) with according to my findings a rather small m.

It's in the combination of SimpleMCSweepLineIntersector::computeIntersections and SimpleMCSweepLineIntersector::processOverlaps.

For what I can see, (at least in my test case of 2 ordinary polygons without holes) in processOverlaps, the ev0->edgeSet is not equal to nullptr and for the rest all events[i]->edgeSet are the same.

So the condition on https://github.com/libgeos/geos/blob/8bc1ea0f8019bcb83c60126e7883ae428ea7c677/src/geomgraph/index/SimpleMCSweepLineIntersector.cpp#L164 is never true and this whole method is doing nothing but burning cpu cycles.

My proposal involves changes in SimpleMCSweepLineIntersector::computeIntersections:

One solution is to simply test if all edgeSets are equal and if so, skip the whole processOverlaps, but that wouldn't help ofcourse in all situations.
The other solution that I thought of, was to create lists for each different edgesets and only call processOverlaps for edgeSets other than the current one (ev @ https://github.com/libgeos/geos/blob/8bc1ea0f8019bcb83c60126e7883ae428ea7c677/src/geomgraph/index/SimpleMCSweepLineIntersector.cpp#L136).

The latter one probably goes wrong, because I saw that the events are sorted here https://github.com/libgeos/geos/blob/8bc1ea0f8019bcb83c60126e7883ae428ea7c677/src/geomgraph/index/SimpleMCSweepLineIntersector.cpp#L113 and probably for a good reason.
Splitting that list into buckets of equal edgeset messes up that order.

Third solution is to come up with some kind of datastructure to get all events without the ones having edgeset equal to ev, but in the correct order...

Anyone any ideas on that?

Sidenode: I found that an easy way of testing performance is to get a couple of very large polygons and put them through the desired algo (intersect in this case).
Whenever your patience has run out, just press break in the debugger and see where it stopped...


_______________________________________________
geos-devel mailing list
[hidden email]
https://lists.osgeo.org/mailman/listinfo/geos-devel
Reply | Threaded
Open this post in threaded view
|

Re: Possible speed improvement for overlay operations

Martin Davis-3
I don't think this is an accurate analysis of how the Overlay noding works.

In the current JTS/GEOS Overlay algorithm the noding phase works like this:

1. Compute self-intersection nodes for Geometry A
2. Compute self-intersection nodes for Geometry B
3. Compute intersection nodes for Geometry A against Geometry B

In steps 1 and 2 the noder (MCSweepLineIntersector) has to process all overlaps of all chains, but each step is only processing chains for a single geometry.  Note that this process is NOT O(n^2), it is much more performant due to the use of a sweep-line algorithm.   (If it was O(n^2) overlay would be too slow to be usable..)

In step 3 the noder processes all chains from both geometries, but due to the previous self-noding it only needs to process overlaps between chains from different geometries.

Other ideas about performance improvements are welcome.  For instance, I have had some ideas about only noding geometry in the region of actual overlap in the case of an intersection computation.  That probably has implications for robustness, however.  Needs some research...



On Thu, Nov 29, 2018 at 10:18 AM Paul van der Linden <[hidden email]> wrote:
I'm currently getting familiar with the GEOS code, especially to see if I can improve the performance, and I saw a potential improvement from a O(n^2) piece to a O(n*m) with according to my findings a rather small m.

It's in the combination of SimpleMCSweepLineIntersector::computeIntersections and SimpleMCSweepLineIntersector::processOverlaps.

For what I can see, (at least in my test case of 2 ordinary polygons without holes) in processOverlaps, the ev0->edgeSet is not equal to nullptr and for the rest all events[i]->edgeSet are the same.

So the condition on https://github.com/libgeos/geos/blob/8bc1ea0f8019bcb83c60126e7883ae428ea7c677/src/geomgraph/index/SimpleMCSweepLineIntersector.cpp#L164 is never true and this whole method is doing nothing but burning cpu cycles.

My proposal involves changes in SimpleMCSweepLineIntersector::computeIntersections:

One solution is to simply test if all edgeSets are equal and if so, skip the whole processOverlaps, but that wouldn't help ofcourse in all situations.
The other solution that I thought of, was to create lists for each different edgesets and only call processOverlaps for edgeSets other than the current one (ev @ https://github.com/libgeos/geos/blob/8bc1ea0f8019bcb83c60126e7883ae428ea7c677/src/geomgraph/index/SimpleMCSweepLineIntersector.cpp#L136).

The latter one probably goes wrong, because I saw that the events are sorted here https://github.com/libgeos/geos/blob/8bc1ea0f8019bcb83c60126e7883ae428ea7c677/src/geomgraph/index/SimpleMCSweepLineIntersector.cpp#L113 and probably for a good reason.
Splitting that list into buckets of equal edgeset messes up that order.

Third solution is to come up with some kind of datastructure to get all events without the ones having edgeset equal to ev, but in the correct order...

Anyone any ideas on that?

Sidenode: I found that an easy way of testing performance is to get a couple of very large polygons and put them through the desired algo (intersect in this case).
Whenever your patience has run out, just press break in the debugger and see where it stopped...

_______________________________________________
geos-devel mailing list
[hidden email]
https://lists.osgeo.org/mailman/listinfo/geos-devel

_______________________________________________
geos-devel mailing list
[hidden email]
https://lists.osgeo.org/mailman/listinfo/geos-devel
Reply | Threaded
Open this post in threaded view
|

Re: Possible speed improvement for overlay operations

Darafei "Komяpa" Praliaskouski
Other ideas about performance improvements are welcome.  For instance, I have had some ideas about only noding geometry in the region of actual overlap in the case of an intersection computation.  That probably has implications for robustness, however.  Needs some research...


I tried to implement this speed up in PostGIS via box clipping.
Current GEOS recangle clipping is unfortunately not robust, but helps quite a lot when not crashing.

I tried redoing clipping by doing just O(N) clamping to the box. Result retains many required properties (ie, raycasting will give correct inside/outside result for it, area is okay) but has holes touching edges and edges touching themselves. GEOSIntersection can't handle that. 

Is there a simple way to recover the boundary for such geometry to feed further to GEOSIntersection? Buffer(,0) does not help in many cases, ST_MakeValid too.

Can we make GEOSIntersection handle such invalid input, or is it completely incompatible with monotone chain conception?
--
Darafei Praliaskouski
Support me: http://patreon.com/komzpa

_______________________________________________
geos-devel mailing list
[hidden email]
https://lists.osgeo.org/mailman/listinfo/geos-devel
Reply | Threaded
Open this post in threaded view
|

Re: Possible speed improvement for overlay operations

Martin Davis-3
Right, I remember you reporting out on that experiment. 

At the moment I can't think of a *simple* way to convert the "box-clamped" geometry to a valid geometry for use in Overlay.   But there might be a clever way to do this - it's an interesting question.  (One issue as you probably realize is that the clamped geometry can have a LOT of "collapsed" edges, including one which may wrap multiple times around the box, in the worst case...)  But maybe it's better just to implement a robust clipping algorithm - there's a few implementations around to use as inspiration.

Collapsed linework causes problems in the topology-building phase of overlay - the noding doesn't care.   I've been wrestling with this a bit recently, to try and get snap-rounding to work.  Unfortunately it's not easy.  One issue is that the GeometryGraph code is quite complex and supports both overlay and relate computation.  I think these are probably going to have to split apart, and then both can be optimized for their different usages.  

I'm still optimistic that limiting noding to the overlap area (i.e. the rectangle in the case of clipping) might work and be faster.  But I might be missing something key that will make this a non-starter.  


On Thu, Nov 29, 2018 at 12:20 PM Darafei "Komяpa" Praliaskouski <[hidden email]> wrote:
Other ideas about performance improvements are welcome.  For instance, I have had some ideas about only noding geometry in the region of actual overlap in the case of an intersection computation.  That probably has implications for robustness, however.  Needs some research...


I tried to implement this speed up in PostGIS via box clipping.
Current GEOS recangle clipping is unfortunately not robust, but helps quite a lot when not crashing.

I tried redoing clipping by doing just O(N) clamping to the box. Result retains many required properties (ie, raycasting will give correct inside/outside result for it, area is okay) but has holes touching edges and edges touching themselves. GEOSIntersection can't handle that. 

Is there a simple way to recover the boundary for such geometry to feed further to GEOSIntersection? Buffer(,0) does not help in many cases, ST_MakeValid too.

Can we make GEOSIntersection handle such invalid input, or is it completely incompatible with monotone chain conception?
--
Darafei Praliaskouski
Support me: http://patreon.com/komzpa
_______________________________________________
geos-devel mailing list
[hidden email]
https://lists.osgeo.org/mailman/listinfo/geos-devel

_______________________________________________
geos-devel mailing list
[hidden email]
https://lists.osgeo.org/mailman/listinfo/geos-devel
Reply | Threaded
Open this post in threaded view
|

Re: Possible speed improvement for overlay operations

Darafei "Komяpa" Praliaskouski


On Fri, Nov 30, 2018 at 12:05 AM Martin Davis <[hidden email]> wrote:
Right, I remember you reporting out on that experiment. 

At the moment I can't think of a *simple* way to convert the "box-clamped" geometry to a valid geometry for use in Overlay.   But there might be a clever way to do this - it's an interesting question.  (One issue as you probably realize is that the clamped geometry can have a LOT of "collapsed" edges, including one which may wrap multiple times around the box, in the worst case...)  But maybe it's better just to implement a robust clipping algorithm - there's a few implementations around to use as inspiration.

Can you share one?

I used one from Mapnik, and it has comments on how AGG is tolerant to "box-clamping" so they don't bother.

Fixing the clipped one, even slowly, will be beneficial, as it's used for interval clipping in Z/M dimensions which is not available in GEOSIntersects.
 

Collapsed linework causes problems in the topology-building phase of overlay - the noding doesn't care.   I've been wrestling with this a bit recently, to try and get snap-rounding to work.  Unfortunately it's not easy.  One issue is that the GeometryGraph code is quite complex and supports both overlay and relate computation.  I think these are probably going to have to split apart, and then both can be optimized for their different usages.  

May it happen that snap-rounding for axis aligned overlaps is more trivial than one for generic overlap?
 

I'm still optimistic that limiting noding to the overlap area (i.e. the rectangle in the case of clipping) might work and be faster.  But I might be missing something key that will make this a non-starter.  


On Thu, Nov 29, 2018 at 12:20 PM Darafei "Komяpa" Praliaskouski <[hidden email]> wrote:
Other ideas about performance improvements are welcome.  For instance, I have had some ideas about only noding geometry in the region of actual overlap in the case of an intersection computation.  That probably has implications for robustness, however.  Needs some research...


I tried to implement this speed up in PostGIS via box clipping.
Current GEOS recangle clipping is unfortunately not robust, but helps quite a lot when not crashing.

I tried redoing clipping by doing just O(N) clamping to the box. Result retains many required properties (ie, raycasting will give correct inside/outside result for it, area is okay) but has holes touching edges and edges touching themselves. GEOSIntersection can't handle that. 

Is there a simple way to recover the boundary for such geometry to feed further to GEOSIntersection? Buffer(,0) does not help in many cases, ST_MakeValid too.

Can we make GEOSIntersection handle such invalid input, or is it completely incompatible with monotone chain conception?
--
Darafei Praliaskouski
Support me: http://patreon.com/komzpa
_______________________________________________
geos-devel mailing list
[hidden email]
https://lists.osgeo.org/mailman/listinfo/geos-devel
_______________________________________________
geos-devel mailing list
[hidden email]
https://lists.osgeo.org/mailman/listinfo/geos-devel


--
Darafei Praliaskouski

_______________________________________________
geos-devel mailing list
[hidden email]
https://lists.osgeo.org/mailman/listinfo/geos-devel
Reply | Threaded
Open this post in threaded view
|

Re: Possible speed improvement for overlay operations

Martin Davis-3

On Fri, Nov 30, 2018 at 4:49 AM Darafei "Komяpa" Praliaskouski <[hidden email]> wrote:


On Fri, Nov 30, 2018 at 12:05 AM Martin Davis <[hidden email]> wrote:


Can you share one?
 
But I see comments that indicate it may not produce valid geometry either, just something good enough for rendering.  But the basic approach (Cohen-Sutherland / Liang-Barsky) might be possible to make valid.  
There's a JTS-adapted port here, so you can try it in TestBuilder:  https://github.com/dr-jts/jts-ports
 

Collapsed linework causes problems in the topology-building phase of overlay - the noding doesn't care.   I've been wrestling with this a bit recently, to try and get snap-rounding to work.  Unfortunately it's not easy.  One issue is that the GeometryGraph code is quite complex and supports both overlay and relate computation.  I think these are probably going to have to split apart, and then both can be optimized for their different usages.  

May it happen that snap-rounding for axis aligned overlaps is more trivial than one for generic overlap?

Maybe... although snap-rounding in theory is a global operation that needs to be applied to the entire line arrangement, to ensure that no lines result in further intersections after adjustment.  But as per previous comment, maybe it's possible to restrict it to just the rectangle area. That won't make it more "trivial", however, since a geometry can be highly complex inside the rectangle.  It should be faster though.
 

_______________________________________________
geos-devel mailing list
[hidden email]
https://lists.osgeo.org/mailman/listinfo/geos-devel
Reply | Threaded
Open this post in threaded view
|

Re: Possible speed improvement for overlay operations

Paul van der Linden
In reply to this post by Paul van der Linden
Well, the part I'm referring to is the 1. and 2 (Compute self-intersection nodes for A and B) as far as I understand.

counter's value at the end of the computeintersections was n*n, and breakpoint never hit, so a lot of checks were done without any useful work.

Not really sure how "representative" my polygons are, but they are squares with zig-zag boundaries
/\/\/\/\/\/\/\/\
\              /
/              \
\              /
/              \
\              /
\/\/\/\/\/\/\/

don't know if that will come out right, but here's the snippet I used to check this:
Geometry* getlargeGeo(double x, double y)
{
    int i;
    const int n = 1000;
    GeometryFactory::Ptr factory = GeometryFactory::create();
    CoordinateSequence* coords = factory->getCoordinateSequenceFactory()->create((size_t)0);
    for (i = 0; i < n; i++)
        coords->add(Coordinate(x + i * 10, y + (i % 2) * 3));
    Coordinate last = coords->getAt(coords->size() - 1);
    for (i = 0; i < n; i++)
        coords->add(Coordinate(last.x + (i % 2) * 3, last.y + i * 10));
    last = coords->getAt(coords->size() - 1);
    for (i = 0; i < n; i++)
        coords->add(Coordinate(last.x - i * 10, last.y + (i % 2) * 3));
    last = coords->getAt(coords->size() - 1);
    for (i = 0; i < n; i++)
        coords->add(Coordinate(last.x + (i % 2) * 3, last.y - i * 10));
    coords->add(coords->getAt(0));

    LinearRing *shell = factory->createLinearRing(coords);
    return factory->createPolygon(shell, NULL);
};

...

    Geometry* lg1 = getlargeGeo(0, 0);
    Geometry* lg2 = getlargeGeo(100, 50);
    Geometry* res3 = lg1->intersection(lg2);



_______________________________________________
geos-devel mailing list
[hidden email]
https://lists.osgeo.org/mailman/listinfo/geos-devel
Reply | Threaded
Open this post in threaded view
|

Re: Possible speed improvement for overlay operations

Martin Davis-3
Those polygons are a worst-case scenario for MonotoneChains, and they are a sub-optimal case for sweepline as well.  So maybe not surprising you are seeing a n^2 count.  

There doesn't seem much point in working to optimize such an artificial case, especially if it will impact code complexity or performance for the "average" case.   But it's hard to speculate without a working demonstration of a potential improvement.

On Fri, Nov 30, 2018 at 1:13 PM Paul van der Linden <[hidden email]> wrote:
Well, the part I'm referring to is the 1. and 2 (Compute self-intersection nodes for A and B) as far as I understand.

counter's value at the end of the computeintersections was n*n, and breakpoint never hit, so a lot of checks were done without any useful work.

Not really sure how "representative" my polygons are, but they are squares with zig-zag boundaries
/\/\/\/\/\/\/\/\
\              /
/              \
\              /
/              \
\              /
\/\/\/\/\/\/\/



_______________________________________________
geos-devel mailing list
[hidden email]
https://lists.osgeo.org/mailman/listinfo/geos-devel
Reply | Threaded
Open this post in threaded view
|

Re: Possible speed improvement for overlay operations

Martin Davis-3
Some more thoughts:

- it is always good to validate that algorithms are working as expected.  What happens if you use best-case geometry as input?  E.g. a square with dense but flat edges?

- if the overlay algorithms really were egregiously O(n^2) I'm fairly sure that would have been noticed by now.  n^2 is REALLY slow on large inputs.  You can easily test for n^2 by creating a sequence of increasingly densified inputs.

- One way to improve performance for the worst-case geometries you are using is to switch from a sweep-line to using a true spatial index (such as the STRtree).  The MCIndexNoder uses this approach. One downside of this is that the order of processing overlaps becomes somewhat non-deterministic, which has been problematic for some uses.

On Fri, Nov 30, 2018 at 1:25 PM Martin Davis <[hidden email]> wrote:
Those polygons are a worst-case scenario for MonotoneChains, and they are a sub-optimal case for sweepline as well.  So maybe not surprising you are seeing a n^2 count.  

There doesn't seem much point in working to optimize such an artificial case, especially if it will impact code complexity or performance for the "average" case.   But it's hard to speculate without a working demonstration of a potential improvement.

On Fri, Nov 30, 2018 at 1:13 PM Paul van der Linden <[hidden email]> wrote:
Well, the part I'm referring to is the 1. and 2 (Compute self-intersection nodes for A and B) as far as I understand.

counter's value at the end of the computeintersections was n*n, and breakpoint never hit, so a lot of checks were done without any useful work.

Not really sure how "representative" my polygons are, but they are squares with zig-zag boundaries
/\/\/\/\/\/\/\/\
\              /
/              \
\              /
/              \
\              /
\/\/\/\/\/\/\/



_______________________________________________
geos-devel mailing list
[hidden email]
https://lists.osgeo.org/mailman/listinfo/geos-devel
Reply | Threaded
Open this post in threaded view
|

Re: Possible speed improvement for overlay operations

Martin Davis-3
In reply to this post by Martin Davis-3
I did some quick metrics, on real-world geometries (outlines of countries, flipped in X and then intersected.  This exercises the sweepline a lot, and the MonoChains are utilized to their potential).

France : 921 vertices   Overlaps = 3,329
Norway: 4155 vertices.   Overlaps = 13,344
Russia: 14,747 vertices   Overlaps = 36,124

So the overlap count is much less than n^2, as expected.  And the overlap count is going up (very)approximately) linearly with the vertex count.

Note that the Monotone Chains also provide efficiency in determining actual segment intersections, since they enable binary search.

On Fri, Nov 30, 2018 at 1:25 PM Martin Davis <[hidden email]> wrote:
Those polygons are a worst-case scenario for MonotoneChains, and they are a sub-optimal case for sweepline as well.  So maybe not surprising you are seeing a n^2 count.  

There doesn't seem much point in working to optimize such an artificial case, especially if it will impact code complexity or performance for the "average" case.   But it's hard to speculate without a working demonstration of a potential improvement.

On Fri, Nov 30, 2018 at 1:13 PM Paul van der Linden <[hidden email]> wrote:
Well, the part I'm referring to is the 1. and 2 (Compute self-intersection nodes for A and B) as far as I understand.

counter's value at the end of the computeintersections was n*n, and breakpoint never hit, so a lot of checks were done without any useful work.

Not really sure how "representative" my polygons are, but they are squares with zig-zag boundaries
/\/\/\/\/\/\/\/\
\              /
/              \
\              /
/              \
\              /
\/\/\/\/\/\/\/



_______________________________________________
geos-devel mailing list
[hidden email]
https://lists.osgeo.org/mailman/listinfo/geos-devel
Reply | Threaded
Open this post in threaded view
|

Re: Possible speed improvement for overlay operations

Darafei "Komяpa" Praliaskouski
In reply to this post by Martin Davis-3
This is interesting.

In PostGIS 2.5 I changed ST_Subdivide to accommodate similar cases. The splits it's going to produce is a bunch of triangles on the edge and a bunch of box-shaped geometries in the meat of polygon. Putting these in a tree (new year, boxes in the tree) is probably going to help perform several smaller overlays faster.

What are the properties that help in such a split, and what are the ones that don't? For point-in-polygon it's definitely better to have large squares and get the outer void defined by the shape of the tree and not the shape of parts. Will another kind of algorithm for splitting in some predefined dimension help? Like, "if you see shape going up and then down, it's a good split point and it's more preferred than one close to center."

We can probably optimize further if shape is split into all-convex parts.



сб, 1 дек. 2018 г. в 00:25, Martin Davis <[hidden email]>:
Those polygons are a worst-case scenario for MonotoneChains, and they are a sub-optimal case for sweepline as well.  So maybe not surprising you are seeing a n^2 count.  

There doesn't seem much point in working to optimize such an artificial case, especially if it will impact code complexity or performance for the "average" case.   But it's hard to speculate without a working demonstration of a potential improvement.

On Fri, Nov 30, 2018 at 1:13 PM Paul van der Linden <[hidden email]> wrote:
Well, the part I'm referring to is the 1. and 2 (Compute self-intersection nodes for A and B) as far as I understand.

counter's value at the end of the computeintersections was n*n, and breakpoint never hit, so a lot of checks were done without any useful work.

Not really sure how "representative" my polygons are, but they are squares with zig-zag boundaries
/\/\/\/\/\/\/\/\
\              /
/              \
\              /
/              \
\              /
\/\/\/\/\/\/\/


_______________________________________________
geos-devel mailing list
[hidden email]
https://lists.osgeo.org/mailman/listinfo/geos-devel
--
Darafei Praliaskouski
Support me: http://patreon.com/komzpa

_______________________________________________
geos-devel mailing list
[hidden email]
https://lists.osgeo.org/mailman/listinfo/geos-devel
Reply | Threaded
Open this post in threaded view
|

Re: Possible speed improvement for overlay operations

Paul van der Linden
In reply to this post by Paul van der Linden
Ok, you're right. This was just a random example I used to see if I can spot some improvements and get a bit of a feel about the algorithms used.

Ultimately I'm trying to figure out if the overlay-ops (especially intersection and union) can be reworked to use prepared geometries, but will make sure any tests I do involve real-world polygons

_______________________________________________
geos-devel mailing list
[hidden email]
https://lists.osgeo.org/mailman/listinfo/geos-devel
Reply | Threaded
Open this post in threaded view
|

Re: Possible speed improvement for overlay operations

Martin Davis-3
I've also had some thoughts about being able to "prepare" geometries for overlay.  At least for intersection and difference - it's less clear that there is much of a use case for union and symdifference.

At the theoretical level certainly the self-noding could be cached, and probably a reduction to a set of edges (Monotone Chains) as well. 

This will require fairly major surgery to the codebase, however.   Ideally the relate code can be split out (and hopefully reworked entirely), to reduce the complexity.  There's also snap-rounding lurking in the background...  
   

On Sat, Dec 1, 2018 at 8:15 AM Paul van der Linden <[hidden email]> wrote:
Ok, you're right. This was just a random example I used to see if I can spot some improvements and get a bit of a feel about the algorithms used.

Ultimately I'm trying to figure out if the overlay-ops (especially intersection and union) can be reworked to use prepared geometries, but will make sure any tests I do involve real-world polygons
_______________________________________________
geos-devel mailing list
[hidden email]
https://lists.osgeo.org/mailman/listinfo/geos-devel

_______________________________________________
geos-devel mailing list
[hidden email]
https://lists.osgeo.org/mailman/listinfo/geos-devel
Reply | Threaded
Open this post in threaded view
|

Re: Possible speed improvement for overlay operations

Martin Davis-3
In reply to this post by Darafei "Komяpa" Praliaskouski
Interesting idea.   Sounds like should be discussed on a different thread, perhaps?

On Sat, Dec 1, 2018 at 2:34 AM Darafei "Komяpa" Praliaskouski <[hidden email]> wrote:
This is interesting.

In PostGIS 2.5 I changed ST_Subdivide to accommodate similar cases. The splits it's going to produce is a bunch of triangles on the edge and a bunch of box-shaped geometries in the meat of polygon. Putting these in a tree (new year, boxes in the tree) is probably going to help perform several smaller overlays faster.

What are the properties that help in such a split, and what are the ones that don't? For point-in-polygon it's definitely better to have large squares and get the outer void defined by the shape of the tree and not the shape of parts. Will another kind of algorithm for splitting in some predefined dimension help? Like, "if you see shape going up and then down, it's a good split point and it's more preferred than one close to center."

We can probably optimize further if shape is split into all-convex parts.



сб, 1 дек. 2018 г. в 00:25, Martin Davis <[hidden email]>:
Those polygons are a worst-case scenario for MonotoneChains, and they are a sub-optimal case for sweepline as well.  So maybe not surprising you are seeing a n^2 count.  

There doesn't seem much point in working to optimize such an artificial case, especially if it will impact code complexity or performance for the "average" case.   But it's hard to speculate without a working demonstration of a potential improvement.

On Fri, Nov 30, 2018 at 1:13 PM Paul van der Linden <[hidden email]> wrote:
Well, the part I'm referring to is the 1. and 2 (Compute self-intersection nodes for A and B) as far as I understand.

counter's value at the end of the computeintersections was n*n, and breakpoint never hit, so a lot of checks were done without any useful work.

Not really sure how "representative" my polygons are, but they are squares with zig-zag boundaries
/\/\/\/\/\/\/\/\
\              /
/              \
\              /
/              \
\              /
\/\/\/\/\/\/\/


_______________________________________________
geos-devel mailing list
[hidden email]
https://lists.osgeo.org/mailman/listinfo/geos-devel
--
Darafei Praliaskouski
Support me: http://patreon.com/komzpa
_______________________________________________
geos-devel mailing list
[hidden email]
https://lists.osgeo.org/mailman/listinfo/geos-devel

_______________________________________________
geos-devel mailing list
[hidden email]
https://lists.osgeo.org/mailman/listinfo/geos-devel
Reply | Threaded
Open this post in threaded view
|

Re: Possible speed improvement for overlay operations

Paul van der Linden
In reply to this post by Paul van der Linden
Well for the union I meant the unary union. didn't do a deep research on possible improvements in that part.
Currently I'm investigating the intersection of some nasty (over 4000 holes) polygon (forest) with a polygon consisting of the world and a very large (area and nr of points) hole in it (selection).
For reference: the forest is fully inside the selection

The selection is going through the computeIntersections really fast,
for the forest polygon the skipped (due to the edgeset being equal) iterations in the processoverlap is about 10% of the total iterations (so not much to gain there i guess),
but for the computeIntersections of the combination of those polygons the skipped iterations are roughly equal to the total (24200775 out of 24574097), so there's room for improvement.
Don't know yet on how big of a part the computeIntersections contributes to the total intersection-time is going to be, so that improvement could be small after all.


Another thing: as that pointinpolygon is going to be called from https://github.com/libgeos/geos/blob/8bc1ea0f8019bcb83c60126e7883ae428ea7c677/src/geomgraph/EdgeEndStar.cpp#L165 and findEdgeRingContaining quite a number of times, perhaps its possible to prepare the lot and use a IndexedPointInAreaLocator?


_______________________________________________
geos-devel mailing list
[hidden email]
https://lists.osgeo.org/mailman/listinfo/geos-devel
Reply | Threaded
Open this post in threaded view
|

Re: Possible speed improvement for overlay operations

Paul van der Linden
In reply to this post by Paul van der Linden
Ok, first test with my big polygon and the envelope tests in place is giving a 27% speedup!


_______________________________________________
geos-devel mailing list
[hidden email]
https://lists.osgeo.org/mailman/listinfo/geos-devel
Reply | Threaded
Open this post in threaded view
|

Re: Possible speed improvement for overlay operations

Martin Davis-3
In reply to this post by Paul van der Linden


On Sun, Dec 2, 2018 at 8:04 AM Paul van der Linden <[hidden email]> wrote:

Yes, that's a good point.  The original reason for not including an envelope check is following the JTS design principal of not doing checks which may have already been done externally.  But in this case the check is fairly low-cost, and not including it is (obviously) a common source of inefficiency.  So it makes sense to add an envelope check into the default code path.  (If an unchecked path is needed that could be added as a separate method). 

Another thing: as that pointinpolygon is going to be called from https://github.com/libgeos/geos/blob/8bc1ea0f8019bcb83c60126e7883ae428ea7c677/src/geomgraph/EdgeEndStar.cpp#L165 and findEdgeRingContaining quite a number of times, perhaps its possible to prepare the lot and use a IndexedPointInAreaLocator?

You have found an area in the overlay code which isn't optimized as much as it could be.  Although whether it's worth building an index on the holes depends on the number of holes and rings that are being checked.  But for geometries with many holes it might be advantageous.  Are you going to test this? 

_______________________________________________
geos-devel mailing list
[hidden email]
https://lists.osgeo.org/mailman/listinfo/geos-devel
Reply | Threaded
Open this post in threaded view
|

Re: Possible speed improvement for overlay operations

Martin Davis-3
In reply to this post by Paul van der Linden
@Paul vdL - A nice speed bump, for sure.

Do you have data or at least an image to share of your test geometry?

Do you know which operation the performance improvement appears in?  Predicate or overlay or both?



On Mon, Dec 3, 2018 at 1:40 PM Paul van der Linden <[hidden email]> wrote:
Ok, first test with my big polygon and the envelope tests in place is giving a 27% speedup!

_______________________________________________
geos-devel mailing list
[hidden email]
https://lists.osgeo.org/mailman/listinfo/geos-devel

_______________________________________________
geos-devel mailing list
[hidden email]
https://lists.osgeo.org/mailman/listinfo/geos-devel
Reply | Threaded
Open this post in threaded view
|

Re: Possible speed improvement for overlay operations

Paul van der Linden
In reply to this post by Paul van der Linden
Ok, I've created a pull request here: https://github.com/libgeos/geos/pull/138
There's also some remarks for further improvements.
So the discussion can be done in github or continued here (if so, please put me in the cc so that I can properly answer)


_______________________________________________
geos-devel mailing list
[hidden email]
https://lists.osgeo.org/mailman/listinfo/geos-devel
Reply | Threaded
Open this post in threaded view
|

Re: Possible speed improvement for overlay operations

Paul van der Linden
In reply to this post by Paul van der Linden
It's in the overlay ops.
Still impressive that this can be done in just over 10 seconds, and I'm sure prepared geometries will help even more!
Also surprising is that for that st_intersection to run in debug-mode in visual studio it never finished within my patience window, and that if run from postgres it takes about 50-60 seconds...


_______________________________________________
geos-devel mailing list
[hidden email]
https://lists.osgeo.org/mailman/listinfo/geos-devel
12