Incorrect behavior of WFS GetFeature


Tamas Szekeres
Hi Devs,

I did some investigation into the current GetFeature implementation (in master), and I doubt that it works properly as it stands. I ran a query (with pagination) against a PostGIS database in the debugger, and it doesn't seem to work.

Currently the following approach is used in mapserver:

1. An initial query is issued by msWFSRetrieveFeatures (in mapwfs.c), where the requested subset of features (starting at startindex and limited to maxfeatures) is added to the result cache.
2. The total feature count is computed by msWFSComputeMatchingFeatures (in mapwfs.c): the result set is saved, the original layer is closed and reopened, and a count(..) aggregate query is then issued (in msPostGISLayerGetShapeCount).
3. The result set is then restored and GetShape is called for each item using the same layer.

The main problem with this approach is that in step 2 the original layer is closed and reopened, so the result set stored by the driver is invalidated. The driver-specific GetShapeCount then creates a new result set containing either a single element (the count) or, if it falls back to the stock LayerDefaultGetShapeCount, the entire set of shapes. This result set is then used in step 3, where the resultindex values refer to the wrong features.
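
The three steps and the failure described above can be sketched with a toy model. This is not MapServer code: ToyLayer and its methods are hypothetical names, and the model only assumes what is stated above, namely that the driver keeps a single active result set, that result indexes start from 0, and that the count query replaces that result set.

```python
class ToyLayer:
    """Hypothetical stand-in for a layer whose driver holds one active result set."""

    def __init__(self, table):
        self.table = table      # every row in the "database"
        self.resultset = []     # rows of the driver's current query

    def query(self, startindex, maxfeatures):
        # Step 1: run the paged query. The cached result indexes are row
        # positions inside the driver's result set and start from 0.
        self.resultset = self.table[startindex:startindex + maxfeatures]
        return list(range(len(self.resultset)))

    def count_with_reopen(self):
        # Step 2: closing and reopening discards the paged result set; the
        # count query then leaves a one-row result set (the count) behind.
        self.resultset = [len(self.table)]
        return self.resultset[0]

    def get_shape(self, resultindex):
        # Step 3: resolve a cached index against whatever result set is current.
        if resultindex < len(self.resultset):
            return self.resultset[resultindex]
        return None


layer = ToyLayer(["f0", "f1", "f2", "f3", "f4", "f5"])
cached = layer.query(startindex=2, maxfeatures=3)
before = [layer.get_shape(i) for i in cached]   # looked up before the count
total = layer.count_with_reopen()
after = [layer.get_shape(i) for i in cached]    # looked up after the count
print(before, total, after)
```

In this model the cached indexes resolve to the requested page before the count and to unrelated rows (or nothing) afterwards, which is the symptom seen in the debugger.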

Would it be an option to implement msWFSComputeMatchingFeatures in such a way that the original layer is not closed and reopened?

Best regards,

Tamas



 






Seth G <[hidden email]> wrote (on Fri, Aug 23, 2019, 20:07):
Hi Tamas,

The MapServer FOSS4G presentation is 90% done - I have a few slides to finish off, and then run through it to check the timings. The latest version is at: https://geographika.s3.eu-west-2.amazonaws.com/mapserver-foss4g/index.html
I'll send it round to the dev list this weekend; let me know if you want any changes to the GISInternals slide (or if you see any other errors/omissions in the presentation).

With regard to the MSSQL driver - do you have an idea when you may get to look at this?
I've noticed that any GetFeature request runs the SQL query for the layer twice, even without paging, which causes several performance issues.

Regards,

Seth



On Sat, Aug 3, 2019, at 10:44 PM, Tamas Szekeres wrote:
Hi Seth,

It is hard to provide an accurate estimate, because I don't know which solution will provide better performance.
As a first attempt, I'd probably try the SQLFetchScroll API for both the pagination and the two-phase query access. As you can see, the two-phase query executes the query twice, which would not be required if we stayed within the same result set. I think this implementation would require only 2-3 days, and then we'd need to do some evaluation testing. This implementation would be somewhat ODBC 3.0 specific, but I think we should encourage using the Native Client driver, the recent MSSQL ODBC 17 driver, or any other 3.0-compatible driver at some point.

Regarding FOSS4G, you can certainly mention my name. It is great that you could take on the responsibility of providing the status report. I've already met most of the MapServer devs / PSC members in person at earlier conferences and code sprints (starting from 2006 in Lausanne), but unfortunately I couldn't attend the latest ones for family reasons. Please convey my greetings to the MapServer community anyway.

I'll also take a look at the mapcache addition shortly.

Best regards,

Tamas



Seth G <[hidden email]> wrote (on Sat, Aug 3, 2019, 10:12):

Excellent, thanks Tamas. If you have a rough estimate of effort it would be useful. 

A couple of other things - I'm presenting the "State of MapServer" at FOSS4G in August. Is it ok to include a slide mentioning https://www.gisinternals.com/ and include your name?

Also, the offer of funding to add the MapCache CGI exe to the GISInternals builds is still there. The various dependencies for Windows can be seen in https://github.com/mapserver/mapcache/blob/master/appveyor.yml

Regards,

Seth



On Fri, Aug 2, 2019, at 11:32 AM, Tamas Szekeres wrote:
Hi Seth,

Yes, I'm interested.
I'll check how much work it involves.

Best regards,

Tamas


Seth G <[hidden email]> wrote (on Fri, Aug 2, 2019, 10:44):
Hi Tamas,

I've run into some performance problems with the MSSQL driver as outlined at https://github.com/mapserver/mapserver/issues/5842

For a WFS GetFeature request the full SQL query for a layer is run 3 times to return a result. For layers with many fields/complex views this makes paging through records very slow. I think implementing paging would at least speed up 2 of the queries, and changing the 3rd to get a count rather than run the full query would make performance acceptable.
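
The split suggested here (one paged query for the features, one count query for the total) can be sketched as follows. This is only an illustration, with Python's built-in sqlite3 standing in for SQL Server: the features table and the helper names fetch_page and count_matching are made up, and SQL Server would express the page with ORDER BY ... OFFSET ... ROWS FETCH NEXT ... ROWS ONLY rather than LIMIT/OFFSET.

```python
import sqlite3

# In-memory table with 25 rows standing in for a layer's data source.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE features (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO features (name) VALUES (?)",
    [(f"feature {i}",) for i in range(1, 26)],
)


def fetch_page(conn, startindex, maxfeatures):
    # Only the requested page is read; the ORDER BY keeps pages stable
    # from one request to the next.
    return conn.execute(
        "SELECT id, name FROM features ORDER BY id LIMIT ? OFFSET ?",
        (maxfeatures, startindex),
    ).fetchall()


def count_matching(conn):
    # The total comes from an aggregate instead of re-running the full query.
    return conn.execute("SELECT COUNT(*) FROM features").fetchone()[0]


page = fetch_page(conn, startindex=10, maxfeatures=5)
total = count_matching(conn)
print([row[0] for row in page], total)
```

With a layer built on a complex view, the count query still has to evaluate the view, so the gain depends on how well the database can optimize the aggregate.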

As you've been maintaining the MSSQL driver would you be interested in implementing paging/investigating performance? I have a project that would benefit greatly from this, so would be able to pay for any time spent.

Regards,

Seth

--



_______________________________________________
mapserver-dev mailing list
[hidden email]
https://lists.osgeo.org/mailman/listinfo/mapserver-dev
Re: Incorrect behavior of WFS GetFeature

Even Rouault-2
On Sunday, 25 August 2019 20:09:32 CEST, Tamas Szekeres wrote:

> Hi Devs,
>
> I did some investigation regarding to the current GetFeature implementation
> (in master), but I doubt if that is working properly as it stands now. I've
> executed the query (with pagination) against a postgis database in the
> debugger and it doesn't seem to be working.
>
> Currently the following approach is used in mapserver:
>
> 1. An initial query is issued by msWFSRetrieveFeatures (in mapwfs.c) where
> the subset of the features are added to the result cache (from startindex
> limiting to maxfeatures)
> 2. The total count of the features are computed by
> msWFSComputeMatchingFeatures (in mapwfs.c) in which the result set is saved
> and the original layer is closed and reopened and a count(..) aggregate
> query is initiated afterwards (in msPostGISLayerGetShapeCount).
> 3. The result set is then restored and GetShape is called for each item
> using the same layer.
>
> The main problem with this approach that in #2 the original layer is closed
> and reopened therefore the resultset stored by the driver is invalidated.
> The driver specific GetShapeCount will create a new resultset either with 1
> element (the count), or the entire set of shapes if it falls back to the
> stock LayerDefaultGetShapeCount. This resultset will then be used in #3
> where the resultindex references refer to incorrect features.

Hmm, I see...

Several workarounds come to mind:
- make sure that your features are always retrieved in the same order by
adding an ORDER BY clause to your DATA clause
- set the wfs_features_cache_size metadata item
(see https://mapserver.org/ogc/wfs_server.html?highlight=features_cache_count)
so that the result of the first query is entirely cached in RAM

>
> Would that be an option to implement msWFSComputeMatchingFeatures in that
> way so that the original layer is not getting closed and reopened?

This is likely to be involved as I guess those open/closings are the ones done
in mapquery.c

Even

--
Spatialys - Geospatial professional services
http://www.spatialys.com
Re: Incorrect behavior of WFS GetFeature

Tamas Szekeres




> Several workarounds come to mind:
> - make sure that your features are always retrieved in the same order by
> adding an ORDER BY clause to your DATA clause

Hi Even,

I've already used ORDER BY in the WFS query; however, I'm not sure it helps in this case, since the rowindex is assigned to resultindex, which starts from 0 as far as I can see.
 
> - set the wfs_features_cache_size metadata item
> (see https://mapserver.org/ogc/wfs_server.html?highlight=features_cache_count)
> so that the result of the first query is entirely cached in RAM


For now I just wanted to understand the intended behavior, to get an idea of which optimizations could be implemented in the MSSQL driver for the query and for pagination support.
This workaround could probably help users in the meantime, however.

 
> >
> > Would that be an option to implement msWFSComputeMatchingFeatures in that
> > way so that the original layer is not getting closed and reopened?
>
> This is likely to be involved as I guess those open/closings are the ones done
> in mapquery.c


We could probably call msLayerGetShapeCount directly, and the drivers could make sure not to change the result set within their implementation.
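
The idea of counting without touching the active result set can be illustrated outside MapServer. In this minimal sketch, Python's built-in sqlite3 stands in for a database driver: the count runs on a separate cursor while the paged cursor is still partly unread. The table and variable names are made up, and whether a given MapServer driver can keep two statements open on one connection is an assumption that would need checking per driver.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE features (id INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO features (id) VALUES (?)",
                 [(i,) for i in range(1, 11)])

# The paged query keeps its own cursor (its own result set).
paged = conn.cursor()
paged.execute("SELECT id FROM features ORDER BY id LIMIT 4 OFFSET 3")
first = paged.fetchone()[0]          # start reading the page

# The count runs on a separate cursor, so the paged cursor is left alone.
counter = conn.cursor()
counter.execute("SELECT COUNT(*) FROM features")
total = counter.fetchone()[0]
counter.close()

# The rest of the page is still available after the count.
rest = [row[0] for row in paged.fetchall()]
print(first, total, rest)
```

Here the page continues where it left off after the count, so indexes into the paged result set stay valid.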

Best regards,

Tamas
