GeoServer slowing down with bigger CQL filters

Bernhard Kiselka
Hi list!

I did not do as much research as in https://gis.stackexchange.com/a/82305, but as my data grew, the CQL_FILTER listing the IDs to be rendered grew from 600 to 1800 IDs (a 20-kilobyte filter string!) and the GeoServer response time increased from 1 second to 5 seconds.

Does anyone have a guess why GeoServer is slowing down exponentially?

I assume that parsing the filter is the slow part, because the PostgreSQL query is _not_ slowing down that much (the query itself takes less than 100 ms).

Thanks for any thoughts!

Best regards,
Bernhard

------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
_______________________________________________
Geoserver-users mailing list

Please make sure you read the following two resources before posting to this list:
- Earning your support instead of buying it, by Ian Turton: http://www.ianturton.com/talks/foss4g.html#/
- The GeoServer user list posting guidelines: http://geoserver.org/comm/userlist-guidelines.html

If you want to request a feature or an improvement, also see this: https://github.com/geoserver/geoserver/wiki/Successfully-requesting-and-integrating-new-features-and-improvements-in-GeoServer


[hidden email]
https://lists.sourceforge.net/lists/listinfo/geoserver-users

Re: GeoServer slowing down with bigger CQL filters

Ian Turton
Parsing a CQL filter of that size is bound to take a while, and then it has to be translated into an SQL statement, which will also take a while.

I suspect there must be an easier way of doing whatever you are doing without the need for a 20 KB filter.

Ian

On 1 March 2018 at 13:14, Bernhard Kiselka <[hidden email]> wrote:
> Does anyone have a guess why GeoServer is slowing down exponentially?


Re: GeoServer slowing down with bigger CQL filters

Bernhard Kiselka

Hi Ian!

Thanks for your answer!

I understand that parsing 20 KB takes some time. But why does it slow down exponentially (see the diagram in https://gis.stackexchange.com/a/82305)?

I might be able to change the filter, but the ID filter is in fact a "simplification": this way the GeoServer layer only needs the ID, and not all the attributes the filter originally relies on.

I also had a look at the source code of GeoTools (I guess https://github.com/geotools/geotools/blob/master/modules/library/main/src/main/java/org/geotools/filter/visitor/SimplifyingFilterVisitor.java is the right place), but I don't understand it completely. Most of all, I don't understand why GeoTools needs to parse an IN statement (and translate it into multiple ORs). The only reason I can think of is to prevent SQL injection. But again, that should not slow down the request time exponentially.

Interestingly, if I use &FEATUREID=fid1,fid2,… instead of CQL_FILTER=id IN (1,2,…), GeoServer scales much better!

Conclusion: I might find an easier way to express my filter. But an exponential slowdown seems like a bug to me. As a workaround I will try to use FEATUREID instead of CQL_FILTER!
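A minimal sketch of that workaround, assuming the public demo endpoint and the `buildings.<key>` feature-ID pattern used in the example in this message. The helper names are made up for illustration, and the demo server may no longer be reachable. It only builds the two request URLs so they can be timed side by side:

```python
from urllib.parse import urlencode

# Endpoint and layer taken from the example in this thread; adjust for your own server.
BASE_URL = "https://demo.boundlessgeo.com/geoserver/medford/ows"
COMMON = {
    "service": "WFS",
    "version": "1.0.0",
    "request": "GetFeature",
    "typeName": "medford:buildings",
}


def cql_filter_url(ids, attribute="key"):
    """Build a GetFeature URL that selects features with CQL_FILTER=<attribute> IN (...)."""
    cql = f"{attribute} IN ({','.join(str(i) for i in ids)})"
    return BASE_URL + "?" + urlencode({**COMMON, "CQL_FILTER": cql})


def featureid_url(ids, prefix="buildings"):
    """Build the equivalent URL using the FEATUREID parameter.

    Assumes feature IDs have the form <prefix>.<id>, as in the thread's example.
    """
    fids = ",".join(f"{prefix}.{i}" for i in ids)
    return BASE_URL + "?" + urlencode({**COMMON, "FEATUREID": fids})


if __name__ == "__main__":
    ids = range(1, 21)
    print(cql_filter_url(ids))
    print(featureid_url(ids))
```

Both URLs ask for the same 20 buildings; only the way the selection is expressed differs.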

 

 

Here is a publicly available example (I must admit I don't know whether the Medford data is stored in a database):

2 buildings:

https://demo.boundlessgeo.com/geoserver/medford/ows?service=WFS&version=1.0.0&request=GetFeature&typeName=medford:buildings&CQL_FILTER=key%20IN%20(18086%2C%2018087)
takes 360 ms

https://demo.boundlessgeo.com/geoserver/medford/ows?service=WFS&version=1.0.0&request=GetFeature&typeName=medford:buildings&FEATUREID=buildings.18086,buildings.18087
takes 460 ms

20 buildings:

https://demo.boundlessgeo.com/geoserver/medford/ows?service=WFS&version=1.0.0&request=GetFeature&typeName=medford:buildings&CQL_FILTER=key%20IN%20(1%2C2%2C3%2C4%2C5%2C6%2C7%2C8%2C9%2C10%2C11%2C12%2C13%2C14%2C15%2C16%2C17%2C18%2C19%2C20)
takes approx. 800 ms with CQL_FILTER
and 480 ms with FEATUREID (from the second request on; the first one takes ~700 ms)

40 buildings:

takes 1270 ms with CQL_FILTER
and 480 ms with FEATUREID

230 buildings:

takes approx. 8000 ms with CQL_FILTER
(of course you need to use POST for more IDs…)

whereas

https://demo.boundlessgeo.com/geoserver/medford/ows?service=WFS&version=1.0.0&request=GetFeature&typeName=medford:buildings&maxFeatures=230
takes only 250 ms
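As a rough check on the word "exponential", the four CQL_FILTER timings above can be put through some simple arithmetic. This is only a sketch: it assumes the single-shot numbers are representative, which is doubtful for a shared demo server. It computes the extra time per additional ID between consecutive measurements, and what the 230-ID request would cost if the 20-to-40 trend continued exponentially:

```python
import math

# (number of IDs in the CQL_FILTER, response time in ms), as reported in this message
samples = [(2, 360), (20, 800), (40, 1270), (230, 8000)]


def marginal_cost(points):
    """Extra milliseconds per additional ID between consecutive measurements."""
    return [
        (t2 - t1) / (n2 - n1)
        for (n1, t1), (n2, t2) in zip(points, points[1:])
    ]


def exponential_extrapolation(p1, p2, n):
    """Predicted time at n IDs if the growth seen between p1 and p2 were exponential."""
    (n1, t1), (n2, t2) = p1, p2
    rate = math.log(t2 / t1) / (n2 - n1)  # growth rate per additional ID
    return t2 * math.exp(rate * (n - n2))


costs = marginal_cost(samples)
predicted_230 = exponential_extrapolation(samples[1], samples[2], 230)

if __name__ == "__main__":
    print("ms per extra ID:", [round(c, 1) for c in costs])
    print("exponential prediction for 230 IDs (ms):", round(predicted_230))
```

On these numbers the cost per additional ID stays between roughly 24 and 35 ms, while an exponential continuation would predict on the order of 100 seconds at 230 IDs instead of the measured 8. That looks more like somewhat worse than linear growth than strictly exponential growth, although four samples are far too few to fit a curve.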

 

Best regards,

Bernhard

 

From: Ian Turton [[hidden email]]
Sent: Thursday, 1 March 2018 16:14
To: Bernhard Kiselka
Cc: [hidden email]
Subject: Re: [Geoserver-users] GeoServer slowing down with bigger CQL filters

> I suspect there must be an easier way of doing whatever you are doing without the need for a 20 KB filter.

 


Re: GeoServer slowing down with bigger CQL filters

geowolf
On Mon, Mar 5, 2018 at 9:52 AM, Bernhard Kiselka <[hidden email]> wrote:

> Hi Ian!
>
> Thanks for your answer!
>
> I understand that parsing 20 KB takes some time.
> But why does it slow down exponentially (see the diagram in https://gis.stackexchange.com/a/82305)?

It could be a result of the algorithm doing the parsing, of building the in-memory representation, or of the filter simplification in between; it is hard to say. For sure, none of the filter management in GeoTools was built with such a large filter in mind. However, it's open source, so you can propose an improvement, or use commercial support to have someone else do it for you.

> I might be able to change the filter, but in fact this is a "simplification" for needing only the ID in the GeoServer layer (and not all attributes the filter relies on initially).
>
> I also had a look at the source code of GeoTools (guess https://github.com/geotools/geotools/blob/master/modules/library/main/src/main/java/org/geotools/filter/visitor/SimplifyingFilterVisitor.java is right), but I don't understand it completely.

That class simplifies filters after parsing; it is one of the things that might be scaling badly.

> Most of all I don't understand why GeoTools needs to parse an IN statement (and translates it to multiple ORs).

Because the OGC standard filter model has a fixed list of filters, and "IN" is not one of them, pretty simple.
In the most recent versions of GeoServer there is also an "in" function that can be encoded in SQL (older versions have it too, but there it is not recognized by the SQL encoder, so it is executed in memory instead of being turned into SQL).
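To illustrate the point about the fixed list of filters: with only the standard OGC operators, `key IN (1, 2, 3)` has to be written as an `Or` of `PropertyIsEqualTo` comparisons. The sketch below builds such a filter as OGC Filter XML with Python's standard library. It is not GeoTools code and says nothing about how GeoTools represents or simplifies the filter internally; the function name is made up for illustration.

```python
import xml.etree.ElementTree as ET

OGC_NS = "http://www.opengis.net/ogc"
ET.register_namespace("ogc", OGC_NS)


def in_as_or_filter(property_name, values):
    """Express `property_name IN (values...)` using standard OGC filter operators only."""
    values = list(values)
    if not values:
        raise ValueError("an IN list needs at least one value")
    flt = ET.Element(f"{{{OGC_NS}}}Filter")
    # A single value needs no Or wrapper; several values become children of one Or.
    parent = flt if len(values) == 1 else ET.SubElement(flt, f"{{{OGC_NS}}}Or")
    for value in values:
        eq = ET.SubElement(parent, f"{{{OGC_NS}}}PropertyIsEqualTo")
        ET.SubElement(eq, f"{{{OGC_NS}}}PropertyName").text = property_name
        ET.SubElement(eq, f"{{{OGC_NS}}}Literal").text = str(value)
    return flt


if __name__ == "__main__":
    print(ET.tostring(in_as_or_filter("key", [1, 2, 3]), encoding="unicode"))
```

For 1800 IDs this yields 1800 comparison elements, which gives a feeling for the size of the filter object that has to be parsed, simplified and encoded.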
 
Regards,

Andrea Aime


==
GeoServer Professional Services from the experts! Visit http://goo.gl/it488V for more information.
==

Ing. Andrea Aime
@geowolf
Technical Lead

GeoSolutions S.A.S.
Via di Montramito 3/A
55054  Massarosa (LU)
phone: +39 0584 962313
fax: +39 0584 1660272
mob: +39  339 8844549

http://www.geo-solutions.it
http://twitter.com/geosolutions_it





Re: GeoServer slowing down with bigger CQL filters

Bernhard Kiselka

Hello Andrea!

Thanks for your answer!
We will also try a newer GeoServer in the future.

Just some measurements with our PostgreSQL database, which has around 100,000 entries in the table, filtering 1800 of them by ID with GeoServer 2.7.1.1:

Request without filter: ~150 ms
Request with a CQL filter: ~1500 ms
Same request with FeatureID: ~210 ms
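With 1800 IDs the request has to go out as a POST, as noted earlier in the thread. A minimal sketch of a WFS 1.0.0 GetFeature body that selects features by ID with `ogc:FeatureId` elements could look like this. The type name `myws:mylayer`, the `mylayer.<n>` ID pattern and the function name are hypothetical, and depending on the server the workspace prefix in `typeName` may also need a matching `xmlns` declaration.

```python
import xml.etree.ElementTree as ET

WFS_NS = "http://www.opengis.net/wfs"
OGC_NS = "http://www.opengis.net/ogc"
ET.register_namespace("wfs", WFS_NS)
ET.register_namespace("ogc", OGC_NS)


def build_getfeature_body(type_name, feature_ids):
    """Build a WFS 1.0.0 GetFeature XML body that selects features by feature ID."""
    root = ET.Element(
        f"{{{WFS_NS}}}GetFeature",
        {"service": "WFS", "version": "1.0.0"},
    )
    query = ET.SubElement(root, f"{{{WFS_NS}}}Query", {"typeName": type_name})
    flt = ET.SubElement(query, f"{{{OGC_NS}}}Filter")
    for fid in feature_ids:
        # One FeatureId element per requested feature.
        ET.SubElement(flt, f"{{{OGC_NS}}}FeatureId", {"fid": fid})
    return ET.tostring(root, encoding="unicode")


# Hypothetical layer and ID pattern; replace with your own.
body = build_getfeature_body(
    "myws:mylayer",
    [f"mylayer.{i}" for i in range(1, 1801)],
)

if __name__ == "__main__":
    print(len(body), "characters")
```

The resulting string would be sent to the WFS endpoint as the POST payload with an XML content type, for example with `urllib.request` or any other HTTP client.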

 

Best Regards,

Bernhard

 

From: [hidden email] [mailto:[hidden email]] On behalf of Andrea Aime
Sent: Monday, 5 March 2018 10:11
To: Bernhard Kiselka
Cc: Ian Turton; [hidden email]
Subject: Re: [Geoserver-users] GeoServer slowing down with bigger CQL filters

> In the most recent versions of GeoServer there is also an "in" function that can be encoded in SQL (older versions have it too, but there it is not recognized by the SQL encoder, so it is executed in memory instead of being turned into SQL).

 
