SQL View Based Layer Performance

Danny Cheng-2

Hello,

 

We have a few layers where the model requires joining a few database tables together. For some we define the join in GeoServer using a SQL view, but for others we pre-join the tables as part of ETL to produce the final desired table.

 

1.      My assumption is that pre-joining the tables and creating a layer on top of the output table will yield better performance than defining the join in a GeoServer SQL view. Is this correct?

2.      If so, how much of a performance gain are we talking about?

 

I am trying to decide whether it is worth simplifying the preprocessing tasks at the cost of some performance. If the data model changes, I have to rerun the entire preprocessing job, which can be time consuming, whereas if I define the SQL view in GeoServer I can easily change it on the fly.

 

Thanks,
Danny



_______________________________________________
Geoserver-users mailing list

Please make sure you read the following two resources before posting to this list:
- Earning your support instead of buying it, by Ian Turton: http://www.ianturton.com/talks/foss4g.html#/
- The GeoServer user list posting guidelines: http://geoserver.org/comm/userlist-guidelines.html

If you want to request a feature or an improvement, also see this: https://github.com/geoserver/geoserver/wiki/Successfully-requesting-and-integrating-new-features-and-improvements-in-GeoServer


[hidden email]
https://lists.sourceforge.net/lists/listinfo/geoserver-users

Re: SQL View Based Layer Performance

geowolf
On Fri, May 10, 2019 at 10:05 PM Danny Cheng <[hidden email]> wrote:

Hello,

 

We have a few layers where the model requires joining a few database tables together. For some we define the join in GeoServer using a SQL view, but for others we pre-join the tables as part of ETL to produce the final desired table.

 

1.      My assumption is that pre-joining the tables and creating a layer on top of the output table will yield better performance than defining the join in a GeoServer SQL view. Is this correct?


Is running a select against the materialized view faster than running the join on the fly? It normally is.
 

2.      If so, how much of a performance gain are we talking about?

Impossible to say, it depends on too many factors. In particular, was the original query bottlenecked by data fetching or by output production (rendering, WFS GML encoding, and so on)?
If it was bottlenecked by the database, you should get a speedup proportional to how much faster loading from the pre-joined table is. If it was bottlenecked by output production instead, the difference might be barely noticeable.
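As a rough illustration of that reasoning, here is a small Python sketch that estimates the end-to-end speedup when only the database part of a request gets faster. The function name and the numbers (90% vs. 10% of request time spent in the database, pre-joined table loading 5 times faster) are made up for the example; real values have to come from measuring your own layers.

```python
def overall_speedup(db_fraction: float, db_speedup: float) -> float:
    """Estimate end-to-end speedup when only the data-fetching part gets faster.

    db_fraction: share of total request time spent fetching data (0..1).
    db_speedup:  how many times faster loading from the pre-joined table is.
    """
    if not 0.0 <= db_fraction <= 1.0:
        raise ValueError("db_fraction must be between 0 and 1")
    if db_speedup <= 0:
        raise ValueError("db_speedup must be positive")
    # The output-production share is unchanged; only the database share shrinks.
    return 1.0 / ((1.0 - db_fraction) + db_fraction / db_speedup)


# Hypothetical numbers, for illustration only.
db_bound = overall_speedup(db_fraction=0.9, db_speedup=5.0)
output_bound = overall_speedup(db_fraction=0.1, db_speedup=5.0)

print(f"database-bound request: {db_bound:.2f}x")      # 3.57x
print(f"output-bound request:   {output_bound:.2f}x")  # 1.09x
```

With these assumed inputs, the same 5x faster table gives a large gain for the database-bound request and almost none for the one dominated by rendering or encoding.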
 

 I am trying to decide whether it is worth simplifying the preprocessing tasks at the cost of some performance. If the data model changes, I have to rerun the entire preprocessing job, which can be time consuming, whereas if I define the SQL view in GeoServer I can easily change it on the fly.


Creating a materialized view is often as easy as "create table .... as (select ... from t1 join t2 on ... where ....)", then adding indexes as needed.
Then you can compare the two by running a benchmark and decide for yourself if it's worth it or not.
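A minimal sketch of that comparison, using Python's built-in sqlite3 module as a stand-in for the real database (an assumption; the production setup is presumably something like PostGIS, where timings will differ). The tables t1 and t2, their columns, and the prejoined table name are hypothetical. It only times the database side, not rendering or encoding in GeoServer.

```python
import sqlite3
import time

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Hypothetical source tables: t1 holds features, t2 holds attributes keyed by t1.id.
cur.execute("CREATE TABLE t1 (id INTEGER PRIMARY KEY, name TEXT)")
cur.execute("CREATE TABLE t2 (t1_id INTEGER, category TEXT, value REAL)")
cur.executemany("INSERT INTO t1 VALUES (?, ?)",
                [(i, f"feature-{i}") for i in range(20000)])
cur.executemany("INSERT INTO t2 VALUES (?, ?, ?)",
                [(i, "even" if i % 2 == 0 else "odd", i * 0.5)
                 for i in range(20000)])

JOIN_SQL = """
    SELECT t1.id, t1.name, t2.category, t2.value
    FROM t1 JOIN t2 ON t2.t1_id = t1.id
    WHERE t2.category = 'even'
"""
TABLE_SQL = "SELECT id, name, category, value FROM prejoined"

# Pre-join once into a plain table, then add an index as needed.
cur.execute(f"CREATE TABLE prejoined AS {JOIN_SQL}")
cur.execute("CREATE INDEX prejoined_id_idx ON prejoined (id)")
conn.commit()


def time_query(sql: str, repeats: int = 5) -> float:
    """Return the best wall-clock time in seconds over several runs."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        cur.execute(sql).fetchall()
        best = min(best, time.perf_counter() - start)
    return best


# Sanity check first: both approaches must return the same rows.
join_rows = cur.execute(JOIN_SQL).fetchall()
table_rows = cur.execute(TABLE_SQL).fetchall()

join_time = time_query(JOIN_SQL)
table_time = time_query(TABLE_SQL)
print(f"join on the fly:  {join_time * 1000:.2f} ms")
print(f"pre-joined table: {table_time * 1000:.2f} ms")
```

For a decision about GeoServer layers, the same idea should be repeated with actual WMS/WFS requests against both layer definitions, since output production may dominate the total time.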
This documentation might be useful in terms of setting up a benchmark:

Cheers
Andrea

==
Ing. Andrea Aime
@geowolf
Technical Lead
GeoSolutions S.A.S.
http://www.geo-solutions.it