have aggregate/window expressions ever been discussed?

classic Classic list List threaded Threaded
33 messages Options
12
Reply | Threaded
Open this post in threaded view
|

have aggregate/window expressions ever been discussed?

giohappy
Recently I had to calculate the relative dimension of a polygon relative to the others of the same class. I know there are a lot of way to do it, inside and outside QGIS, but I wondered if field calculator (and expressions in general) couls be extended to accompish this kind of tasks.

The approach would be something similar WINDOW functions in Postgresql [1], where for each record a new value will be calculated on the basis of other records (filtered or not).

Has this ever been discussed? Is it something that could fit QGIS expressions?

giovanni


_______________________________________________
Qgis-developer mailing list
[hidden email]
http://lists.osgeo.org/mailman/listinfo/qgis-developer
Reply | Threaded
Open this post in threaded view
|

Re: have aggregate/window expressions ever been discussed?

Régis Haubourg
Hi,
well, this is something I miss, coming from Mapinfo world.

Some tests are being made currently on French ministry of ecology, and by Nathan also, to head to some kind of a layer abstraction, based on ogr virtual layer and ogr sqlite engine.

Once done, that would open a large field of new features and dedicated gui's:
 - SQL assistant (mapinfo like / qspatialite like) on any loaded data source and without any explicit import step. This includes aggregate functions
 - virtual attribute or spatial columns based on functions
 - ETL feature for changing data types and structure
 
FME is providing such kind of SQLexecutor, with some limits currently, and that is extremly powerfull in processing tools, allowing to take full advantage of existing algorithm for geospatial computing, and SQL-spatial SQL for attribute queries AND processing.

If we find a way to be sure that any datasource can be correctly used without caveats hard to understand for the end user, I think we will have the best GIS tool ever done for non 3D uses. Still I see some limits with SQLITE. For instance, many SQL clauses are partially supported. Or calculated fields in view can't be explicitly cast, so QGIS should have to guess data type based on a data scan (a major unadressed issue of sqlite)

I wish I had more time now to investigate by myself.

Nathan , Hugo , could you tell us more on that topic? is ogr sqlite engine the right target or does is have to much limitations?

Cheers
Régis
Reply | Threaded
Open this post in threaded view
|

Re: have aggregate/window expressions ever been discussed?

giohappy

Hi Regis,
it's a long time I think to what you're descriibing, and that would definitely be on a higher level then aggregate expressions alone. This level of abstraction would also be basis for a map/reduce pattern analysis.
It would be great to know some more about what has been discussed so far.

giovanni

Il 23/mag/2014 22:44 "Régis Haubourg" <[hidden email]> ha scritto:
Hi,
well, this is something I miss, coming from Mapinfo world.

Some tests are being made currently on French ministry of ecology, and by
Nathan also, to head to some kind of a layer abstraction, based on ogr
virtual layer and ogr sqlite engine.

Once done, that would open a large field of new features and dedicated
gui's:
 - SQL assistant (mapinfo like / qspatialite like) on any loaded data source
and without any explicit import step. This includes aggregate functions
 - virtual attribute or spatial columns based on functions
 - ETL feature for changing data types and structure

FME is providing such kind of SQLexecutor, with some limits currently, and
that is extremly powerfull in processing tools, allowing to take full
advantage of existing algorithm for geospatial computing, and SQL-spatial
SQL for attribute queries AND processing.

If we find a way to be sure that any datasource can be correctly used
without caveats hard to understand for the end user, I think we will have
the best GIS tool ever done for non 3D uses. Still I see some limits with
SQLITE. For instance, many SQL clauses are partially supported. Or
calculated fields in view can't be explicitly cast, so QGIS should have to
guess data type based on a data scan (a major unadressed issue of sqlite)

I wish I had more time now to investigate by myself.

Nathan , Hugo , could you tell us more on that topic? is ogr sqlite engine
the right target or does is have to much limitations?

Cheers
Régis




--
View this message in context: http://osgeo-org.1560.x6.nabble.com/have-aggregate-window-expressions-ever-been-discussed-tp5142215p5142245.html
Sent from the Quantum GIS - Developer mailing list archive at Nabble.com.
_______________________________________________
Qgis-developer mailing list
[hidden email]
http://lists.osgeo.org/mailman/listinfo/qgis-developer

_______________________________________________
Qgis-developer mailing list
[hidden email]
http://lists.osgeo.org/mailman/listinfo/qgis-developer
Reply | Threaded
Open this post in threaded view
|

Re: have aggregate/window expressions ever been discussed?

Andreas Neumann-4
In reply to this post by giohappy
Hi,

We would definitely need this functionality for our application modules
- we need functions like "min", "max", "mean", "sum", "average" to work
on 1:n relations.

Now that we have relations in QGIS, I think that aggregate functions are
a logical next step. Something to seriously consider in 2.6.

Andreas

Am 23.05.2014 17:34, schrieb G. Allegri:

> Recently I had to calculate the relative dimension of a polygon relative to
> the others of the same class. I know there are a lot of way to do it,
> inside and outside QGIS, but I wondered if field calculator (and
> expressions in general) couls be extended to accompish this kind of tasks.
>
> The approach would be something similar WINDOW functions in Postgresql [1],
> where for each record a new value will be calculated on the basis of other
> records (filtered or not).
>
> Has this ever been discussed? Is it something that could fit QGIS
> expressions?
>
> giovanni
>
> [1] http://www.postgresql.org/docs/9.1/static/tutorial-window.html
>
>
>
> _______________________________________________
> Qgis-developer mailing list
> [hidden email]
> http://lists.osgeo.org/mailman/listinfo/qgis-developer
>

_______________________________________________
Qgis-developer mailing list
[hidden email]
http://lists.osgeo.org/mailman/listinfo/qgis-developer
Reply | Threaded
Open this post in threaded view
|

Re: have aggregate/window expressions ever been discussed?

Vincent Picavet (ml)
Hello Andreas,

Le mardi 27 mai 2014 10:28:13, Andreas Neumann a écrit :
> Hi,
>
> We would definitely need this functionality for our application modules
> - we need functions like "min", "max", "mean", "sum", "average" to work
> on 1:n relations.
>
> Now that we have relations in QGIS, I think that aggregate functions are
> a logical next step. Something to seriously consider in 2.6.

As already stated before, I am worrying about current developments around a
lot of features looking like database features :
* Table joins
* relations
* "SQL-like" processing
* .. including aggregates

Implementing all these features on top of QGIS seems reinventing the wheel,
and database wheels are particularly hard to design and implement well.
I really think we should re-study the global design of such features to have a
clear and clean strategy instead of stacking features on features, lacking
coherency.

As stated by Regis, basing this work on top of SQLite may be the best option,
but more study has to be done and a general agreement is needed to go this
way.

Vincent

>
> Andreas
>
> Am 23.05.2014 17:34, schrieb G. Allegri:
> > Recently I had to calculate the relative dimension of a polygon relative
> > to the others of the same class. I know there are a lot of way to do it,
> > inside and outside QGIS, but I wondered if field calculator (and
> > expressions in general) couls be extended to accompish this kind of
> > tasks.
> >
> > The approach would be something similar WINDOW functions in Postgresql
> > [1], where for each record a new value will be calculated on the basis
> > of other records (filtered or not).
> >
> > Has this ever been discussed? Is it something that could fit QGIS
> > expressions?
> >
> > giovanni
> >
> > [1] http://www.postgresql.org/docs/9.1/static/tutorial-window.html
> >
> >
> >
> > _______________________________________________
> > Qgis-developer mailing list
> > [hidden email]
> > http://lists.osgeo.org/mailman/listinfo/qgis-developer
>
> _______________________________________________
> Qgis-developer mailing list
> [hidden email]
> http://lists.osgeo.org/mailman/listinfo/qgis-developer
_______________________________________________
Qgis-developer mailing list
[hidden email]
http://lists.osgeo.org/mailman/listinfo/qgis-developer
Reply | Threaded
Open this post in threaded view
|

Re: have aggregate/window expressions ever been discussed?

Nathan Woodrow
On Tue, May 27, 2014 at 7:21 PM, Vincent Picavet <[hidden email]> wrote:
As stated by Regis, basing this work on top of SQLite may be the best option,
but more study has to be done and a general agreement is needed to go this
way.

I have plans to implement this. Exposing QGIS layers as SQlite virtual tables. This would give a lot of query bang with low code change   It was done in OGR and works really well, of course it sucks with performance if you do something really complicated but otherwise it would give QGIS a good edge - if you need more power then more to a better database. A basic query-everything type system.  This is something that MapInfo had and is a bit of pain to loose when coming to QGIS. 

I hope to start work after 2.4 is out.

- Nathan

_______________________________________________
Qgis-developer mailing list
[hidden email]
http://lists.osgeo.org/mailman/listinfo/qgis-developer
Reply | Threaded
Open this post in threaded view
|

Re: have aggregate/window expressions ever been discussed?

Hugo Mercier
In reply to this post by Régis Haubourg
Le 23/05/2014 22:44, Régis Haubourg a écrit :

> Hi,
> well, this is something I miss, coming from Mapinfo world.
>
> Some tests are being made currently on French ministry of ecology, and by
> Nathan also, to head to some kind of a layer abstraction, based on ogr
> virtual layer and ogr sqlite engine.
>
> Once done, that would open a large field of new features and dedicated
> gui's:
>  - SQL assistant (mapinfo like / qspatialite like) on any loaded data source
> and without any explicit import step. This includes aggregate functions
>  - virtual attribute or spatial columns based on functions
>  - ETL feature for changing data types and structure
>  
> FME is providing such kind of SQLexecutor, with some limits currently, and
> that is extremly powerfull in processing tools, allowing to take full
> advantage of existing algorithm for geospatial computing, and SQL-spatial
> SQL for attribute queries AND processing.
>
> If we find a way to be sure that any datasource can be correctly used
> without caveats hard to understand for the end user, I think we will have
> the best GIS tool ever done for non 3D uses. Still I see some limits with
> SQLITE. For instance, many SQL clauses are partially supported. Or
> calculated fields in view can't be explicitly cast, so QGIS should have to
> guess data type based on a data scan (a major unadressed issue of sqlite)

Hmmm I wasn't aware of this limitation in SQLITE views :(

>
> I wish I had more time now to investigate by myself.
>
> Nathan , Hugo , could you tell us more on that topic? is ogr sqlite engine
> the right target or does is have to much limitations?

I've added some lines on the wiki about this :
http://hub.qgis.org/wiki/quantum-gis/virtual_layers

The challenge is to integrate it cleaning more or less transparently for
the end-user and without loosing performances.
If we want end-users to only manipulate GUI dialogs and QgsExpressions
at the maximum and let more advanced users tweak the corresponding SQL
clauses, then we probably have to convert from QgsExpressions to SQL ...

Is SQLITE's virtual tables the right target ? Well ... I am not aware of
any other way to embed an SQL parsing/planner/executor. We do not want
to reimplement our own database engine.
So, in my opinion, we would have to deal with SQLite/Spatialite
limitations. If users want more, they have to install PostgreSQL/PostGIS
and convert their data.

And about Giovanni's first concern, SQLITE does not support WINDOW
functions. There might be some ways to mimick by accessing ROWID though

_______________________________________________
Qgis-developer mailing list
[hidden email]
http://lists.osgeo.org/mailman/listinfo/qgis-developer
Reply | Threaded
Open this post in threaded view
|

Re: have aggregate/window expressions ever been discussed?

Martin Dobias
In reply to this post by Vincent Picavet (ml)
On Tue, May 27, 2014 at 4:21 PM, Vincent Picavet
<[hidden email]> wrote:

> Hello Andreas,
>
> Le mardi 27 mai 2014 10:28:13, Andreas Neumann a écrit :
>> Hi,
>>
>> We would definitely need this functionality for our application modules
>> - we need functions like "min", "max", "mean", "sum", "average" to work
>> on 1:n relations.
>>
>> Now that we have relations in QGIS, I think that aggregate functions are
>> a logical next step. Something to seriously consider in 2.6.
>
> As already stated before, I am worrying about current developments around a
> lot of features looking like database features :
> * Table joins
> * relations
> * "SQL-like" processing
> * .. including aggregates
>
> Implementing all these features on top of QGIS seems reinventing the wheel,
> and database wheels are particularly hard to design and implement well.
> I really think we should re-study the global design of such features to have a
> clear and clean strategy instead of stacking features on features, lacking
> coherency.

Agreed.

In addition to the features mentioned above, I believe also indexing
and caching of vector data could be handled by SQLite, so it would be
good to keep that in mind during the design. Finally, the "memory"
provider could be implemented on top of in-memory SQLite table, too.

Regards
Martin
_______________________________________________
Qgis-developer mailing list
[hidden email]
http://lists.osgeo.org/mailman/listinfo/qgis-developer
Reply | Threaded
Open this post in threaded view
|

Re: have aggregate/window expressions ever been discussed?

Régis Haubourg
In reply to this post by Hugo Mercier
Hugo Mercier wrote
Or
> calculated fields in view can't be explicitly cast, so QGIS should have to
> guess data type based on a data scan (a major unadressed issue of sqlite)

Hmmm I wasn't aware of this limitation in SQLITE views :(
Yes, SQLITE does dynamic typing, so user or provider has to scan values to guess the right type.
Here is a sqlite topic on that [0]
And here my initial post in qgis list [1]

[0] http://sqlite.1065341.n5.nabble.com/Computed-columns-in-VIEWs-return-NULL-but-should-be-able-to-be-typed-Any-ideas-td56769.html#a56770

[1] http://osgeo-org.1560.x6.nabble.com/Spatialite-can-t-type-fields-of-a-view-td5058436.html
Reply | Threaded
Open this post in threaded view
|

Re: have aggregate/window expressions ever been discussed?

giohappy
I agree with all of you about not reinventing the wheel and, yes, let's not fragment QGIS' capabilities on data querying/filtering/etc.
My first idea was simply (but I'm not sure how much simple it is) to be able to perform aggregate queries like in WINDOW functions. In my job it's something I find very useful, and I would like not to have to export my layers to a PostGIS DB just to do it.

Some basic examples:

 - get the average area of some polygon features
 - get the relative dimension against some other feature's dimensions
 - etc.

AFAIK these kind of queries cannot be run on Sqlite in one single shot. This is why I wasn't thinking to Sqlite (or SQL anyway) to implement it, but as an extension to expression engine, in a MapReduce pattern perspective...

giovanni



2014-05-27 15:03 GMT+02:00 Régis Haubourg <[hidden email]>:
Hugo Mercier wrote
> Or
>> calculated fields in view can't be explicitly cast, so QGIS should have
>> to
>> guess data type based on a data scan (a major unadressed issue of sqlite)
>
> Hmmm I wasn't aware of this limitation in SQLITE views :(

Yes, SQLITE does dynamic typing, so user or provider has to scan values to
guess the right type.
Here is a sqlite topic on that [0]
And here my initial post in qgis list [1]

[0]
http://sqlite.1065341.n5.nabble.com/Computed-columns-in-VIEWs-return-NULL-but-should-be-able-to-be-typed-Any-ideas-td56769.html#a56770

[1]
http://osgeo-org.1560.x6.nabble.com/Spatialite-can-t-type-fields-of-a-view-td5058436.html




--
View this message in context: http://osgeo-org.1560.x6.nabble.com/have-aggregate-window-expressions-ever-been-discussed-tp5142215p5142714.html
Sent from the Quantum GIS - Developer mailing list archive at Nabble.com.
_______________________________________________
Qgis-developer mailing list
[hidden email]
http://lists.osgeo.org/mailman/listinfo/qgis-developer



--

_______________________________________________
Qgis-developer mailing list
[hidden email]
http://lists.osgeo.org/mailman/listinfo/qgis-developer
Reply | Threaded
Open this post in threaded view
|

Re: have aggregate/window expressions ever been discussed?

Régis Haubourg
giohappy wrote
I agree with all of you about not reinventing the wheel and, yes, let's not
fragment QGIS' capabilities on data querying/filtering/etc.
I agree totally.SQLITE could be a very powerfull basement to build user oriented functions. But we need to be sure that it won't add *under the hood* tricks.

My first need here would be to be able to intersect two layers (Polygon X Polygon ) and compute overlap ratios and unique relations of objects.
I made a PGSQL procedure for that, and wish I could offer the same in QGIS client, whatever datasource the user inputs.
My latest tests converged to using processing python script, reading layers in OGR and using Spatial SQL capacities for intersect and aggregate steps.
The hard part is how to get datasource definition from QGIS and make OGR read it. The aggregate capacities of SQLite are easy to handle.  If we could directly use qgis layers as OGR virtual layers, that would be sooo simple.

Régis
Reply | Threaded
Open this post in threaded view
|

Re: have aggregate/window expressions ever been discussed?

Andreas Neumann-4
In reply to this post by Vincent Picavet (ml)
Hi all,

I understand your concerns, but on the other hand - how would you treat
all data sources equally well, if QGIS does not implement that stuff
within QGIS for data providers that do not implement this? One could
still write a QGIS expression to SQL compiler for those providers who
support these functions (Matthias Kuhn proposed that).

We will need these aggregate functions in forms for information purposes
and in the print composer for generating reports. Having to write
Postgis-Views would work for database pros. But let's not forget that
not all of our users are database professionals and not all of the users
have their data in Postgis.

Personally, I don't care where these aggregate functions are implemented
and executed, as long as they are easy to use and do not enforce the
usage of a specific data format.

Andreas


Am 27.05.2014 09:21, schrieb Vincent Picavet:

> Hello Andreas,
>
> Le mardi 27 mai 2014 10:28:13, Andreas Neumann a écrit :
>> Hi,
>>
>> We would definitely need this functionality for our application modules
>> - we need functions like "min", "max", "mean", "sum", "average" to work
>> on 1:n relations.
>>
>> Now that we have relations in QGIS, I think that aggregate functions are
>> a logical next step. Something to seriously consider in 2.6.
>
> As already stated before, I am worrying about current developments around a
> lot of features looking like database features :
> * Table joins
> * relations
> * "SQL-like" processing
> * .. including aggregates
>
> Implementing all these features on top of QGIS seems reinventing the wheel,
> and database wheels are particularly hard to design and implement well.
> I really think we should re-study the global design of such features to have a
> clear and clean strategy instead of stacking features on features, lacking
> coherency.
>
> As stated by Regis, basing this work on top of SQLite may be the best option,
> but more study has to be done and a general agreement is needed to go this
> way.
>
> Vincent
>
>>
>> Andreas
>>
>> Am 23.05.2014 17:34, schrieb G. Allegri:
>>> Recently I had to calculate the relative dimension of a polygon relative
>>> to the others of the same class. I know there are a lot of way to do it,
>>> inside and outside QGIS, but I wondered if field calculator (and
>>> expressions in general) couls be extended to accompish this kind of
>>> tasks.
>>>
>>> The approach would be something similar WINDOW functions in Postgresql
>>> [1], where for each record a new value will be calculated on the basis
>>> of other records (filtered or not).
>>>
>>> Has this ever been discussed? Is it something that could fit QGIS
>>> expressions?
>>>
>>> giovanni
>>>
>>> [1] http://www.postgresql.org/docs/9.1/static/tutorial-window.html
>>>
>>>
>>>
>>> _______________________________________________
>>> Qgis-developer mailing list
>>> [hidden email]
>>> http://lists.osgeo.org/mailman/listinfo/qgis-developer
>>
>> _______________________________________________
>> Qgis-developer mailing list
>> [hidden email]
>> http://lists.osgeo.org/mailman/listinfo/qgis-developer

_______________________________________________
Qgis-developer mailing list
[hidden email]
http://lists.osgeo.org/mailman/listinfo/qgis-developer
Reply | Threaded
Open this post in threaded view
|

Re: have aggregate/window expressions ever been discussed?

giohappy
We will need these aggregate functions in forms for information purposes
and in the print composer for generating reports. Having to write
Postgis-Views would work for database pros. But let's not forget that
not all of our users are database professionals and not all of the users
have their data in Postgis.

Personally, I don't care where these aggregate functions are implemented
and executed, as long as they are easy to use and do not enforce the
usage of a specific data format.

I totally agree Andreas.
Whatever the way it would be implemented, I hope it won't become a blank textarea where to write some "special" SQL.
I imagine something integrated into the rest of expressions GUI.
 

Andreas


Am <a href="tel:27.05.2014%2009" value="+12705201409">27.05.2014 09:21, schrieb Vincent Picavet:
> Hello Andreas,
>
> Le mardi 27 mai 2014 10:28:13, Andreas Neumann a écrit :
>> Hi,
>>
>> We would definitely need this functionality for our application modules
>> - we need functions like "min", "max", "mean", "sum", "average" to work
>> on 1:n relations.
>>
>> Now that we have relations in QGIS, I think that aggregate functions are
>> a logical next step. Something to seriously consider in 2.6.
>
> As already stated before, I am worrying about current developments around a
> lot of features looking like database features :
> * Table joins
> * relations
> * "SQL-like" processing
> * .. including aggregates
>
> Implementing all these features on top of QGIS seems reinventing the wheel,
> and database wheels are particularly hard to design and implement well.
> I really think we should re-study the global design of such features to have a
> clear and clean strategy instead of stacking features on features, lacking
> coherency.
>
> As stated by Regis, basing this work on top of SQLite may be the best option,
> but more study has to be done and a general agreement is needed to go this
> way.
>
> Vincent
>
>>
>> Andreas
>>
>> Am 23.05.2014 17:34, schrieb G. Allegri:
>>> Recently I had to calculate the relative dimension of a polygon relative
>>> to the others of the same class. I know there are a lot of way to do it,
>>> inside and outside QGIS, but I wondered if field calculator (and
>>> expressions in general) couls be extended to accompish this kind of
>>> tasks.
>>>
>>> The approach would be something similar WINDOW functions in Postgresql
>>> [1], where for each record a new value will be calculated on the basis
>>> of other records (filtered or not).
>>>
>>> Has this ever been discussed? Is it something that could fit QGIS
>>> expressions?
>>>
>>> giovanni
>>>
>>> [1] http://www.postgresql.org/docs/9.1/static/tutorial-window.html
>>>
>>>
>>>
>>> _______________________________________________
>>> Qgis-developer mailing list
>>> [hidden email]
>>> http://lists.osgeo.org/mailman/listinfo/qgis-developer
>>
>> _______________________________________________
>> Qgis-developer mailing list
>> [hidden email]
>> http://lists.osgeo.org/mailman/listinfo/qgis-developer

_______________________________________________
Qgis-developer mailing list
[hidden email]
http://lists.osgeo.org/mailman/listinfo/qgis-developer



--

_______________________________________________
Qgis-developer mailing list
[hidden email]
http://lists.osgeo.org/mailman/listinfo/qgis-developer
Reply | Threaded
Open this post in threaded view
|

Re: have aggregate/window expressions ever been discussed?

Hugo Mercier
In reply to this post by Régis Haubourg
Le 27/05/2014 15:03, Régis Haubourg a écrit :

> Hugo Mercier wrote
>> Or
>>> calculated fields in view can't be explicitly cast, so QGIS should have
>>> to
>>> guess data type based on a data scan (a major unadressed issue of sqlite)
>>
>> Hmmm I wasn't aware of this limitation in SQLITE views :(
>
> Yes, SQLITE does dynamic typing, so user or provider has to scan values to
> guess the right type.
> Here is a sqlite topic on that [0]
> And here my initial post in qgis list [1]
>
> [0]
> http://sqlite.1065341.n5.nabble.com/Computed-columns-in-VIEWs-return-NULL-but-should-be-able-to-be-typed-Any-ideas-td56769.html#a56770
>
> [1]
> http://osgeo-org.1560.x6.nabble.com/Spatialite-can-t-type-fields-of-a-view-td5058436.html
>

Ok. Strange that type "afinity" is available for tables but not for views.

Anyway, the point here is to use SQLite and its virtual table feature as
a way to do SQL queries on any QGIS layer source.
In that particular case, original types of columns can be accessed
directly by the function used to expose QGIS layers as SQLite tables.

And if we need views on top of that, we can find some proprietary
representation, either on one side or the other (QGIS or a particular
table in SQLite).

Offering SQL machinery to any QGIS data source can be done by using
SQLite internally. And if we need to add some special constraints on the
SQLite schema for that, then it's not perfect, but it is not a big deal.
We would then have QGIS-enabled SQLite files (as Spatialite geometry
tables must have a correct row in geometry_columns and so on)

In the other hand, being able to use any features of any SQLite file as
a valid QGIS data source is a more difficult problem, I think.

_______________________________________________
Qgis-developer mailing list
[hidden email]
http://lists.osgeo.org/mailman/listinfo/qgis-developer
Reply | Threaded
Open this post in threaded view
|

Re: have aggregate/window expressions ever been discussed?

Hugo Mercier
In reply to this post by giohappy
Le 27/05/2014 16:09, G. Allegri a écrit :

>     We will need these aggregate functions in forms for information purposes
>     and in the print composer for generating reports. Having to write
>     Postgis-Views would work for database pros. But let's not forget that
>     not all of our users are database professionals and not all of the users
>     have their data in Postgis.
>
>     Personally, I don't care where these aggregate functions are implemented
>     and executed, as long as they are easy to use and do not enforce the
>     usage of a specific data format.
>
>
> I totally agree Andreas.
> Whatever the way it would be implemented, I hope it won't become a blank
> textarea where to write some "special" SQL.
> I imagine something integrated into the rest of expressions GUI.
>  

Exactly.

But we have to discuss the technical details :)
What we've been building so far is some in-house implementation of a
partial database engine. And when we want to add more new powerful
features, then ... what we have is too limited.
So the idea is to use a "real" database engine to deal with everything
that is related to a database AND to be transparent for most of the
users. So yes, the objective, in my opinion is to keep existing GUI
parts and always have end-users in mind when adding a new feature (like
aggregation).

So yeah, the challenge is how to use SQLite as the underlying engine,
when the user inputs QgsExpressions or don't want to have anything to do
with SQLite (part of the answer should be to have an expression to SQL
converter, indeed)

_______________________________________________
Qgis-developer mailing list
[hidden email]
http://lists.osgeo.org/mailman/listinfo/qgis-developer
Reply | Threaded
Open this post in threaded view
|

Re: have aggregate/window expressions ever been discussed?

Matthias Kuhn
Hi All,

As the responsible person for QGIS relations, I feel obliged to share my
thoughts in this discussion.

First of all a few notes about QGIS expressions and their current state:
QGIS expressions are a nice and handy functionality to quickly calculate
values based on a single feature in QGIS. This serves a lot of use-cases
quick and easy. However, in their current state, they don't allow to do
exactly what this thread originates from: aggregate and join data from
other layers or use subqueries.

I like the idea of using sqlites virtual tables to have immediate access
to a huge base of functionality offered by sqlite to do complex queries.
If we introduce this support, we have immediate support for a wide range
of database functionality with a few lines of code. The only thing I am
not sure (and I don't know the virtual tables implementation enough to
answer this question) is if it is able to delegate cross-table queries
to the original database. In short: can I do a request that requires
data from different tables of the same database and have it executed
directly inside the database? My suspicion is no, because the sqlite
virtual tables will be known to sqlite as QGIS tables and it will still
query the tables through the QGIS provider, therefore calculating e.g. a
max functionality by querying the QGIS provider for all features and
then calculating the "max" locally and not on the server side. This
would be a major performance impact for customers having a single
database that could do this calculation for us instead of doing this
ourselves.

Instead, if we have QGIS expressions (or a QGIS query implementation on
top of it) support for this, we are able to tell, if different tables
are from the same database and therefore if we are able to delegate the
whole join/aggregate/subquery whatever job to the database and let the
database do what it's good at.

Therefore my question to the folks who know the sqlite virtual table
code: is it possible to have sqlite virtual tables forward cross table
queries to the database itself? Or is it possible to get access to the
parsed query tree (or whatever the name of that may be) to determine
based on QGIS side based on the parsed query if we are able to optimize
by forwarding to the database.

Concerning joins, merging this code with the relations seems a viable
option for me in the long run. Currently, relations do no caching and
do not "hard join" the data on the other table (meaning, joined fields
are not available as fields) and maybe there are other things missing
(I remember a thread here on the ML about this before)

Kind regards,
Matthias
_______________________________________________
Qgis-developer mailing list
[hidden email]
http://lists.osgeo.org/mailman/listinfo/qgis-developer
Reply | Threaded
Open this post in threaded view
|

Re: have aggregate/window expressions ever been discussed?

Hugo Mercier
Le 28/05/2014 08:35, Matthias Kuhn a écrit :

> Hi All,
>
> As the responsible person for QGIS relations, I feel obliged to share my
> thoughts in this discussion.
>
> First of all a few notes about QGIS expressions and their current state:
> QGIS expressions are a nice and handy functionality to quickly calculate
> values based on a single feature in QGIS. This serves a lot of use-cases
> quick and easy. However, in their current state, they don't allow to do
> exactly what this thread originates from: aggregate and join data from
> other layers or use subqueries.

Yes exactly. We have more or less the WHERE clause with expressions.
Aggregates, joins and subqueries could be built using something else
(relations?)

>
> I like the idea of using sqlites virtual tables to have immediate access
> to a huge base of functionality offered by sqlite to do complex queries.
> If we introduce this support, we have immediate support for a wide range
> of database functionality with a few lines of code. The only thing I am
> not sure (and I don't know the virtual tables implementation enough to
> answer this question) is if it is able to delegate cross-table queries
> to the original database. In short: can I do a request that requires
> data from different tables of the same database and have it executed
> directly inside the database? My suspicion is no, because the sqlite
> virtual tables will be known to sqlite as QGIS tables and it will still
> query the tables through the QGIS provider, therefore calculating e.g. a
> max functionality by querying the QGIS provider for all features and
> then calculating the "max" locally and not on the server side. This
> would be a major performance impact for customers having a single
> database that could do this calculation for us instead of doing this
> ourselves.
>
> Instead, if we have QGIS expressions (or a QGIS query implementation on
> top of it) support for this, we are able to tell, if different tables
> are from the same database and therefore if we are able to delegate the
> whole join/aggregate/subquery whatever job to the database and let the
> database do what it's good at.
>
> Therefore my question to the folks who know the sqlite virtual table
> code: is it possible to have sqlite virtual tables forward cross table
> queries to the database itself? Or is it possible to get access to the
> parsed query tree (or whatever the name of that may be) to determine
> based on QGIS side based on the parsed query if we are able to optimize
> by forwarding to the database.
>

AFAIK, you do not have direct access to the query sent to a virtual
table. But the SQLite engine will ask your virtual table driver what is
the best way to resolve a constraint on columns.
For instance, if the original query has a WHERE 'a = 2', then the driver
will be asked if it knows how to quickly resolve this, using indexes
(have a look at xBestIndex at http://www.sqlite.org/vtab.html)
So I guess, you could use the remote database indexes.
But, yes that would still be not very efficient if you want to query two
tables of the same postgis database.

In my mind, SQLite virtual tables are interesting for offering a more or
less relational view on data sources that are not originally designed
for that. But it would be suboptimal for already powerful databases.

We then have two kinds of "database backends" : real databases (postgis
and ... what else ?), and pseudo-databases through SQLite virtual tables.

And we would have a conversion from QgsExpression + relations to
different dialects of SQL (SQLite / PostgreSQL / ...) in each provider ?
Does it make sense for you ?

> Concerning joins, merging this code with the relations seems a viable
> option for me in the long run. Currently, relations do no caching and
> do not "hard join" the data on the other table (meaning, joined fields
> are not available as fields) and maybe there are other things missing
> (I remember a thread here on the ML about this before)
>
Yes in my opinion, we should try to have the same code for what we call
"joins" and more general relations. And on the GUI part, some
redundancies between these two concepts may have to be reduced as well.

_______________________________________________
Qgis-developer mailing list
[hidden email]
http://lists.osgeo.org/mailman/listinfo/qgis-developer
Reply | Threaded
Open this post in threaded view
|

Re: have aggregate/window expressions ever been discussed?

Hugo Mercier
In reply to this post by Régis Haubourg
Le 27/05/2014 15:03, Régis Haubourg a écrit :

> Hugo Mercier wrote
>> Or
>>> calculated fields in view can't be explicitly cast, so QGIS should have
>>> to
>>> guess data type based on a data scan (a major unadressed issue of sqlite)
>>
>> Hmmm I wasn't aware of this limitation in SQLITE views :(
>
> Yes, SQLITE does dynamic typing, so user or provider has to scan values to
> guess the right type.
> Here is a sqlite topic on that [0]
> And here my initial post in qgis list [1]
>
> [0]
> http://sqlite.1065341.n5.nabble.com/Computed-columns-in-VIEWs-return-NULL-but-should-be-able-to-be-typed-Any-ideas-td56769.html#a56770
>
> [1]
> http://osgeo-org.1560.x6.nabble.com/Spatialite-can-t-type-fields-of-a-view-td5058436.html

Hi,
Still on this topic. I made a few tests with sqlite views.
Indeed you cannot enforce a particular column type, but :
* you can convert values with CAST (casting 'foobar' to integer gives
0), even if the resulting type is undefined
* QGIS does not seem to have any particular problem with untyped SQLite
columns, they will be reported as TEXT (QString) in the layer properties
* Even if the column is untyped, each value has its own type. So if a
column results from a CAST, to integer say, then the corresponding
attribute will be in a QVariant typed as an integer when it is fetched.
You can then use this value in a QgsExpression or a categorized
symbology as an integer.

Am I missing other use cases where not having a column type is a problem ?
_______________________________________________
Qgis-developer mailing list
[hidden email]
http://lists.osgeo.org/mailman/listinfo/qgis-developer
Reply | Threaded
Open this post in threaded view
|

Re: have aggregate/window expressions ever been discussed?

Régis Haubourg
Hugo Mercier wrote
Le 27/05/2014 15:03, Régis Haubourg a écrit :
> Hugo Mercier wrote
>> Or
>>> calculated fields in view can't be explicitly cast, so QGIS should have
>>> to
>>> guess data type based on a data scan (a major unadressed issue of sqlite)
>>
>> Hmmm I wasn't aware of this limitation in SQLITE views :(
>
> Yes, SQLITE does dynamic typing, so user or provider has to scan values to
> guess the right type.
> Here is a sqlite topic on that [0]
> And here my initial post in qgis list [1]
>
> [0]
> http://sqlite.1065341.n5.nabble.com/Computed-columns-in-VIEWs-return-NULL-but-should-be-able-to-be-typed-Any-ideas-td56769.html#a56770
>
> [1]
> http://osgeo-org.1560.x6.nabble.com/Spatialite-can-t-type-fields-of-a-view-td5058436.html

Hi,
Still on this topic. I made a few tests with sqlite views.
Indeed you cannot enforce a particular column type, but :
* you can convert values with CAST (casting 'foobar' to integer gives
0), even if the resulting type is undefined
* QGIS does not seem to have any particular problem with untyped SQLite
columns, they will be reported as TEXT (QString) in the layer properties
* Even if the column is untyped, each value has its own type. So if a
column results from a CAST, to integer say, then the corresponding
attribute will be in a QVariant typed as an integer when it is fetched.
You can then use this value in a QgsExpression or a categorized
symbology as an integer.

Am I missing other use cases where not having a column type is a problem ?
_______________________________________________
Qgis-developer mailing list
[hidden email]
http://lists.osgeo.org/mailman/listinfo/qgis-developer
Good news!
I couldn't achieve that with views in QGIS 1.8, all field cast to numeric were converted in Qstrings.

Régis
Reply | Threaded
Open this post in threaded view
|

Re: have aggregate/window expressions ever been discussed?

Matthias Kuhn
In reply to this post by Hugo Mercier

On 05/28/2014 10:30 AM, Hugo Mercier wrote:

> Le 28/05/2014 08:35, Matthias Kuhn a écrit :
>> Hi All,
>>
>> As the responsible person for QGIS relations, I feel obliged to share my
>> thoughts in this discussion.
>>
>> First of all a few notes about QGIS expressions and their current state:
>> QGIS expressions are a nice and handy functionality to quickly calculate
>> values based on a single feature in QGIS. This serves a lot of use-cases
>> quick and easy. However, in their current state, they don't allow to do
>> exactly what this thread originates from: aggregate and join data from
>> other layers or use subqueries.
> Yes exactly. We have more or less the WHERE clause with expressions.
> Aggregates, joins and subqueries could be built using something else
> (relations?)
Well relations would need to be integrated into some form of syntax
first. And IMO this should be integrated nicely into the QgsExpression
system to avoid having yet another candidate for this thread. The output
of a relational query should be accepted as a node in a QgsExpression,
and it should be able to insert a QgsExpression as part of this a
relational query. So I think the best way is to extend expressions to
support iterators (and maybe single features?) as return values. But I
know that there are other opinions out there ;)

>
>> I like the idea of using sqlites virtual tables to have immediate access
>> to a huge base of functionality offered by sqlite to do complex queries.
>> If we introduce this support, we have immediate support for a wide range
>> of database functionality with a few lines of code. The only thing I am
>> not sure (and I don't know the virtual tables implementation enough to
>> answer this question) is if it is able to delegate cross-table queries
>> to the original database. In short: can I do a request that requires
>> data from different tables of the same database and have it executed
>> directly inside the database? My suspicion is no, because the sqlite
>> virtual tables will be known to sqlite as QGIS tables and it will still
>> query the tables through the QGIS provider, therefore calculating e.g. a
>> max functionality by querying the QGIS provider for all features and
>> then calculating the "max" locally and not on the server side. This
>> would be a major performance impact for customers having a single
>> database that could do this calculation for us instead of doing this
>> ourselves.
>>
>> Instead, if we have QGIS expressions (or a QGIS query implementation on
>> top of it) support for this, we are able to tell, if different tables
>> are from the same database and therefore if we are able to delegate the
>> whole join/aggregate/subquery whatever job to the database and let the
>> database do what it's good at.
>>
>> Therefore my question to the folks who know the sqlite virtual table
>> code: is it possible to have sqlite virtual tables forward cross table
>> queries to the database itself? Or is it possible to get access to the
>> parsed query tree (or whatever the name of that may be) to determine
>> based on QGIS side based on the parsed query if we are able to optimize
>> by forwarding to the database.
>>
> AFAIK, you do not have direct access to the query sent to a virtual
> table. But the SQLite engine will ask your virtual table driver what is
> the best way to resolve a constraint on columns.
> For instance, if the original query has a WHERE 'a = 2', then the driver
> will be asked if it knows how to quickly resolve this, using indexes
> (have a look at xBestIndex at http://www.sqlite.org/vtab.html)
> So I guess, you could use the remote database indexes.
> But, yes that would still be not very efficient if you want to query two
> tables of the same postgis database.
>
> In my mind, SQLite virtual tables are interesting for offering a more or
> less relational view on data sources that are not originally designed
> for that. But it would be suboptimal for already powerful databases.
>
> We then have two kinds of "database backends" : real databases (postgis
> and ... what else ?), and pseudo-databases through SQLite virtual tables.
>
> And we would have a conversion from QgsExpression + relations to
> different dialects of SQL (SQLite / PostgreSQL / ...) in each provider ?
> Does it make sense for you ?
It does make sense.

Real Databases: PostgreSQL, Oracle, MSSQL, (sqlite, native SQL through OGR)

What would you like to use virtual tables drivers for?
The idea of having it available as another provider for layer
definitions sounds good to me anyway.
But should it also be possible to use sqlite syntax wherever we have
expression syntax integrated now?
Then the UI would be "Field / Expression / SQLite"?
In terms of control over what gets executed where and to translate it to
native SQL wherever possible.
In terms of quickly available functionality, sqlite virtual tables are
most likely an easy shot. Is there a possibility to overcome the problem
of "control"? Would be nice if yes, but I don't know the people behind
sqlite.

Is it worth adding two different syntaxes that may and will confuse
users for their advantages? In terms of usability I doubt that it is a
good idea (Apple would immediately fire any CTO who proposes having this
redundancy, but then, they have tons more development power than we have
:-) )

>
>> Concerning joins, merging this code with the relations seems a viable
>> option for me in the long run. Currently, relations do no caching and
>> do not "hard join" the data on the other table (meaning, joined fields
>> are not available as fields) and maybe there are other things missing
>> (I remember a thread here on the ML about this before)
>>
> Yes in my opinion, we should try to have the same code for what we call
> "joins" and more general relations. And on the GUI part, some
> redundancies between these two concepts may have to be reduced as well.
Sure thing. My idea is to allow some more control in the relations
window. For this purpose, I think a checkbox to "hard join" some fields
(doesn't need to be all, does it?) should do the trick I think.

Then we can also replace the fields properties "Join" dialog with a
"Relation" dialog that gives some control over relations that include
the current layer, so a user doesn't always have to resort to the
project properties.
_______________________________________________
Qgis-developer mailing list
[hidden email]
http://lists.osgeo.org/mailman/listinfo/qgis-developer
12