Common SQLite-based dictionaries

Howard Butler-3
All,

libgeotiff and GDAL both use a system of bespoke CSV files for their coordinate system dictionaries. proj.4 uses derived dictionaries made from GDAL's. Each is a slightly different subset and/or mix of the EPSG db along with other catalogs, customizations, and overrides. The situation is messy, fragile, and incomplete, especially for folks like me who are interested in supporting ever more complex systems of horizontal and vertical datums, the time epochs associated with them, and direct transformations.

There have been multiple attempts to build a C tribe API that handles the coordinate system description problem, but all have failed for various reasons. The one true library to rule them all is probably a pipe dream, but maybe it is possible to collaborate in a slightly messier way -- at the dictionary level.

One significant technology that was not widely available when GDAL, proj.4, and libgeotiff all originated is SQLite. The idea of a single file, sql'able database is a standard assumption in today's software, especially in things like HTML5 (wars between WebSQL and IndexedDB), just about every significant phone application, and your favorite OGC super format [1].

I'd like to propose an attempt to standardize the GDAL, proj.4, and libgeotiff SRS coordinate system handling dictionaries on a SQLite database that starts with EPSG, with each project adding its own auxiliary tables as necessary. I am writing this message to MetaCRS to see if there is support for such an effort, and to determine if there are other related projects who would like to collaborate at this level.
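A minimal sketch of the "shared base plus per-project auxiliary tables" layout, using Python's standard sqlite3 module. The table names (epsg_crs, gdal_override), the columns, and the override value are all hypothetical and chosen only for illustration; nothing here is an agreed schema.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a shared table and one auxiliary table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        -- Shared base table (hypothetical name and columns).
        CREATE TABLE epsg_crs (
            code  INTEGER PRIMARY KEY,
            name  TEXT NOT NULL,
            proj4 TEXT NOT NULL
        );
        -- Auxiliary table owned by a single project (hypothetical).
        CREATE TABLE gdal_override (
            code  INTEGER PRIMARY KEY REFERENCES epsg_crs(code),
            proj4 TEXT NOT NULL
        );
    """)
    conn.executemany(
        "INSERT INTO epsg_crs VALUES (?, ?, ?)",
        [
            (4326, "WGS 84", "+proj=longlat +datum=WGS84 +no_defs"),
            (4269, "NAD83", "+proj=longlat +datum=NAD83 +no_defs"),
        ],
    )
    # Illustrative override only; not taken from any real dictionary.
    conn.execute(
        "INSERT INTO gdal_override VALUES (?, ?)",
        (4269, "+proj=longlat +ellps=GRS80 +towgs84=0,0,0 +no_defs"),
    )
    return conn


def definition_for(conn, code):
    """Return the project override if one exists, else the shared definition."""
    row = conn.execute(
        """
        SELECT COALESCE(o.proj4, e.proj4)
        FROM epsg_crs AS e
        LEFT JOIN gdal_override AS o ON o.code = e.code
        WHERE e.code = ?
        """,
        (code,),
    ).fetchone()
    return row[0] if row else None


conn = build_demo_db()
print(definition_for(conn, 4326))
print(definition_for(conn, 4269))
```

The point of the LEFT JOIN is that a project's customizations stay in its own table, so the shared EPSG-derived table never has to be edited to accommodate them.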

For the GDAL stack, the benefits of this approach are significant. Multiply-defined, potentially conflicting definitions no longer need to be resolved. The dictionaries could release on their own schedule, rather than with each individual project. Powerful new functionality would be much closer to software developers instead of hidden behind a rather opaque and fragile CSV dictionary generating process. Mundane but important details like multithreaded access get handed off to a library and project that do that stuff all the time, instead of to one-off implementations inside each individual project.

Database views/queries could be standardized for common lookups across projects. Lookups would be faster due to indexed query access. Transformation validation, based on EPSG or other databases, could be provided across all three projects. More complex topics, like those described above, could be developed in a way that has impact across all three projects without tedious per-project implementation.
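As a rough illustration of a standardized view plus an indexed lookup, here is a sketch with Python's sqlite3 module. The names crs, crs_lookup, and idx_crs_name are assumptions made up for this example, not part of any existing dictionary.

```python
import sqlite3


def open_dictionary():
    """Build a tiny in-memory dictionary with an index and a shared view."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE crs (
            code  INTEGER PRIMARY KEY,
            name  TEXT NOT NULL,
            proj4 TEXT NOT NULL
        );
        -- Index so that lookups by name do not scan the whole table.
        CREATE INDEX idx_crs_name ON crs(name);
        -- A view every project could query, whatever the table layout is.
        CREATE VIEW crs_lookup AS SELECT code, name, proj4 FROM crs;
    """)
    conn.executemany(
        "INSERT INTO crs VALUES (?, ?, ?)",
        [
            (4326, "WGS 84", "+proj=longlat +datum=WGS84 +no_defs"),
            (4269, "NAD83", "+proj=longlat +datum=NAD83 +no_defs"),
        ],
    )
    return conn


def find_by_name(conn, name):
    """Look up CRS entries by name through the shared view."""
    return conn.execute(
        "SELECT code, proj4 FROM crs_lookup WHERE name = ?", (name,)
    ).fetchall()


def name_lookup_plan(conn):
    """Return SQLite's query plan text for a by-name lookup on the table."""
    rows = conn.execute(
        "EXPLAIN QUERY PLAN SELECT code FROM crs WHERE name = ?", ("WGS 84",)
    ).fetchall()
    # The human-readable detail is the last column of each plan row.
    return " ".join(str(r[-1]) for r in rows)


conn = open_dictionary()
print(find_by_name(conn, "WGS 84"))
print(name_lookup_plan(conn))
```

The query plan output is a quick way to check that a lookup is actually served by the index rather than by a full scan.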

Consider this email a mix of

1) Is this a good idea? What other benefits do you see this approach providing?
2) Does your project want to collaborate on this?
3) Does this belong in MetaCRS?
4) What are the pitfalls that make this untenable?

windmill-tilting'ly yours,

Howard

[1] http://www.geopackage.org/
_______________________________________________
MetaCRS mailing list
[hidden email]
http://lists.osgeo.org/mailman/listinfo/metacrs

Re: Common SQLite-based dictionaries

rgreenwood
Sounds like a great idea. The build sequence for the CSV files is pretty well documented but it does not invite enhancements. It encourages people to just do their own thing for their own needs.

Do you see this as something that can be done entirely in the database? Seems like there would still be some amount of scripting to build a new release but I haven't thought it thru.

Rich

On Sun, Aug 2, 2015 at 7:15 AM, Howard Butler <[hidden email]> wrote:
> [...]



--
Richard W. Greenwood, PLS
www.greenwoodmap.com


Re: Common SQLite-based dictionaries

Howard Butler-3

> On Aug 3, 2015, at 8:03 AM, Richard Greenwood <[hidden email]> wrote:
>
> Do you see this as something that can be done entirely in the database? Seems like there would still be some amount of scripting to build a new release but I haven't thought it thru.

I presume there would be a repo with some scripts and a Makefile. The user would start by fetching the EPSG database, any necessary grid files, and the NADCON/VERTCON supporting info, and then issue 'make'. There would still have to be a process to build the thing, much like the CSV files, but if the dictionaries were managed as their own project, maybe they would attract contribution. The situation we have now rather repels it, IMO.
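One of the scripts such a Makefile might call could be as small as the sketch below, which loads CSV rows into a SQLite file using only the Python standard library. The CSV layout (code, name, proj4), the table name crs, and the sample rows are hypothetical; the real EPSG distribution is structured differently and would need its own import step.

```python
import csv
import io
import sqlite3

# Stand-in for a downloaded source file (hypothetical layout).
SAMPLE_CSV = """code,name,proj4
4326,WGS 84,+proj=longlat +datum=WGS84 +no_defs
4269,NAD83,+proj=longlat +datum=NAD83 +no_defs
"""


def build_dictionary(csv_text, db_path=":memory:"):
    """Load CSV rows into a SQLite dictionary and return the connection."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS crs (
            code  INTEGER PRIMARY KEY,
            name  TEXT NOT NULL,
            proj4 TEXT NOT NULL
        )
        """
    )
    rows = [
        (int(r["code"]), r["name"], r["proj4"])
        for r in csv.DictReader(io.StringIO(csv_text))
    ]
    # One transaction, so a failed build does not leave a half-loaded table.
    with conn:
        conn.executemany("INSERT OR REPLACE INTO crs VALUES (?, ?, ?)", rows)
    return conn


conn = build_dictionary(SAMPLE_CSV)
print(conn.execute("SELECT COUNT(*) FROM crs").fetchone()[0])
```

Passing a file path instead of ":memory:" would write the database to disk, which is the artifact a 'make' target would presumably produce.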

Howard

