[gdal-dev] Gdal2tiles gains parallel processing features


Grégory Bataille
Hi all,

I just wanted to announce that after a few months of work (it took a long time, I got lazy), gdal2tiles has gained parallel computing abilities.

It is now on trunk.

A few things to know:
- I took it upon myself to rewrite the script almost entirely to make it more modular, testable, ...
- Because the rewrite + the parallelization are a big and risky piece of work (since there were really no tests), there is a new gdal2tiles_old.py script on trunk to provide an easy "back-out" for people who run into trouble in production
- The script continues to work in single thread/process mode by default
- Actually, if you choose the default (or explicitly ask for a single process), the script will not use any Python multiprocessing library (these have been known to be flaky). That should ensure that the default behavior is not disturbed.
- To activate parallel processing, you need to pass a new flag --processes=n
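
For illustration only, here is a minimal sketch of the pattern described in the last three points: a plain loop when a single process is requested, and a worker pool otherwise. The names render_tile and render_all are hypothetical and are not taken from gdal2tiles.py; the per-tile work is replaced by a trivial stand-in.

```python
from multiprocessing import Pool


def render_tile(job):
    """Hypothetical stand-in for the per-tile work; returns a tile path."""
    zoom, x, y = job
    return "{}/{}/{}.png".format(zoom, x, y)


def render_all(jobs, processes=1):
    """Render every job, touching multiprocessing only when processes > 1."""
    if processes <= 1:
        # Default path: a plain loop, no multiprocessing machinery involved.
        return [render_tile(job) for job in jobs]
    # Parallel path: distribute the jobs over a pool of worker processes.
    pool = Pool(processes)
    try:
        return pool.map(render_tile, jobs)
    finally:
        pool.close()
        pool.join()
```

Since Pool.map preserves input order, render_all(jobs) and render_all(jobs, processes=2) should return the same list. On platforms that spawn rather than fork worker processes, the parallel call needs to sit under an `if __name__ == "__main__":` guard.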

Risks:
- It is a major rewrite, but the algorithms have barely been touched; only the code organization/modularity has changed.
- The new version is running in my company's production. It's mostly one use case, but we are tiling ~1000 GeoTIFFs (in any SRS), in RGB or grayscale, to a Mercator output. It uses Python 2.7 (this is actually a mistake and should move to Python 3.5 next week). That use case at least works well.


I'm open to remarks if you have any.

Cheers

---
Gregory Bataille

_______________________________________________
gdal-dev mailing list
[hidden email]
https://lists.osgeo.org/mailman/listinfo/gdal-dev

Re: Gdal2tiles gains parallel processing features

Even Rouault-2

On Friday 29 September 2017 17:50:11 CEST Grégory Bataille wrote:
> Hi all,
>
> I just wanted to announce that after a few months of work (took long, I got
> lazy), *gdal2tiles has gained parallel computing abilities*
>
> It is now *on trunk*.

Thanks for your great work on this

> - Because the rewrite + the parallelization are a big and risky work (since
> there were no tests really), there is a new gdal2tiles_old.py script on
> trunk to provide an easy "back-out" for people who would get into trouble
> in their production

Note: I actually removed gdal2tiles_old.py when merging your work back into SVN (I should have told you before). It was the same as the version you can find in the 2.2 branch, so one could still grab it from there if really needed.

Even

--
Spatialys - Geospatial professional services
http://www.spatialys.com



Re: Gdal2tiles gains parallel processing features

Grégory Bataille
Oh... OK for gdal2tiles_old.py.
I just wanted to make a potential rollback "easy".

Anyway, that's good enough for me :)

Thanks


---
Gregory Bataille


Re: Gdal2tiles gains parallel processing features

Angelos Tzotsos
In reply to this post by Even Rouault-2
Thank you for this feature!

Best,
Angelos



-- 
Angelos Tzotsos, PhD
Charter Member
Open Source Geospatial Foundation
http://users.ntua.gr/tzotsos


Re: Gdal2tiles gains parallel processing features

Grégory Bataille
In reply to this post by Grégory Bataille
Hey Jeremy,

I'm afraid I don't know anything about GeoPackage or tile grid profiles.
What I can say:
- the new version only brings parallelism to the script
- because of the rewrite, I would hope that any such change is now slightly easier to make.

Cheers


---
Gregory Bataille

On Fri, Sep 29, 2017 at 9:24 PM, Jeremy Palmer <[hidden email]> wrote:
Thank you for these great improvements Grégory.

I'm wondering whether gdal2tiles now supports output to GeoPackage, to save on storage of tiles on disk. Also, how hard would it be to support tile grid profiles with custom origins and resolutions in the new architecture?

Cheers
Jeremy


Re: Gdal2tiles gains parallel processing features

Even Rouault-2

Jeremy,

From a quick look, both should be achievable:

- GeoPackage output: I guess you ask because of multiprocessing? Otherwise a plain gdal_translate will do ;-) Multiprocessing raises the interesting question of write concurrency with sqlite3. So I think "manual" writing with the low-level sqlite3 API would be needed to add tiles (the GPKG driver could be used to create the structure of the DB). Most likely the overhead of resampling/reprojection/creation of the PNG/JPEG blobs is much larger than that of writing to the DB, so hopefully locking shouldn't be too much of a bottleneck.

- Custom tiling schemes: should be doable too. If you need a tiling scheme with a discontinuity at the antimeridian ;-), that'd probably be a bit more difficult (but that's an issue that must already exist with the existing GlobalMercator or GlobalGeodetic profiles).
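
To make the custom tiling scheme idea a bit more concrete, here is a rough sketch of the arithmetic a grid with a custom origin and resolutions would need. It is not an existing gdal2tiles class: CustomTileGrid and its methods are hypothetical, and it assumes a top-left origin with rows growing downward and one resolution (map units per pixel) per zoom level.

```python
import math


class CustomTileGrid(object):
    """Hypothetical tile grid: top-left origin, one resolution per zoom level."""

    def __init__(self, origin_x, origin_y, resolutions, tile_size=256):
        self.origin_x = origin_x
        self.origin_y = origin_y
        self.resolutions = resolutions  # map units per pixel, indexed by zoom
        self.tile_size = tile_size

    def tile_for_point(self, x, y, zoom):
        """Return (column, row) of the tile containing map coordinate (x, y)."""
        span = self.resolutions[zoom] * self.tile_size  # tile width in map units
        column = int(math.floor((x - self.origin_x) / span))
        row = int(math.floor((self.origin_y - y) / span))  # rows grow downward
        return column, row

    def tile_bounds(self, column, row, zoom):
        """Return (min_x, min_y, max_x, max_y) of a tile in map units."""
        span = self.resolutions[zoom] * self.tile_size
        min_x = self.origin_x + column * span
        max_y = self.origin_y - row * span
        return min_x, max_y - span, min_x + span, max_y
```

For example, with CustomTileGrid(0.0, 1024.0, [4.0, 2.0, 1.0]), each zoom 2 tile covers 256 map units, so the point (300, 900) falls in column 1, row 0. A grid that wraps at the antimeridian would need extra handling that this sketch does not attempt.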

 

 

Grégory,

GeoPackage is a SQLite3-based container for vector and raster layers. For rasters, it consists of tiles organized in a pyramid structure, with the tiles stored as PNG or JPEG. So basically it is "just" a matter of writing the gdal2tiles output files as records in a database.

See http://www.geopackage.org/spec/ and
http://gdal.org/drv_geopackage_raster.html
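
As a rough illustration of "writing tiles as records", the sketch below inserts encoded tile blobs into a SQLite table using Python's sqlite3 module. The column names follow the tile pyramid table layout of the GeoPackage specification, but this is only an assumption-laden sketch: open_tile_store and write_tiles are hypothetical helpers, the table name "tiles" is made up, and a valid GeoPackage also requires metadata tables (gpkg_contents, gpkg_tile_matrix_set, gpkg_tile_matrix, ...) that are not created here and could be set up by the GPKG driver instead.

```python
import sqlite3


def open_tile_store(path, table="tiles"):
    """Open a SQLite file and create a tile table if it is missing."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS {} ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "zoom_level INTEGER NOT NULL, "
        "tile_column INTEGER NOT NULL, "
        "tile_row INTEGER NOT NULL, "
        "tile_data BLOB NOT NULL, "
        "UNIQUE (zoom_level, tile_column, tile_row))".format(table)
    )
    return conn


def write_tiles(conn, tiles, table="tiles"):
    """Insert (zoom, column, row, image_bytes) tuples in one transaction."""
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT OR REPLACE INTO {} "
            "(zoom_level, tile_column, tile_row, tile_data) "
            "VALUES (?, ?, ?, ?)".format(table),
            [(z, c, r, sqlite3.Binary(data)) for z, c, r, data in tiles],
        )
```

One way to sidestep the write concurrency question would be to let the worker processes only produce the PNG/JPEG bytes and have a single process call write_tiles in batches; whether that is fast enough in practice would need measuring.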

 

 

Even

 

--
Spatialys - Geospatial professional services
http://www.spatialys.com

