[gdal-dev] GDAL VRT with Cloud Optimized GeoTIFFs and AWS


[gdal-dev] GDAL VRT with Cloud Optimized GeoTIFFs and AWS

Jeremy Palmer-3
Hi All,

Does anyone have any tips or experience serving large, multi-file RGB imagery datasets hosted on S3 to application servers/containers for bulk tile rendering? Is this possible using VRTs, and is the performance manageable compared to other mounted storage options?

Thanks,
Jeremy

_______________________________________________
gdal-dev mailing list
[hidden email]
https://lists.osgeo.org/mailman/listinfo/gdal-dev

Re: GDAL VRT with Cloud Optimized GeoTIFFs and AWS

Jeremy Palmer-3
I've now found this useful MapServer wiki page https://github.com/mapserver/mapserver/wiki/Render-images-straight-out-of-S3-with-the-vsicurl-driver. It seems to imply that it's better to merge a collection of dataset tiles into a single big GeoTIFF rather than create a VRT, due to the repeated HTTP calls. I'm still interested in people's experiences using imagery direct from S3 and its performance.
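
To make the VRT option concrete, here is a minimal Python sketch that only assembles a `gdalbuildvrt` command over `/vsis3/` paths, together with a few GDAL configuration options that are commonly set to reduce extra HTTP requests when reading remote GeoTIFFs. The bucket name, object keys, and helper names are hypothetical, and the option values are assumptions to benchmark rather than settings taken from this thread.

```python
import os
import subprocess

# GDAL configuration options often used for remote reads.
# The values below are assumptions to tune, not recommendations from the thread.
CLOUD_READ_OPTIONS = {
    # Skip listing the containing "directory" when a file is opened.
    "GDAL_DISABLE_READDIR_ON_OPEN": "EMPTY_DIR",
    # Restrict remote requests to .tif files (avoids probing for sidecar files).
    "CPL_VSIL_CURL_ALLOWED_EXTENSIONS": ".tif",
    # Enable GDAL's cache for virtual file system reads.
    "VSI_CACHE": "TRUE",
}


def vsis3_path(bucket, key):
    """Return the GDAL /vsis3/ path for an object in an S3 bucket."""
    return "/vsis3/{}/{}".format(bucket, key.lstrip("/"))


def buildvrt_command(vrt_path, bucket, keys):
    """Assemble a gdalbuildvrt argument list referencing tiles on S3."""
    return ["gdalbuildvrt", vrt_path] + [vsis3_path(bucket, k) for k in keys]


def run_with_cloud_options(command):
    """Run a GDAL command with the options above passed as environment variables."""
    env = dict(os.environ, **CLOUD_READ_OPTIONS)
    return subprocess.run(command, env=env, check=True)


if __name__ == "__main__":
    # Hypothetical bucket and keys; only prints the command, does not run it.
    cmd = buildvrt_command("mosaic.vrt", "example-bucket",
                           ["tiles/a.tif", "tiles/b.tif"])
    print(" ".join(cmd))
```

With thousands of tiles the argument list gets long; `gdalbuildvrt` also accepts `-input_file_list` with a text file of paths, which may be the more practical route.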

Note that some of my largest datasets have a total size of about 1.5 TB as uncompressed GeoTIFF, with about 8000 tiles at 175 MB per tile. I'll likely convert the uncompressed GeoTIFFs to JPEG compression, which I estimate can bring the size down to about 200 GB total.
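
As a rough check on those figures, and a sketch of what the compression step might look like, the Python below multiplies out the tile count and size from the post and assembles a `gdal_translate` command using the GTiff creation options `TILED`, `COMPRESS=JPEG`, `PHOTOMETRIC=YCBCR`, and `JPEG_QUALITY`. The quality value, file names, and function names are assumptions for illustration; the 200 GB target is the estimate from the post, not a measured result.

```python
TILE_COUNT = 8000        # from the post
TILE_SIZE_MB = 175       # from the post
TARGET_TOTAL_GB = 200    # the post's estimate after JPEG compression


def uncompressed_total_gb(tile_count=TILE_COUNT, tile_size_mb=TILE_SIZE_MB):
    """Total uncompressed size in GB (decimal units)."""
    return tile_count * tile_size_mb / 1000


def implied_compression_ratio(target_gb=TARGET_TOTAL_GB):
    """Compression ratio needed to reach the target size."""
    return uncompressed_total_gb() / target_gb


def jpeg_translate_command(src, dst, quality=85):
    """Assemble a gdal_translate call for a tiled, JPEG-compressed GeoTIFF.

    quality=85 is an assumed value; adjust it after inspecting the output.
    """
    return [
        "gdal_translate",
        "-of", "GTiff",
        "-co", "TILED=YES",
        "-co", "COMPRESS=JPEG",
        "-co", "PHOTOMETRIC=YCBCR",
        "-co", "JPEG_QUALITY={}".format(quality),
        src,
        dst,
    ]


if __name__ == "__main__":
    print(uncompressed_total_gb())        # 1400.0 (GB), close to the quoted 1.5 TB
    print(implied_compression_ratio())    # 7.0
    print(" ".join(jpeg_translate_command("tile.tif", "tile_jpeg.tif")))
```

The arithmetic gives about 1.4 TB, roughly in line with the quoted 1.5 TB, and the 200 GB target implies a compression ratio of about 7:1.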

Cheers
Jeremy

On Sat, Mar 2, 2019 at 9:15 AM Jeremy Palmer <[hidden email]> wrote:
Hi All,

Does anyone have any tips or experience serving large, multi-file RGB imagery datasets hosted on S3 to application servers/containers for bulk tile rendering? Is this possible using VRTs, and is the performance manageable compared to other mounted storage options?

Thanks,
Jeremy

Re: GDAL VRT with Cloud Optimized GeoTIFFs and AWS

rgreenwood
What are the advantages of storing the imagery in S3 as opposed to EBS? I'm using the throughput-optimized magnetic EBS, which I think is a little less expensive than S3. I did some testing a couple of years ago and didn't see enough performance gain to justify SSD. Have you tested S3 against any of the EBS options?

Rich


--
Richard W. Greenwood, PLS
www.greenwoodmap.com

Re: GDAL VRT with Cloud Optimized GeoTIFFs and AWS

Michael Smith

There are multiple advantages to using S3 over EBS. The first is availability: S3 is highly available (and replicated), whereas EBS is limited to a single volume and is much more fragile. EBS is also more expensive for storage than S3. And S3 is accessible from multiple instances, whereas an EBS volume can only be mounted on one EC2 instance.

Mike

--
Michael Smith
Remote Sensing/GIS Center
US Army Corps of Engineers

From: gdal-dev <[hidden email]> on behalf of Richard Greenwood <[hidden email]>
Date: Monday, March 4, 2019 at 9:21 AM
To: Jeremy Palmer <[hidden email]>
Cc: gdal dev <[hidden email]>
Subject: Re: [gdal-dev] GDAL VRT with Cloud Optimized GeoTIFFs and AWS

What are the advantages of storing the imagery in S3 as opposed to EBS? I'm using the throughput-optimized magnetic EBS, which I think is a little less expensive than S3. I did some testing a couple of years ago and didn't see enough performance gain to justify SSD. Have you tested S3 against any of the EBS options?

Rich

Re: GDAL VRT with Cloud Optimized GeoTIFFs and AWS

Even Rouault-2
In reply to this post by Jeremy Palmer-3
On Sunday, 3 March 2019 01:58:08 CET, Jeremy Palmer wrote:
> I've now found this useful MapServer wiki page
> https://github.com/mapserver/mapserver/wiki/Render-images-straight-out-of-S3-with-the-vsicurl-driver.
> It seems to imply that it's better to merge a collection of dataset tiles
> into a single big GeoTIFF rather than create a VRT due to the repeated HTTP
> calls. I'm still interested in people's experiences using imagery direct
> from S3 and its performance.

Jeremy,

A gigantic TIFF will have rather big TileOffsets and TileByteCounts arrays, but
probably smaller than a VRT referencing TIFF tiles. If MapServer rendering is
the final use case, I bet, though, that a MapServer tileindex (created with
gdaltindex) referencing /vsis3/ files would be rather convenient and perform
well.
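
A minimal sketch of the tileindex approach, in Python for illustration: it assembles a `gdaltindex` command whose inputs are `/vsis3/` paths and renders a MapServer raster layer that points at the resulting index. The bucket, keys, layer name, and helper names are hypothetical; `location` is the attribute name `gdaltindex` writes by default, and the mapfile fragment is a bare outline rather than a tested configuration.

```python
def vsis3_path(bucket, key):
    """Return the GDAL /vsis3/ path for an object in an S3 bucket."""
    return "/vsis3/{}/{}".format(bucket, key.lstrip("/"))


def gdaltindex_command(index_file, bucket, keys):
    """Assemble a gdaltindex call that indexes tiles stored on S3."""
    return ["gdaltindex", index_file] + [vsis3_path(bucket, k) for k in keys]


def mapserver_layer(name, index_file, tile_item="location"):
    """Render a bare MapServer raster LAYER block that uses the tile index."""
    return "\n".join([
        "LAYER",
        '  NAME "{}"'.format(name),
        "  TYPE RASTER",
        "  STATUS ON",
        '  TILEINDEX "{}"'.format(index_file),
        '  TILEITEM "{}"'.format(tile_item),
        "END",
    ])


if __name__ == "__main__":
    # Hypothetical bucket and keys; the command is printed, not executed.
    cmd = gdaltindex_command("imagery_index.shp", "example-bucket",
                             ["tiles/a.tif", "tiles/b.tif"])
    print(" ".join(cmd))
    print(mapserver_layer("imagery", "imagery_index.shp"))
```

Because the index stores each tile's footprint, MapServer only needs to open the tiles that intersect the requested extent, which is the property that makes this attractive compared with one very large VRT.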

Even

--
Spatialys - Geospatial professional services
http://www.spatialys.com