[gdal-dev] Write overviews directly to S3

Jeremy Palmer-2
Hi All,

Is it possible to write external overviews directly to an S3 bucket? With GDAL 2.1.2 I get an error reporting that seek is not supported when writing to /vsis3/:

gdaladdo /vsis3/my-bucket/data/1000.tif 2
ERROR 6: Seek not supported on writable /vsis3 files
ERROR 1: _tiffSeekProc:Resource temporarily unavailable
ERROR 1: _tiffWriteProc:Resource temporarily unavailable
ERROR 1: /vsis3/my-bucket/data/1000.tif.ovr:Error writing TIFF header
ERROR 6: Seek not supported on writable /vsis3 files
ERROR 1: _tiffSeekProc:Resource temporarily unavailable
ERROR 1: TIFFAdvanceDirectory:/vsis3/my-bucket/data/1000.tif.ovr: Error fetching directory count
ERROR 6: Seek not supported on writable /vsis3 files
ERROR 1: _tiffSeekProc:Resource temporarily unavailable
ERROR 1: TIFFAdvanceDirectory:/vsis3/my-bucket/data/1000.tif.ovr: Error fetching directory count
ERROR 6: Seek not supported on writable /vsis3 files
ERROR 1: _tiffSeekProc:Resource temporarily unavailable
ERROR 1: TIFFLinkDirectory:Error fetching directory count
ERROR 1: Only read-only mode is supported for /vsicurl
Overview building failed.

I have both read and write access to this bucket and gdalinfo works on the source S3 file.

Cheers
Jeremy

This message contains information, which may be in confidence and may be subject to legal privilege. If you are not the intended recipient, you must not peruse, use, disseminate, distribute or copy this message. If you have received this message in error, please notify us immediately (Phone 0800 665 463 or [hidden email]) and destroy the original message. LINZ accepts no responsibility for changes to this email, or for any attachments, after its transmission from LINZ. Thank You.
_______________________________________________
gdal-dev mailing list
[hidden email]
https://lists.osgeo.org/mailman/listinfo/gdal-dev

Re: Write overviews directly to S3

Even Rouault-2

Hi Jeremy,

> Is it possible to directly write external overviews to an S3 bucket? With GDAL
> 2.1.2 I get an error reporting that seek is not supported when writing to
> vsis3:

No, /vsis3/ only supports sequential writing in files (the original use case was to generate and upload a huge CSV file on the fly). I don't have all the details in mind but random writing might not be possible given the S3 API constraints, at least with the multipart upload API which is used currently.

 

And another constraint of the current implementation is that a /vsis3/ file is either read-only or write-only, but not a mix of both, which would be needed for gdaladdo internal overviews. Perhaps external overview would work, but I'm not completely sure as creating a TIFF file might require seeking.

 

Perhaps a fully fledged read-write-update file system would be possible, but that wasn't in my initial design constraints.
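
To illustrate why an append-only stream gets in the way of creating a TIFF, here is a toy sketch in plain Python (no GDAL involved). `SequentialOnlyWriter` and `write_header_then_patch` are hypothetical names invented for this illustration, and the "format" is only a stand-in: a 4-byte offset placeholder that is patched after the payload has been written, which is the kind of seeking the message above suspects TIFF creation might require. It does not reproduce how libtiff or /vsis3/ is actually implemented.

```python
import io
import struct


class SequentialOnlyWriter:
    """Hypothetical stand-in for an upload stream that can only append."""

    def __init__(self):
        self._chunks = []
        self._pos = 0

    def write(self, data):
        self._chunks.append(bytes(data))
        self._pos += len(data)
        return len(data)

    def tell(self):
        return self._pos

    def seek(self, offset, whence=io.SEEK_SET):
        # Only a no-op seek to the current position is tolerated.
        if whence == io.SEEK_SET and offset == self._pos:
            return self._pos
        raise OSError("Seek not supported on sequential-only stream")

    def getvalue(self):
        return b"".join(self._chunks)


def write_header_then_patch(stream, payload):
    """Write a placeholder offset, the payload, then go back and patch the offset."""
    stream.write(struct.pack("<I", 0))   # placeholder for the directory offset
    stream.write(payload)
    directory_offset = stream.tell()     # where a trailing directory would start
    stream.seek(0)                       # back-patching needs random access
    stream.write(struct.pack("<I", directory_offset))


# A seekable in-memory buffer accepts the back-patch...
seekable = io.BytesIO()
write_header_then_patch(seekable, b"pixels")

# ...while the append-only stream rejects it.
try:
    write_header_then_patch(SequentialOnlyWriter(), b"pixels")
    sequential_ok = True
except OSError:
    sequential_ok = False
```

The same writer function succeeds or fails depending only on whether the target can seek, which is why building the file somewhere seekable first (local disk or /vsimem/) and uploading the finished bytes avoids the problem.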

 

Even

 

--

Spatialys - Geospatial professional services

http://www.spatialys.com



Re: Write overviews directly to S3

Jeremy Palmer-2
Hi Even,

On 18/05/2017, at 9:12 PM, Even Rouault <[hidden email]> wrote:
>> Is it possible to directly write external overviews to an S3 bucket? With GDAL
>> 2.1.2 I get an error reporting that seek is not supported when writing to
>> vsis3:
>
> No, /vsis3/ only supports sequential writing in files (the original use case was to generate and upload a huge CSV file on the fly). I don't have all the details in mind but random writing might not be possible given the S3 API constraints, at least with the multipart upload API which is used currently.

OK, thanks for clarifying the situation.

> And another constraint of the current implementation is that a /vsis3/ file is either read-only or write-only, but not a mix of both, which would be needed for gdaladdo internal overviews. Perhaps external overview would work, but I'm not completely sure as creating a TIFF file might require seeking.
>
> Perhaps a fully fledged read-write-update file system would be possible, but that wasn't in my initial design constraints.

For now we will work around the issue.

Thanks for your help.

Cheers,
Jeremy




Re: Write overviews directly to S3

Peter Schmitt
Hi Jeremy,

We have come up with one technique to read/write directly to/from s3 using /vsis3/ and a simple file writer class a colleague wrote: https://gist.github.com/pedros007/55c6e33224596fb4d8e9e6b68b24ed9b  In fact, I used this last week to add internal overviews to some of our images in S3.  Here's a high-level overview of my Python implementation using gdal-2.2.0:

import boto3
from osgeo import gdal
import gdalutil  # our own module holding the SimpleVSIMEMFile class from the gist above


class BuildOverviewsError(Exception):
    pass


# Copy the image from S3 into a /vsimem/ file.
# Note that gdal.Translate takes the destination first, then the source.
vsi_file = '/vsimem/image.tif'
ds = gdal.Translate(vsi_file, '/vsis3/bucket/prefix.tif')

# Add overviews
err = ds.BuildOverviews('AVERAGE', [2, 4, 8, 16, 32, 64])
if err != 0:
    raise BuildOverviewsError('Failed to build overviews for %s' % vsi_file)
# Some overview levels wouldn't get written unless we flush & close the data set.
ds.FlushCache()
ds = None
# Even though the dataset is gone, the vsi_file still exists in memory.

# Write the data back to s3 using a class my colleague wrote.
vsimem_file = None
try:
    vsimem_file = gdalutil.SimpleVSIMEMFile(vsi_file)
    s3 = boto3.resource('s3')
    obj = s3.Object(bucket_name='bucket', key='prefix.tif')
    # the Object#put API needs a file-like object: http://boto3.readthedocs.io/en/latest/reference/services/s3.html#S3.Object.put
    # SimpleVSIMEMFile implements a file-like object that uses the VSI API to read a /vsimem/ file.
    obj.put(Body=vsimem_file)
finally:
    vsimem_file = None

# Remove the /vsimem/ file. (you probably want this in a try/finally block to ensure it gets deleted)
gdal.Unlink(vsi_file)

The above process will be memory constrained.  Make sure your instance is appropriately sized!  You probably want to run this on an EC2 instance in the same region as the data sitting in s3.
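
The try/finally suggestion in the last comment above could be packaged as a small context manager. This is a minimal sketch, not part of the gist: `temporary_vsimem` and its `unlink` parameter are hypothetical names made up for this example. By default it assumes the GDAL Python bindings are importable and uses `gdal.Unlink`; the parameter exists only so that a different cleanup function can be supplied.

```python
import uuid
from contextlib import contextmanager


@contextmanager
def temporary_vsimem(suffix=".tif", unlink=None):
    """Yield a unique /vsimem/ path and remove it on exit, even on error."""
    if unlink is None:
        # Assumption: the GDAL Python bindings are installed.
        from osgeo import gdal
        unlink = gdal.Unlink
    path = "/vsimem/%s%s" % (uuid.uuid4().hex, suffix)
    try:
        yield path
    finally:
        # The return value is ignored: the file may never have been created.
        unlink(path)
```

Wrapping the Translate, BuildOverviews and upload steps in `with temporary_vsimem() as vsi_file:` would release the in-memory file even if the upload raises, and the random name avoids collisions when several images are processed in one process.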


Even - I found methods to check for VSI errors:
https://gist.github.com/pedros007/55c6e33224596fb4d8e9e6b68b24ed9b#file-simplevsimemfile-py-L73-L74

Are these intended for public consumption?  When reading from a /vsimem/foo.shp with gdal-2.2.0, gdal.VSIGetLastErrorMsg() reported an error "No such file or directory".  The VSI reader actually works regardless of whether it's a /vsimem/ file or a local path.  A colleague used a local path and the error reported was a little more verbose: it printed something like "/mnt/data/foo.sbn: No such file or directory".  I did not see that error with gdal-2.1.3.  We don't need the sbn file (we're building a qix file instead), so I was surprised that the VSI system flagged this as an error.  It seems like it should be more of a warning.

My motivation for adding error checking: sometimes the command-line gdal_translate with /vsis3/ paths would yield IFD read errors (I don't have the exact error message handy).  Repeating the command would result in a successful translate.  I always assumed it was some transient packet loss/network error.

Cheers,
Pete


Re: Write overviews directly to S3

Even Rouault-2

 

> Even - I found methods to check for VSI errors:
> https://gist.github.com/pedros007/55c6e33224596fb4d8e9e6b68b24ed9b#file-simplevsimemfile-py-L73-L74
>
> Are these intended for public consumption?

 

With care...

 

> When reading from a /vsimem/foo.shp with gdal-2.2.0, gdal.VSIGetLastErrorMsg()
> reported an error "No such file or directory".

 

Probably because the shapefile driver probed a side-car file that didn't exist (the .sbn you mention below) but wasn't required. VSIGetLastErrorMsg() is mostly for internal use by GDALOpen() (at least it was designed for that). Users will normally call gdal.GetLastErrorMsg() or gdal.UseExceptions() to have exceptions for user-visible errors.

 

Rule of thumb:

- gdal.VSIGetLastErrorMsg() can be used if you use a low-level API like gdal.VSIFOpenL(), which isn't normally verbose.

- For higher-level calls like gdal.Open(), use gdal.GetLastErrorMsg() or gdal.UseExceptions().
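
The low-level half of this rule of thumb might look like the sketch below. The function name `vsi_open_error` and the `gdal_module` parameter are hypothetical, introduced only for this example (the parameter lets a stand-in object be passed instead of the real bindings). The `except RuntimeError` branch is an assumption: some versions of the bindings may raise from `VSIFOpenL()` once `gdal.UseExceptions()` is active instead of returning `None`.

```python
def vsi_open_error(path, gdal_module=None):
    """Return an error string if `path` cannot be opened via the VSI layer, else None."""
    if gdal_module is None:
        # Assumption: the GDAL Python bindings are installed.
        from osgeo import gdal as gdal_module
    try:
        handle = gdal_module.VSIFOpenL(path, "rb")
    except RuntimeError as exc:
        # Assumed behaviour of some bindings when exceptions are enabled.
        return str(exc)
    if handle is None:
        # Low-level calls are quiet, so ask the VSI layer what went wrong.
        return gdal_module.VSIGetLastErrorMsg() or "unknown VSI error"
    gdal_module.VSIFCloseL(handle)
    return None
```

For high-level calls such as `gdal.Open()`, this helper is not the right tool: per the rule above, `gdal.UseExceptions()` or `gdal.GetLastErrorMsg()` reports the user-visible error, without the noise from optional side-car files that were probed and not found.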

 


Re: Write overviews directly to S3

Jeremy Palmer-2
In reply to this post by Peter Schmitt
Hi Peter,

On 19/05/2017, at 4:38 AM, Peter Schmitt <[hidden email]> wrote:

> We have come up with one technique to read/write directly to/from s3 using /vsis3/ and a simple file writer class a colleague wrote: https://gist.github.com/pedros007/55c6e33224596fb4d8e9e6b68b24ed9b  In fact, I used this last week to add internal overviews to some of our images in S3.  Here's a high-level overview of my Python implementation using gdal-2.2.0:

Thanks! We will look at this.

Cheers
Jeremy


