[gdal-dev] Reading remote jp2k files

[gdal-dev] Reading remote jp2k files

Matt Hanson-2
Hello,

I'm having trouble reading JP2K files remotely. I originally ran into this problem when trying to use remote Sentinel-2 files as input to the GDAL Warp API in C++. It works when I use a local version of the file.

I thought this was supported behavior, but I'm not sure. To test, I ran gdal_translate once on a remote file and once on a local file.

$ gdal_translate https://sentinel-s2-l1c.s3.amazonaws.com/tiles/32/T/QR/2018/1/21/0/B04.jp2 sentinel-test-remote.tif -f GTiff
$ gdal_translate B03.jp2 sentinel-test-local.tif -f GTiff

The local file works fine, but when using the remote file gdal_translate just crashes with no error message.

I'm using GDAL 2.3.1 with OpenJPEG 2.3 in the Docker image developmentseed/geolambda:1.0.0. You can see the Dockerfile here:

https://github.com/developmentseed/geolambda/blob/master/Dockerfile

Is this supported behavior? Should I be able to read these files remotely and do windowed reads on them?

matt


_______________________________________________
gdal-dev mailing list
[hidden email]
https://lists.osgeo.org/mailman/listinfo/gdal-dev

Re: Reading remote jp2k files

Even Rouault-2
Matt,

>
> I'm having trouble reading JP2K files remotely. I originally ran into this
> problem when trying to use remote Sentinel-2 files as input to the GDAL
> Warp API in C++. It works when I use a local version of the file.
>
> I'm not sure if this is supported behavior, I thought it was, but to test I
> tried gdal_translate on a remote file, and one with the local file.
>
> $ gdal_translate
> https://sentinel-s2-l1c.s3.amazonaws.com/tiles/32/T/QR/2018/1/21/0/B04.jp2
> sentinel-test-remote.tif -f GTiff
> $ gdal_translate B03.jp2 sentinel-test-local.tif -f GTiff
>
> The local file works fine, but when using the remote file gdal_translate
> just crashes with no error message.

A process crash? Can you get a stack trace or valgrind output?

I tried it, and it failed with a regular error rather than a crash. When you
use an http:// URL, the HTTP driver is triggered. This driver downloads the
whole file into memory before passing it to the real driver, and it deleted the
temp file, whereas the JP2OpenJPEG driver needs to be able to re-open it. I
have just fixed that issue.

>
> I'm using GDAL 2.3.1 with openjpeg 2.3, I'm using a docker image
> developmentseed/geolambda:1.0.0 and you can see the Dockerfile here:
>
> https://github.com/developmentseed/geolambda/blob/master/Dockerfile
>
> Is this supported behavior, should I be able to read these files remotely,
> and do windowed reads on them?

For windowed reads, you need to prefix with /vsicurl/, but I don't guarantee
the efficiency of this with JPEG2000 in general, and with JP2OpenJPEG in
particular.
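
A minimal sketch of the /vsicurl/ advice above, assuming the GDAL Python bindings (osgeo) and NumPy are installed. The helper names vsicurl_path and read_window are hypothetical and not part of GDAL; only gdal.Open, GetRasterBand and ReadAsArray are GDAL calls.

```python
def vsicurl_path(url: str) -> str:
    """Prefix an HTTP(S) URL so that GDAL opens it through /vsicurl/."""
    if url.startswith("/vsicurl/"):
        return url  # already prefixed
    if not url.startswith(("http://", "https://")):
        raise ValueError(f"not an HTTP(S) URL: {url!r}")
    return "/vsicurl/" + url


def read_window(url, xoff, yoff, xsize, ysize, band=1):
    """Read one pixel window of one band from a remote raster (hypothetical helper)."""
    # Imported lazily so that vsicurl_path() stays usable without GDAL installed.
    from osgeo import gdal

    ds = gdal.Open(vsicurl_path(url))
    if ds is None:
        raise RuntimeError(f"GDAL could not open {url}")
    # Only this window is requested from the driver; how many bytes are actually
    # fetched depends on the driver and on the file layout.
    return ds.GetRasterBand(band).ReadAsArray(xoff, yoff, xsize, ysize)
```

For example, read_window(url, 0, 0, 512, 512) with the B04.jp2 URL from the first message would request a 512x512 window. On the command line, the same idea is the /vsicurl/-prefixed URL combined with gdal_translate's -srcwin option.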

Even

--
Spatialys - Geospatial professional services
http://www.spatialys.com

Re: Reading remote jp2k files

jvail
Hi,

On 07/25/2018 06:00 PM, Even Rouault wrote:
>> Is this supported behavior, should I be able to read these files remotely,
>> and do windowed reads on them?
>
> For windowed reads, you need to prefix with /vsicurl/, but I don't guarantee
> the efficiency of this with JPEG2000 in general, and with JP2OpenJPEG in
> particular.

Regarding "efficiency" I'd like to add two observations from a little
investigation (not really in-depth) I did recently to compare /vsicurl/
with a tiny, experimental JS lib (fetch & decode single jp2 tiles in
pure JS):

- If I try to fetch a window that lies completely within one tile, it
*seems* that vsicurl iterates through all tiles and does not stop after
reaching and fetching the requested tile.

- Instead of starting from the first tile, it could guess an offset, start
there, and search backwards and forwards until it finds the requested tile
index. A good guess is difficult, though, given that tile sizes may vary
greatly.

I think this could probably make fetching parts slightly more efficient
(in terms of no. of requests & data transfer). Certainly it would be
much easier if tile offsets could be included in a jp2 header.
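
To illustrate the sequential search described above, here is a rough sketch that hops from one tile-part header (SOT marker segment) to the next and stops once all tile-parts of the requested tile have been seen. It assumes a raw JPEG2000 codestream held fully in memory (not wrapped in JP2 boxes); over HTTP, each hop would be a small range request. The function names are illustrative, and this is not how GDAL or OpenJPEG implement it.

```python
import struct

SOC, SOT = 0xFF4F, 0xFF90  # start of codestream, start of tile-part


def first_sot_offset(cs: bytes) -> int:
    """Skip the main header: SOC, then length-prefixed segments, up to the first SOT."""
    (marker,) = struct.unpack_from(">H", cs, 0)
    if marker != SOC:
        raise ValueError("not a JPEG2000 codestream (missing SOC)")
    pos = 2
    while True:
        (marker,) = struct.unpack_from(">H", cs, pos)
        if marker == SOT:
            return pos
        # A segment length counts its own two bytes but not the marker.
        (seg_len,) = struct.unpack_from(">H", cs, pos + 2)
        pos += 2 + seg_len


def walk_tile_parts(cs: bytes):
    """Yield (tile_index, offset, length, tile_part_count) for each tile-part."""
    pos = first_sot_offset(cs)
    while pos + 12 <= len(cs):
        marker, _lsot, isot, psot, _tpsot, tnsot = struct.unpack_from(">HHHIBB", cs, pos)
        if marker != SOT:
            break  # anything else ends the walk
        if psot == 0:
            # Psot == 0: the tile-part runs up to the final 2-byte EOC marker.
            psot = len(cs) - 2 - pos
        yield isot, pos, psot, tnsot
        pos += psot  # Psot is measured from the first byte of the SOT marker


def find_tile(cs: bytes, tile_index: int):
    """Return [(offset, length), ...] for one tile, stopping as early as possible."""
    parts = []
    for isot, offset, length, tnsot in walk_tile_parts(cs):
        if isot == tile_index:
            parts.append((offset, length))
            # TNsot == 0 means the count is unknown, so keep scanning in that case.
            if tnsot and len(parts) == tnsot:
                break
    return parts
```

Early termination is only safe here because TNsot states how many tile-parts the tile has; when it is 0, the walk has to continue to the end of the codestream.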

Jan


Re: Reading remote jp2k files

Poughon Victor
In reply to this post by Even Rouault-2
Hi Even,

> For windowed reads, you need to prefix with /vsicurl/, but I don't guarantee
> the efficiency of this with JPEG2000 in general, and with JP2OpenJPEG in
> particular.

I'm a JPEG2000 noob, but do you know if windowed reads of remote JP2 data could be optimized in some way? I'm curious whether we could see a "cloud optimized JPEG2000" someday.




Victor Poughon


Re: Reading remote jp2k files

Even Rouault-2
Victor,

>
> I'm a jp2000 noob but do you know if windowed reads of remote jp2 data could
> be optimized in some way? I am curious if we could see a "cloud optimized
> jpeg2000" someday?

There is the JPIP protocol that was designed for efficient streaming, but it
requires a dedicated server and client.

For a pure HTTP file-serving solution, there are different aspects:
* For non-tiled JPEG2000, coefficients are spread a bit over the whole file,
so you may have to seek a lot to get them. I guess some progression orders
might be more favorable (namely PCRL: position-component-resolution
level-layer) if you want efficient subwindowing, but that should be tested.
* For tiled JPEG2000, the above issue becomes less relevant, but the main
issue is to be able to efficiently locate a tile in the file. For that, the
optional TLM (tile-part lengths) marker should be written in the JPEG2000 file.
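
To make the TLM point concrete, here is a hedged sketch of decoding a single TLM marker segment and turning its tile-part lengths into absolute byte offsets. The field layout (Ztlm, Stlm, then Ttlm/Ptlm pairs) follows my reading of the JPEG2000 Part 1 codestream syntax. The function names and the first_sot_offset parameter are illustrative, and codestreams with several TLM segments would need them concatenated in Ztlm order.

```python
import struct

TLM = 0xFF55  # tile-part lengths marker (main header)


def parse_tlm(segment: bytes):
    """Decode one TLM marker segment into (tile_index, tile_part_length) pairs."""
    marker, ltlm, _ztlm, stlm = struct.unpack_from(">HHBB", segment, 0)
    if marker != TLM:
        raise ValueError("not a TLM marker segment")
    st = (stlm >> 4) & 0x3  # width of Ttlm (tile index) in bytes: 0, 1 or 2
    sp = (stlm >> 6) & 0x1  # 0: Ptlm is 16-bit, 1: Ptlm is 32-bit
    if st == 3:
        raise ValueError("reserved Stlm value")
    entry = struct.Struct(">" + {0: "", 1: "B", 2: "H"}[st] + ("I" if sp else "H"))
    end = 2 + ltlm  # Ltlm excludes the two marker bytes
    pairs = []
    for n, pos in enumerate(range(6, end, entry.size)):
        fields = entry.unpack_from(segment, pos)
        # Without Ttlm, tiles appear in index order with one tile-part each.
        tile = fields[0] if st else n
        pairs.append((tile, fields[-1]))
    return pairs


def tile_part_offsets(pairs, first_sot_offset: int):
    """Accumulate lengths into (tile_index, offset, length), starting at the first SOT."""
    out, offset = [], first_sot_offset
    for tile, length in pairs:
        out.append((tile, offset, length))
        offset += length
    return out
```

With such a table, a reader can issue one range request for the main header and then jump directly to the tile-parts it needs, instead of visiting every tile-part header in turn.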

Those were theoretical concerns. Now on the practical side, some
implementations might be better than others at making efficient use of file
optimizations. For non-tiled JPEG2000, OpenJPEG currently requires ingesting
the whole codestream in memory (even though, since OpenJPEG 2.3, only a subpart
of it is decompressed for windowed reads). And for tiled JPEG2000, it
ignores the TLM marker and browses through the tile-part headers to identify
the tiles. So work would be needed to improve that. From recollection, Kakadu
does a good job in those areas.

An alternative might be to store JPEG2000 blobs as the payload of a tiled
GeoTIFF. Some software vendors do that, apparently in slightly
different ways (I'm not sure they always write a completely valid JPEG2000
codestream).

Even


Re: Reading remote jp2k files

Norman Barker-2
Victor, Even

Deshpande and Zeng proposed an HTTP JPEG2000 solution, which Taubman documents here: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.88.2758&rep=rep1&type=pdf

Precincts and Tiles are different approaches for data access within J2K, and defining a J2K profile (and an encoder that supports it) for grouping packets that contribute to a particular precinct closely together in the file would improve J2K access over HTTP byte-range requests even more than tile access.
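
As a rough illustration of why grouping a precinct's packets closely together matters for HTTP byte-range access, the sketch below merges nearby byte ranges into as few requests as possible. The (offset, length) ranges and the max_gap threshold are made-up inputs; a real client would take them from the codestream's packet layout.

```python
def coalesce(ranges, max_gap=0):
    """Merge (offset, length) byte ranges whose gaps are at most max_gap bytes."""
    merged = []
    for start, length in sorted(ranges):
        end = start + length
        if merged and start - merged[-1][1] <= max_gap:
            # Close enough to the previous range: extend it instead of adding one.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [(s, e - s) for s, e in merged]


def range_header(offset, length):
    """Format one range as an HTTP Range header value (the end byte is inclusive)."""
    return f"bytes={offset}-{offset + length - 1}"


# Packets of one precinct scattered through the file: three separate requests.
scattered = coalesce([(100, 10), (5000, 10), (9000, 10)])
# The same packets stored contiguously: a single request.
grouped = coalesce([(100, 10), (110, 10), (120, 10)])
print(len(scattered), len(grouped), range_header(*grouped[0]))
```

The same payload costs three round trips in the scattered layout and one in the grouped layout, which is the effect an encoder profile of this kind would aim for.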

I don't necessarily agree that you need index tables within a file to do this, but they wouldn't hurt, in my opinion (I have written a commercial JPIP server in the past). For Cloud Optimized Wavelets to work, we need J2K encoders/decoders that are designed for this use case, and that is non-trivial.

Norman




On Fri, Sep 7, 2018 at 8:40 AM Even Rouault <[hidden email]> wrote:
> Victor,
>
> > I'm a jp2000 noob but do you know if windowed reads of remote jp2 data could
> > be optimized in some way? I am curious if we could see a "cloud optimized
> > jpeg2000" someday?
>
> There is the JPIP protocol that was designed for efficient streaming, but it
> requires a dedicated server and client.
>
> For a pure HTTP file-serving solution, there are different aspects
> * For non-tiled JPEG2000, coefficients are spread a bit over the whole file,
> so you may have to seek a lot to get them. I guess some progression orders
> might be more favorable (namely PCRL: Position-component-resolution
> level-layer) if you want to have efficient subwindowing, but that should be tested.
> * For tiled JPEG2000, the above issue becomes less relevant, but the main
> issue is to be able to efficiently locate a tile in the file. For that, the
> optional TLM packet marker should be written in the JPEG2000 file
>
> Those were theoretical concerns. Now on the practical side, some
> implementations might be better than others in making an efficient use of file
> optimizations. For non-tiled JPEG2000, OpenJPEG currently requires ingesting
> the whole codestream in memory (even if since openjpeg 2.3 only a subpart of
> it will be decompressed for windowed reads). And for tiled JPEG2000, it
> ignores the TLM marker and browse through the tilepart headers to identify the
> tiles. So work would be needed to improve that. From recollection, Kakadu does
> a good job in those areas.
>
> An alternative might be to store JPEG2000 blobs as the payload of a tiled
> GeoTIFF. There are some software editors that do that, apparently in slightly
> different ways (they might not always put a completely valid JPEG2000
> codestream, not sure)
>
> Even
