[gdal-dev] Creating Cloud Optimized GeoTIFFs

Peter Schmitt
Hi,

I ran into something confusing when using gdal-2.2.0 to generate Cloud Optimized GeoTIFFs following the instructions at: https://trac.osgeo.org/gdal/wiki/CloudOptimizedGeoTIFF

Here's what I did:

1. Fetch an image:
env CPL_VSIL_CURL_ALLOWED_EXTENSIONS=tif GDAL_DISABLE_READDIR_ON_OPEN=YES VSI_CACHE=TRUE gdal_translate https://github.com/mapbox/rasterio/raw/master/tests/data/RGB.byte.tif rgb_byte_jpg.tif -co TILED=YES -co COMPRESS=JPEG -co PHOTOMETRIC=YCBCR

As expected, this is not yet a cloud-optimized GeoTIFF:

validate_cloud_optimized_geotiff.py  rgb_byte_jpg.tif
rgb_byte_jpg.tif is NOT a valid cloud optimized GeoTIFF : The file should have overviews


2. Add internal overviews:
gdaladdo -r average rgb_byte_jpg.tif 2 4 8 16 32
 
validate_cloud_optimized_geotiff.py  rgb_byte_jpg.tif

rgb_byte_jpg.tif is NOT a valid cloud optimized GeoTIFF : The offset of the first block of overview of index 3 should be after the one of the overview of index 4


3. Translate the image again, copying the source overviews added above:

gdal_translate rgb_byte_jpg.tif rgb_byte_jpg_trans.tif -co TILED=YES -co COMPRESS=JPEG -co PHOTOMETRIC=YCBCR -co COPY_SRC_OVERVIEWS=YES

Now the image _is_ cloud-optimized.

validate_cloud_optimized_geotiff.py  rgb_byte_jpg_trans.tif
rgb_byte_jpg_trans.tif is a valid cloud optimized GeoTIFF


I expected the image to be "cloud optimized" at step 2.  Why do I need an additional translate?


Notes:

* I see similar behavior for COMPRESS=DEFLATE.
* Thinking that the default block size (256) was too big for the tiny 25x23 overview, I set this on the initial translate, but I see the same behavior:
 -co BLOCKXSIZE=16 -co BLOCKYSIZE=16 --config GDAL_TIFF_OVR_BLOCKSIZE 16
* gdal-2.2.0 configured with "--with-libtiff=internal"
* Compiler:
Configured with: --prefix=/Applications/Xcode.app/Contents/Developer/usr --with-gxx-include-dir=/usr/include/c++/4.2.1
Apple LLVM version 8.1.0 (clang-802.0.42)
Target: x86_64-apple-darwin16.5.0
Thread model: posix
InstalledDir: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin

Thanks,
Pete

_______________________________________________
gdal-dev mailing list
[hidden email]
https://lists.osgeo.org/mailman/listinfo/gdal-dev
Re: Creating Cloud Optimized GeoTIFFs

Even Rouault-2

On Wednesday 10 May 2017 14:41:44 CEST Peter Schmitt wrote:

> I expected the image to be "cloud optimized" at step 2. Why do I need an
> additional translate?

Peter,

 

Because gdaladdo adds the overview IFDs at the end of the file, whereas the definition of a cloud optimized GeoTIFF requires all IFDs to be at the beginning of the file so that they can be fetched efficiently. That layout can only be produced with gdal_translate -co COPY_SRC_OVERVIEWS=YES

To avoid quality loss from repeated lossy compression, you should only use JPEG compression for the final gdal_translate stage.
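
The three stages discussed in this thread can be sketched as a small Python helper that only assembles the command lines. This is an illustration, not an official recipe: the function names `build_cog_commands` and `run_all`, the intermediate file name, and the choice of DEFLATE for the lossless intermediate stage are assumptions; the GDAL options themselves are the ones already used above.

```python
import subprocess


def build_cog_commands(src, dst, tmp="intermediate.tif",
                       levels=(2, 4, 8, 16, 32)):
    """Return the three command lines as argv lists (nothing is executed)."""
    return [
        # 1. Lossless, tiled intermediate copy (assumption: DEFLATE, not JPEG).
        ["gdal_translate", src, tmp,
         "-co", "TILED=YES", "-co", "COMPRESS=DEFLATE"],
        # 2. Internal overviews; their IFDs end up at the end of the file.
        ["gdaladdo", "-r", "average", tmp] + [str(level) for level in levels],
        # 3. Final rewrite: JPEG is applied once, and the overviews are copied
        #    so that all IFDs are written at the beginning of the file.
        ["gdal_translate", tmp, dst,
         "-co", "TILED=YES", "-co", "COMPRESS=JPEG",
         "-co", "PHOTOMETRIC=YCBCR", "-co", "COPY_SRC_OVERVIEWS=YES"],
    ]


def run_all(commands):
    """Run each command in order, stopping at the first failure."""
    for argv in commands:
        subprocess.run(argv, check=True)


# Print the commands instead of running them; call run_all() to execute.
for argv in build_cog_commands("RGB.byte.tif", "rgb_byte_cog.tif"):
    print(" ".join(argv))
```

Keeping the command construction separate from execution makes it easy to review the exact options before any file is written.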

 

Even

--

Spatialys - Geospatial professional services

http://www.spatialys.com


Re: Creating Cloud Optimized GeoTIFFs

Kurt Schwehr-2
Hi Even,

I have some follow up questions on Cloud Optimized GeoTIFFs:

* Is there a preferred/better INTERLEAVE if there is more than one band?
* Is there a preferred tile blocksize?  You have 512 in your examples.  Are there any major trade offs between using 128, 256, 512, or 1024 for x and y block sizes?
* Should tiles be square?  Does it matter?
* Is it better to skip tiling for small images?  If so, at what threshold do you think the switch should happen?  1024?
* Is DEFLATE preferred for compression type over LZW for lossless compression?
* If the writer isn't constrained by compute power, are there preferred ZLEVEL and PREDICTOR values?  Is there a time cost for decompressing ZLEVEL=9 over 1?

I'm a little confused by this code from validate_cloud_optimized_geotiff.py:

    if main_band.XSize >= 512 or main_band.YSize >= 512:
        if check_tiled:
            block_size = main_band.GetBlockSize()
            if block_size[0] == main_band.XSize and block_size[0] > 1024:
                errors += ["The file is greater than 512xH or Wx512," +
                           "but is not tiled"]

Will the above correctly fail an image that is (say) 256x2048 if it is not tiled?

Thanks!
-kurt

Re: Creating Cloud Optimized GeoTIFFs

Even Rouault-2

On Tuesday 14 November 2017 14:20:58 CET Kurt Schwehr wrote:

> Hi Even,
>
> I have some follow up questions on Cloud Optimized GeoTIFFs:

The main constraint of a C.O.G. is that all the IFD definitions are at the beginning of the file, so that a reader does not have to seek to various points in it. Other parameters are pretty much unspecified.
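
To see where the IFDs of a given file actually sit, one can walk the IFD chain by hand. The sketch below is a minimal illustration that assumes a classic TIFF (magic number 42, 32-bit offsets) and does not handle BigTIFF; the function name `list_ifd_offsets` is made up for this example, and this is not how validate_cloud_optimized_geotiff.py is implemented.

```python
import io
import struct


def list_ifd_offsets(fileobj):
    """Return the byte offset of every IFD in a classic (non-BigTIFF) TIFF."""
    header = fileobj.read(8)
    endian = {b"II": "<", b"MM": ">"}.get(header[:2])
    if endian is None or len(header) < 8:
        raise ValueError("not a TIFF file")
    magic, offset = struct.unpack(endian + "HI", header[2:8])
    if magic != 42:
        raise ValueError("only classic TIFF (magic 42) is handled here")
    offsets = []
    # Each IFD is: 2-byte entry count, 12 bytes per entry, 4-byte next offset.
    while offset != 0 and offset not in offsets:
        offsets.append(offset)
        fileobj.seek(offset)
        (count,) = struct.unpack(endian + "H", fileobj.read(2))
        fileobj.seek(offset + 2 + 12 * count)
        (offset,) = struct.unpack(endian + "I", fileobj.read(4))
    return offsets


# Synthetic two-IFD file built in memory, only to exercise the function.
demo = b"II" + struct.pack("<HI", 42, 8) + struct.pack("<HI", 0, 100)
demo += b"\0" * (100 - len(demo))
demo += struct.pack("<H", 1) + b"\0" * 12 + struct.pack("<I", 0)
print(list_ifd_offsets(io.BytesIO(demo)))
```

On a real file, opened with `open(path, "rb")`, offsets that are close to the file size would indicate IFDs appended at the end, as happens after running gdaladdo on an existing file.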

 

> * Is there a preferred/better INTERLEAVE if there is more than one band?

It depends on access patterns. If you need the values of all bands as soon as you process a pixel, then INTERLEAVE=PIXEL is better, and it will also result in smaller StripOffsets/TileOffsets and StripByteCounts/TileByteCounts arrays.
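
The effect of the interleaving on the size of those arrays can be estimated with a few lines of Python. This sketch relies on the TIFF rule that band-interleaved (separate planes) files store one block per band, so the arrays have one entry per block and per band; the function name and the example image dimensions are hypothetical.

```python
import math


def offset_array_entries(width, height, block_x, block_y, bands, interleave):
    """Number of entries in TileOffsets (and in TileByteCounts)."""
    blocks = math.ceil(width / block_x) * math.ceil(height / block_y)
    if interleave == "PIXEL":
        # One block holds all bands of its pixels.
        return blocks
    if interleave == "BAND":
        # Each band is stored as its own set of blocks.
        return blocks * bands
    raise ValueError("interleave must be 'PIXEL' or 'BAND'")


for mode in ("PIXEL", "BAND"):
    entries = offset_array_entries(10000, 10000, 512, 512, 3, mode)
    print(mode, entries)
```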

 

> * Is there a preferred tile blocksize? You have 512 in your examples. Are
> there any major trade offs between using 128, 256, 512, or 1024 for x and y
> block sizes?

Block sizes that are too small will result in larger ...Offsets and ...ByteCounts arrays.
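
As a rough illustration of that trade-off, the following sketch estimates how many bytes the two arrays occupy for the block sizes mentioned in the question. It assumes square blocks, pixel interleaving, and 4 bytes per entry (classic TIFF with both arrays stored as 32-bit values); the function name and the 100000x100000 image size are made up for the example.

```python
import math

# Assumption: classic TIFF storing both arrays as 32-bit LONG values.
BYTES_PER_ENTRY = 4


def tile_index_bytes(width, height, block_size):
    """Approximate size of TileOffsets plus TileByteCounts, in bytes."""
    tiles = math.ceil(width / block_size) * math.ceil(height / block_size)
    return 2 * tiles * BYTES_PER_ENTRY


for block_size in (128, 256, 512, 1024):
    size = tile_index_bytes(100_000, 100_000, block_size)
    print(f"{block_size:>4}: {size / 1024:.0f} KiB of offsets and byte counts")
```

A reader has to fetch these arrays before it can locate any tile, so for remote access their size adds directly to the amount of data transferred up front.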

 

> * Should tiles be square? Does it matter?

No

 

> * Is it better to skip tiling for small images? If so, at what threshold
> do you think the switch should happen? 1024?

I'm not sure that matters. But it is not wise to have a block dimension larger than the corresponding image dimension (as blocks are not truncated, the excess is stored as padding).
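
Because blocks are not truncated, a block that is larger than the image stores padding pixels. The sketch below estimates the padded fraction before compression; the function name is hypothetical, and the 25x23 example is the small overview mentioned earlier in the thread.

```python
import math


def padding_fraction(width, height, block_x, block_y):
    """Fraction of stored (uncompressed) pixels that are only padding."""
    stored_width = math.ceil(width / block_x) * block_x
    stored_height = math.ceil(height / block_y) * block_y
    return 1 - (width * height) / (stored_width * stored_height)


for block in (256, 16):
    fraction = padding_fraction(25, 23, block, block)
    print(f"25x23 image, {block}x{block} blocks: {fraction:.1%} padding")
```

Compression usually shrinks the padding considerably, so this figure is an upper bound on the wasted space rather than a measurement of file size.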

 

> * Is DEFLATE preferred for compression type over LZW for lossless
> compression?

Unspecified. DEFLATE is more CPU intensive, but if network time is the limiting factor, it is worth it, as it compresses more efficiently.

 

> * If the writer isn't constrained by compute power, are there preferred
> ZLEVEL and PREDICTOR values? Is there a time cost for decompressing
> ZLEVEL=9 over 1?

PREDICTOR has negligible CPU influence (just an add/diff on integer values), but will not always result in smaller file sizes. It depends on the dataset.
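
The add/diff idea can be illustrated with Python's zlib module on synthetic data. This is a simplified sketch of horizontal differencing on a single row of 8-bit samples, not GDAL's or libtiff's implementation; the helper names and the synthetic signal (a slowly increasing pseudo-random walk) are assumptions chosen so that the predictor helps.

```python
import random
import zlib


def horizontal_diff(row):
    """Replace each byte (except the first) by its difference to the previous one."""
    out = bytearray(row[:1])
    for previous, current in zip(row, row[1:]):
        out.append((current - previous) % 256)
    return bytes(out)


def undo_horizontal_diff(encoded):
    """Inverse operation: a running sum modulo 256."""
    out = bytearray(encoded[:1])
    for delta in encoded[1:]:
        out.append((out[-1] + delta) % 256)
    return bytes(out)


# Synthetic smooth signal: neighbouring samples differ by 0, 1 or 2.
rng = random.Random(0)
value, samples = 0, bytearray()
for _ in range(16384):
    value = (value + rng.choice((0, 1, 2))) % 256
    samples.append(value)
smooth = bytes(samples)

raw_size = len(zlib.compress(smooth, 9))
predicted_size = len(zlib.compress(horizontal_diff(smooth), 9))
print("without predictor:", raw_size, "bytes")
print("with predictor:   ", predicted_size, "bytes")
```

On noisy data, where neighbouring samples are unrelated, the differences are as hard to compress as the original values, which is why the gain depends on the dataset.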

 

If I trust https://github.com/inikep/lzbench , the time cost for decompression for Deflate/Zlib doesn't seem to vary much with ZLEVEL. So the higher the better.

I don't know about LZW.
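
For anyone who wants to check this on their own data, a quick measurement with Python's zlib module (the same DEFLATE algorithm, though not the GTiff driver itself) could look like the sketch below. The payload is a made-up repetitive byte string and the `measure` helper is hypothetical; timings vary between machines, so no particular numbers should be read into it.

```python
import time
import zlib

# Hypothetical payload; replace with real raster bytes for a meaningful test.
payload = b"".join(
    b"tile row %06d of a smooth raster\n" % (i // 3) for i in range(20000)
)


def measure(level, repeats=5):
    """Return (compressed size, mean decompression time in seconds)."""
    compressed = zlib.compress(payload, level)
    start = time.perf_counter()
    for _ in range(repeats):
        restored = zlib.decompress(compressed)
    elapsed = (time.perf_counter() - start) / repeats
    if restored != payload:
        raise RuntimeError("round trip failed")
    return len(compressed), elapsed


for level in (1, 6, 9):
    size, seconds = measure(level)
    print(f"ZLEVEL={level}: {size} bytes, decompress {seconds * 1e3:.2f} ms")
```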

 

> I'm a little confused by this code from validate_cloud_optimized_geotiff.py:
>
>     if main_band.XSize >= 512 or main_band.YSize >= 512:
>         if check_tiled:
>             block_size = main_band.GetBlockSize()
>             if block_size[0] == main_band.XSize and block_size[0] > 1024:
>                 errors += ["The file is greater than 512xH or Wx512," +
>                            "but is not tiled"]
>
> Will the above correctly fail an image that is (say) 256x2048 if it is not
> tiled?

No, it will pass this test, since in that case block_size[0] == xsize == 256.

But for such a narrow image, it should probably warn if it is not tiled, as the number of strips, if the strip height is left at its default, will be larger than really necessary.
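
The behaviour of the quoted condition can be reproduced without GDAL by passing the sizes in directly. In this sketch the function names are hypothetical, `check_tiled` is assumed to be true, and the strip height of 32 rows is only an illustrative value, not a statement about GDAL's default.

```python
import math


def flagged_as_untiled(xsize, ysize, block_x, block_y):
    """Mirror the quoted check; block_y is unused by it, kept for clarity."""
    if xsize >= 512 or ysize >= 512:
        if block_x == xsize and block_x > 1024:
            return True
    return False


def strip_count(ysize, rows_per_strip):
    """Number of strips in an untiled image."""
    return math.ceil(ysize / rows_per_strip)


# Untiled 256x2048 image: the block is a strip as wide as the image.
print(flagged_as_untiled(256, 2048, 256, 32))
print(strip_count(2048, 32))
# Untiled 2048x2048 image: wide enough for the check to trigger.
print(flagged_as_untiled(2048, 2048, 2048, 1))
```

The first case is not flagged because the block width is 256, which fails the `> 1024` part of the condition, even though the image is stored as many narrow strips.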

 

Even

 
