[gdal-dev] vrt: prevent opening all source files on open?

Vincent Schut-4
Hi,

Is there a way (or: what are the prerequisites) to avoid gdal opening
all source files from a vrt on opening the vrt?

Context: I have created a large vrt, referencing many tif files. Both
the vrt and tifs are remote (using /vsigs/). It works, but opening (e.g.
running gdalinfo) is very slow.

I have a comparable vrt where the main difference is that it references
.hgt files instead of tifs. Running gdalinfo on this vrt is almost
instantaneous.

As there are no other significant differences otherwise, I wonder if
this is because if the vrt machinery encounters a tif, it starts to
check if there are any accompanying overviews (or other important
metadata), while it does not do that for a .hgt file? Or is there a
different reason? And how can I prevent this (without, preferably,
changing to a different source file format)?

Thanks,
Vincent.

_______________________________________________
gdal-dev mailing list
[hidden email]
https://lists.osgeo.org/mailman/listinfo/gdal-dev
Re: vrt: prevent opening all source files on open?

Even Rouault-2

Vincent,

http://gdal.org/gdal_vrttut.html should help:

"""Some characteristics of the source band can be specified in the optional SourceProperties tag to enable the VRT driver to defer the opening of the source dataset until it really needs to read data from it. This is particularly useful when building VRTs with a big number of source datasets. The needed parameters are the raster dimensions, the size of the blocks and the data type. If the SourceProperties tag is not present, the source dataset will be opened at the same time as the VRT itself.

[...]

<SimpleSource>
<SourceFilename relativeToVRT="1">utm.tif</SourceFilename>
<SourceBand>1</SourceBand>
<SourceProperties RasterXSize="512" RasterYSize="512" DataType="Byte" BlockXSize="128" BlockYSize="128"/>
<SrcRect xOff="0" yOff="0" xSize="512" ySize="512"/>
<DstRect xOff="0" yOff="0" xSize="512" ySize="512"/>
</SimpleSource>
"""
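
For a VRT with many sources it can be handy to check which sources would still be opened eagerly. The following is only a minimal sketch using the Python standard library (no GDAL bindings): it lists the sources whose SourceProperties tag is absent or lacks one of the five attributes shown in the documentation excerpt above. The function name, the EXAMPLE_VRT string and the file name other.tif are hypothetical, made up for illustration.

```python
import xml.etree.ElementTree as ET

# Source element names checked by this sketch (an assumption; other source
# kinds may exist in your VRT).
SOURCE_TAGS = ("SimpleSource", "ComplexSource")

# Attributes the documentation excerpt lists as needed: raster dimensions,
# block sizes and data type.
REQUIRED_ATTRS = ("RasterXSize", "RasterYSize", "DataType", "BlockXSize", "BlockYSize")


def sources_missing_properties(vrt_xml):
    """Return the SourceFilename of each source without a complete SourceProperties tag."""
    root = ET.fromstring(vrt_xml)
    missing = []
    for element in root.iter():  # document order
        if element.tag not in SOURCE_TAGS:
            continue
        props = element.find("SourceProperties")
        complete = props is not None and all(a in props.attrib for a in REQUIRED_ATTRS)
        if not complete:
            missing.append(element.findtext("SourceFilename", default="?"))
    return missing


# Hypothetical two-source VRT: the first source carries SourceProperties,
# the second does not.
EXAMPLE_VRT = """<VRTDataset rasterXSize="1024" rasterYSize="512">
  <VRTRasterBand dataType="Byte" band="1">
    <SimpleSource>
      <SourceFilename relativeToVRT="1">utm.tif</SourceFilename>
      <SourceBand>1</SourceBand>
      <SourceProperties RasterXSize="512" RasterYSize="512" DataType="Byte" BlockXSize="128" BlockYSize="128"/>
      <SrcRect xOff="0" yOff="0" xSize="512" ySize="512"/>
      <DstRect xOff="0" yOff="0" xSize="512" ySize="512"/>
    </SimpleSource>
    <SimpleSource>
      <SourceFilename relativeToVRT="1">other.tif</SourceFilename>
      <SourceBand>1</SourceBand>
      <SrcRect xOff="0" yOff="0" xSize="512" ySize="512"/>
      <DstRect xOff="512" yOff="0" xSize="512" ySize="512"/>
    </SimpleSource>
  </VRTRasterBand>
</VRTDataset>"""

if __name__ == "__main__":
    print(sources_missing_properties(EXAMPLE_VRT))
```

An empty result only says that the tags are present; it does not prove that GDAL will never open a source for other reasons.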

 

Note: in theory, the VRT driver could be improved to implement this lazy opening behaviour without requiring SourceProperties, but unless and until that is implemented, you have to use SourceProperties.

Even

 

 

 

 

--

Spatialys - Geospatial professional services

http://www.spatialys.com


Re: vrt: prevent opening all source files on open?

Vincent Schut-4
On 08/28/2017 12:30 PM, Even Rouault wrote:

> http://gdal.org/gdal_vrttut.html should help:
> [...]
> Note: in theory, the VRT driver could be improved to implement this
> lazy opening behaviour without requiring SourceProperties, but
> until/if this is implemented some day, you have to use SourceProperties.

Unfortunately, it's more complex. Both VRTs have SourceProperties tags for each source.
I did another experiment to test whether it is .hgt versus .tif, and it appears to be something else: I removed all but one of the ComplexSource items from each. Now both of them appear to try to open the source raster.

The VRT which apparently does not open its sources is this one: https://cloud.sdsc.edu/v1/AUTH_opentopography/Raster/SRTM_GL1/SRTM_GL1_srtm.vrt
Even opening that with /vsicurl/ is fast.
Copying this VRT locally (and thus invalidating the source paths) works and does not result in a read error. But when I remove all but one of the ComplexSource elements, I get an error on running gdalinfo on it (because it cannot find the source raster). I'm stumped...

sorry, have to go to a meeting now. More later.

thanks!
Vincent.

 

 

 

Re: vrt: prevent opening all source files on open?

Even Rouault-2

On Monday 28 August 2017 13:36:36 CEST Vincent Schut wrote:

> [...]
> I removed all but one of the ComplexSource items from each. Now both of
> them appear to try to open the source raster.
> [...]

 

Oh I see. When a VRT is made of a single source, some particular code paths are taken as an optimization, for example to do RasterIO operations on several bands at once. Likewise, GetOverviewCount() on the VRT will call GetOverviewCount() on the source, to expose the source overviews at the VRT level. Normally, GDALOpen() alone on such a VRT shouldn't cause the source to be opened, so I suspect you have a call to GetOverviewCount().

 

One way to work around this is to add a fake source such as

<ComplexSource>
<SourceFilename>/dev/null</SourceFilename>
<SourceBand>1</SourceBand>
<SourceProperties RasterXSize="1" RasterYSize="1" DataType="Byte" BlockXSize="1" BlockYSize="1" />
<SrcRect xOff="0" yOff="0" xSize="0" ySize="0" />
<DstRect xOff="0" yOff="0" xSize="0" ySize="0" />
</ComplexSource>
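
If the VRT is generated by a script, the fake source can be appended automatically. What follows is a rough sketch under stated assumptions, not an official GDAL tool: it uses only the Python standard library, treats SimpleSource and ComplexSource as the source elements to count, and appends the fake source above to every VRTRasterBand that has exactly one source. The function name and the SINGLE_SOURCE_VRT example (including tile.tif) are hypothetical.

```python
import xml.etree.ElementTree as ET

# Source element names counted by this sketch (an assumption).
SOURCE_TAGS = ("SimpleSource", "ComplexSource")

# The fake source from the message above, copied as-is.
FAKE_SOURCE_XML = """<ComplexSource>
  <SourceFilename>/dev/null</SourceFilename>
  <SourceBand>1</SourceBand>
  <SourceProperties RasterXSize="1" RasterYSize="1" DataType="Byte" BlockXSize="1" BlockYSize="1" />
  <SrcRect xOff="0" yOff="0" xSize="0" ySize="0" />
  <DstRect xOff="0" yOff="0" xSize="0" ySize="0" />
</ComplexSource>"""


def add_fake_source(vrt_xml):
    """Append the fake source to each band that has exactly one source."""
    root = ET.fromstring(vrt_xml)
    for band in root.iter("VRTRasterBand"):
        sources = [child for child in band if child.tag in SOURCE_TAGS]
        if len(sources) == 1:
            band.append(ET.fromstring(FAKE_SOURCE_XML))
    return ET.tostring(root, encoding="unicode")


# Hypothetical single-source VRT used to exercise the function.
SINGLE_SOURCE_VRT = """<VRTDataset rasterXSize="512" rasterYSize="512">
  <VRTRasterBand dataType="Byte" band="1">
    <ComplexSource>
      <SourceFilename relativeToVRT="1">tile.tif</SourceFilename>
      <SourceBand>1</SourceBand>
      <SourceProperties RasterXSize="512" RasterYSize="512" DataType="Byte" BlockXSize="128" BlockYSize="128"/>
      <SrcRect xOff="0" yOff="0" xSize="512" ySize="512"/>
      <DstRect xOff="0" yOff="0" xSize="512" ySize="512"/>
    </ComplexSource>
  </VRTRasterBand>
</VRTDataset>"""

if __name__ == "__main__":
    print(add_fake_source(SINGLE_SOURCE_VRT))
```

Running the function twice leaves the band with two sources, since the second pass no longer sees a single-source band. Whether the patched VRT avoids opening the real source still has to be checked with gdalinfo on your GDAL version.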

 

 

Re: vrt: prevent opening all source files on open?

Vincent Schut-4
On 08/28/2017 01:51 PM, Even Rouault wrote:

> [...]
> A way of working around this is to add a fake source such as
> [...]

Ah, thanks Even, that works!
