Re: Multidimensional raster support in GDAL

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
14 messages Options
Reply | Threaded
Open this post in threaded view
|

Re: Multidimensional raster support in GDAL

Even Rouault-2

Hi

 

(top posting to clearly mark the start of a new thread)

 

> I too think that multidimensional raster support would be useful.

> Besides by drastically redesigning data structures, could we get there

> incrementally?

 

One difficulty is that there are 154 raster drivers that use the current data structures.

 

2D is assumed in a number of base data structures. Non exhaustive list:

 

class GDALDataset:

int nRasterXSize

int nRasterYSize

int GetRasterXSize()

int GetRasterYSize()

GetGeoTransform(double adfGT[6]) // read side

SetGeoTransform(double adfGT[6]) // write side

RasterIO(int nXOff, int nYOff, int nXSize, int nYSize, ... int nBufXSize, int nBufYSize, ...) + IRasterIO()

[...]

 

class GDADriver

Create( int nXSize, int nYSize, ... )

 

class GDALRasterBand

int nRasterXSize

int nRasterYSize

int nBlockXSize

int nBlockYSize

int nBlocksPerRow

int nBlocksPerColumn

int GetXSize()

int GetYSize()

GetBlockSize(int* pnBlockXSize, int *pnBlockYSize)

IReadBlock(int nBlockXOff, int nBlockYOff, ...)

RasterIO() + IRasterIO()

[...]

 

class GDALRasterBlock

int nXOff;

int nYOff;

int nXSize;

int nYSize;

[...]

 

All GDAL algorithms and utilities assume 2D for subsetting, redimensionning, etc...

 

In a pure ND model, this would probably be something like:

 

class GDALDataset

int nAxisCount

char* apszAxisName[nAxisCount] // "X", "Y", "Z", "T", ...

char* apszAxisUnit[nAxisCount] // "deg", "m", "secs since 1970-01-01T00:00:00Z", ...

int panAxisSize[nAxisCount] // generalizes nRasterXSize + nRasterYSize

double padfOrigin[nAxisCount] // generalizes adfGT[0] + adfGT[3]

double papadfAxisVectors[nAxisCount][nAxisCount] // generalizes other coefficients of the geotransform

IMultiDimRasterIO( const int* panWinOffsets, const size_t* panWinSizes, ...., const size_t* panBufSizes )

 

(some members could actually be only present in the driver class extending GDALDataset. GDALDataset would only have getter and setter similarly to the above GetGeoTransform / SetGeoTransform )

 

 

And with that model, we only support regular grids and skewed and rotated grids, but with constant sampling space along each axis. Which might not be enough for the N > 2 'T' or 'Z' dimensions for which I can imagine sampling to be less frequently regular than on the 2D plane.

In netCDF, you can have support for irregular sampling since a variable is indexed by dimensions, and dimensions are generally associated with a 1D variable that describe the coordinate value for each sample point along the axis/dimension.

 

So a more general model would be:

 

class GDALDataset

int nAxisCount

char* apszAxisName[nAxisCount]

char* apszAxisUnit[nAxisCount]

int panAxisSize[nAxisCount]

double papadfAxisCoordinates[nAxisCount][] where the size of the 2nd dimension of this array is the value of panAxisSize[i]

 

(That would still not support fully irregular grids where basically each intersection of the grid should have a coordinate tuple, but we probably don't need to go to that complexity.)

 

Or perhaps put all axis related stuff in a dedicated class, and with a flag to indicate regular vs irregular spacing, so as to simplify some processing in the regular spacing case

 

class GDALGridAxis

char* pszName

char* pszUnit

int nSize

bool bRegularSpacing;

double dfOrigin; // if bRegularSpacing == true

double adfVector[nAxisCount]; // if bRegularSpacing == true

double adfAxisCoordinates[nSize]; // if bRegularSpacing == false

 

 

In a transition, we'd want:

* all existing 2D functionnalities and public API to be preserved. And some rather mechanical way of converting existing driver code to the new internal API (helpers for the 2D case) since 2D only drivers are and will remain 95% of existing drivers.

* addition of a restricted set of ND functionnalities. Among the potential restrictions for a first stage: read-only support, nearest neighbour resampling. Minimum functionnality needed:

- ND read support in netCDF driver

- A base GDALRasterBand::IMultiDimRasterIO() implementation, which requires the GDALRasterBlock / cache mechanism to support ND

- We'd want some support for gdal_translate to be able to do coverage subsetting and slicing (you'd need to slice down to 2D if no drivers support ND output).

- As the VRT format is the fundamental mechanics for any non trivial gdal_translate operation (any switch besides -of and -co goes through VRT internally), it would need to be updated to support ND.

- C API and Python bindings should be updated.

 

Even

 

--

Spatialys - Geospatial professional services

http://www.spatialys.com


_______________________________________________
gdal-dev mailing list
[hidden email]
https://lists.osgeo.org/mailman/listinfo/gdal-dev
Reply | Threaded
Open this post in threaded view
|

Re: Multidimensional raster support in GDAL

Ari Jolma-2

I'd like to first know / decide what would we mean by a "multidimensional raster"?

What we now have as a raster is a dataset with one or more bands. The bands represent the data dimension and thus there is one of those (2D+1). Would we like to have more data dimensions? What Even sketches below is ND, but would we really want to have 2D + ND?

Also now there is no unit for the data dimension and the value is strictly increasing positive integer, would we like to have an explicit unit and arbitrary value (double, date, time, ...)?

Ari


Even Rouault kirjoitti 25.10.2017 klo 14:01:

Hi

 

(top posting to clearly mark the start of a new thread)

 

> I too think that multidimensional raster support would be useful.

> Besides by drastically redesigning data structures, could we get there

> incrementally?

 

One difficulty is that there are 154 raster drivers that use the current data structures.

 

2D is assumed in a number of base data structures. Non exhaustive list:

 

class GDALDataset:

int nRasterXSize

int nRasterYSize

int GetRasterXSize()

int GetRasterYSize()

GetGeoTransform(double adfGT[6]) // read side

SetGeoTransform(double adfGT[6]) // write side

RasterIO(int nXOff, int nYOff, int nXSize, int nYSize, ... int nBufXSize, int nBufYSize, ...) + IRasterIO()

[...]

 

class GDADriver

Create( int nXSize, int nYSize, ... )

 

class GDALRasterBand

int nRasterXSize

int nRasterYSize

int nBlockXSize

int nBlockYSize

int nBlocksPerRow

int nBlocksPerColumn

int GetXSize()

int GetYSize()

GetBlockSize(int* pnBlockXSize, int *pnBlockYSize)

IReadBlock(int nBlockXOff, int nBlockYOff, ...)

RasterIO() + IRasterIO()

[...]

 

class GDALRasterBlock

int nXOff;

int nYOff;

int nXSize;

int nYSize;

[...]

 

All GDAL algorithms and utilities assume 2D for subsetting, redimensionning, etc...

 

In a pure ND model, this would probably be something like:

 

class GDALDataset

int nAxisCount

char* apszAxisName[nAxisCount] // "X", "Y", "Z", "T", ...

char* apszAxisUnit[nAxisCount] // "deg", "m", "secs since 1970-01-01T00:00:00Z", ...

int panAxisSize[nAxisCount] // generalizes nRasterXSize + nRasterYSize

double padfOrigin[nAxisCount] // generalizes adfGT[0] + adfGT[3]

double papadfAxisVectors[nAxisCount][nAxisCount] // generalizes other coefficients of the geotransform

IMultiDimRasterIO( const int* panWinOffsets, const size_t* panWinSizes, ...., const size_t* panBufSizes )

 

(some members could actually be only present in the driver class extending GDALDataset. GDALDataset would only have getter and setter similarly to the above GetGeoTransform / SetGeoTransform )

 

 

And with that model, we only support regular grids and skewed and rotated grids, but with constant sampling space along each axis. Which might not be enough for the N > 2 'T' or 'Z' dimensions for which I can imagine sampling to be less frequently regular than on the 2D plane.

In netCDF, you can have support for irregular sampling since a variable is indexed by dimensions, and dimensions are generally associated with a 1D variable that describe the coordinate value for each sample point along the axis/dimension.

 

So a more general model would be:

 

class GDALDataset

int nAxisCount

char* apszAxisName[nAxisCount]

char* apszAxisUnit[nAxisCount]

int panAxisSize[nAxisCount]

double papadfAxisCoordinates[nAxisCount][] where the size of the 2nd dimension of this array is the value of panAxisSize[i]

 

(That would still not support fully irregular grids where basically each intersection of the grid should have a coordinate tuple, but we probably don't need to go to that complexity.)

 

Or perhaps put all axis related stuff in a dedicated class, and with a flag to indicate regular vs irregular spacing, so as to simplify some processing in the regular spacing case

 

class GDALGridAxis

char* pszName

char* pszUnit

int nSize

bool bRegularSpacing;

double dfOrigin; // if bRegularSpacing == true

double adfVector[nAxisCount]; // if bRegularSpacing == true

double adfAxisCoordinates[nSize]; // if bRegularSpacing == false

 

 

In a transition, we'd want:

* all existing 2D functionnalities and public API to be preserved. And some rather mechanical way of converting existing driver code to the new internal API (helpers for the 2D case) since 2D only drivers are and will remain 95% of existing drivers.

* addition of a restricted set of ND functionnalities. Among the potential restrictions for a first stage: read-only support, nearest neighbour resampling. Minimum functionnality needed:

- ND read support in netCDF driver

- A base GDALRasterBand::IMultiDimRasterIO() implementation, which requires the GDALRasterBlock / cache mechanism to support ND

- We'd want some support for gdal_translate to be able to do coverage subsetting and slicing (you'd need to slice down to 2D if no drivers support ND output).

- As the VRT format is the fundamental mechanics for any non trivial gdal_translate operation (any switch besides -of and -co goes through VRT internally), it would need to be updated to support ND.

- C API and Python bindings should be updated.

 

Even

 

--

Spatialys - Geospatial professional services

http://www.spatialys.com



_______________________________________________
gdal-dev mailing list
[hidden email]
https://lists.osgeo.org/mailman/listinfo/gdal-dev


_______________________________________________
gdal-dev mailing list
[hidden email]
https://lists.osgeo.org/mailman/listinfo/gdal-dev
Reply | Threaded
Open this post in threaded view
|

Re: Multidimensional raster support in GDAL

Even Rouault-2

On mercredi 25 octobre 2017 14:42:54 CEST Ari Jolma wrote:

> I'd like to first know / decide what would we mean by a

> "multidimensional raster"?

 

The current rasters handled by GDAL are already multidimensional, with the number of dimensions being fixed to 2, and being 2 horizontal dimensions (X/Y, lon/lat). The idea here would be to at least address the 3D (X,Y,Z / X,Y,T) and 4D case (X,Y,Z,T), and when you need to do that, arbitrary number of dimensions is at hand.

 

The dimensions/ones here are mostly spatio-temporal ones, along which you can measure a physical phenomenon (a temperature field measures at different position, elevation, time)

 

>

> What we now have as a raster is a dataset with one or more bands. The

> bands represent the data dimension and thus there is one of those

> (2D+1).

 

Bands do not represent a dimension. In OGC SIS

( http://docs.opengeospatial.org/is/09-146r6/09-146r6.html ) or WCS, bands are called fields of a rangeType.

Bands are different from the above mentionned dimensions in which they do not really consider an axis along which you have some consistency. Along the X/Y axis, you can mark ticks every n meters. Along the "band axis", you will have a temperature field in Kelvin, a wind direction in degrees, etc.

 

> Would we like to have more data dimensions? What Even sketches

> below is ND, but would we really want to have 2D + ND?

 

My scetch of ND intended to capture the current 2D case, as a particular case.

 

>

> Also now there is no unit for the data dimension and the value is

> strictly increasing positive integer,

 

> would we like to have an explicit

> unit

 

Was the GDALDataset::char* apszAxisUnit[nAxisCount] or GDALGridAxis::pszUnit in my scetch

 

> and arbitrary value (double, date, time, ...)?

 

In my scetch, I had indeed some hesitation about the appropriate type for adfAxisCoordinates array. I think that for the cases mentionned above (4D spatio-temporal) we can always represent the coordinates along each axis as a double value. The case where that's the less obvious is for date/time, but using some reference like the Unix Epoch.

 

 

Even

 

--

Spatialys - Geospatial professional services

http://www.spatialys.com


_______________________________________________
gdal-dev mailing list
[hidden email]
https://lists.osgeo.org/mailman/listinfo/gdal-dev
Reply | Threaded
Open this post in threaded view
|

Re: Multidimensional raster support in GDAL

Ari Jolma-2

Even Rouault kirjoitti 25.10.2017 klo 22:22:

On mercredi 25 octobre 2017 14:42:54 CEST Ari Jolma wrote:

> I'd like to first know / decide what would we mean by a

> "multidimensional raster"?

 

The current rasters handled by GDAL are already multidimensional


I was aiming at the 'multidimensional raster' to a case where the location (raster cell) data is multidimensional. You get multidimensional data whenever you have an axis along which you can do measurements (time, depth, electromagnetic spectrum, simulation number, etc). I'd take the location as the basis that probably shouldn't need extending for GDAL. Or do we want to support taking X,Y,T data from a format that supports it and slice it to X,T for another format? Maybe that makes sense for some visualization and analysis purposes, but is that what GDAL should support at some point? Are there compelling use cases?

Would it be possible to keep X,Y as it is and add support for multidimensional data as an extension?

Ari


, with the number of dimensions being fixed to 2, and being 2 horizontal dimensions (X/Y, lon/lat). The idea here would be to at least address the 3D (X,Y,Z / X,Y,T) and 4D case (X,Y,Z,T), and when you need to do that, arbitrary number of dimensions is at hand.

 

The dimensions/ones here are mostly spatio-temporal ones, along which you can measure a physical phenomenon (a temperature field measures at different position, elevation, time)

 

>

> What we now have as a raster is a dataset with one or more bands. The

> bands represent the data dimension and thus there is one of those

> (2D+1).

 

Bands do not represent a dimension. In OGC SIS

( http://docs.opengeospatial.org/is/09-146r6/09-146r6.html ) or WCS, bands are called fields of a rangeType.

Bands are different from the above mentionned dimensions in which they do not really consider an axis along which you have some consistency. Along the X/Y axis, you can mark ticks every n meters. Along the "band axis", you will have a temperature field in Kelvin, a wind direction in degrees, etc.

 

> Would we like to have more data dimensions? What Even sketches

> below is ND, but would we really want to have 2D + ND?

 

My scetch of ND intended to capture the current 2D case, as a particular case.

 

>

> Also now there is no unit for the data dimension and the value is

> strictly increasing positive integer,

 

> would we like to have an explicit

> unit

 

Was the GDALDataset::char* apszAxisUnit[nAxisCount] or GDALGridAxis::pszUnit in my scetch

 

> and arbitrary value (double, date, time, ...)?

 

In my scetch, I had indeed some hesitation about the appropriate type for adfAxisCoordinates array. I think that for the cases mentionned above (4D spatio-temporal) we can always represent the coordinates along each axis as a double value. The case where that's the less obvious is for date/time, but using some reference like the Unix Epoch.

 

 

Even

 

--

Spatialys - Geospatial professional services

http://www.spatialys.com



_______________________________________________
gdal-dev mailing list
[hidden email]
https://lists.osgeo.org/mailman/listinfo/gdal-dev
Reply | Threaded
Open this post in threaded view
|

Re: Multidimensional raster support in GDAL

Edzer Pebesma-2
In reply to this post by Even Rouault-2


On 10/25/2017 09:22 PM, Even Rouault wrote:

> On mercredi 25 octobre 2017 14:42:54 CEST Ari Jolma wrote:
>
>> I'd like to first know / decide what would we mean by a
>
>> "multidimensional raster"?
>
>  
>
> The current rasters handled by GDAL are already multidimensional, with
> the number of dimensions being fixed to 2, and being 2 horizontal
> dimensions (X/Y, lon/lat). The idea here would be to at least address
> the 3D (X,Y,Z / X,Y,T) and 4D case (X,Y,Z,T), and when you need to do
> that, arbitrary number of dimensions is at hand.
>
>  
>
> The dimensions/ones here are mostly spatio-temporal ones, along which
> you can measure a physical phenomenon (a temperature field measures at
> different position, elevation, time)
>
>  
>
>>
>
>> What we now have as a raster is a dataset with one or more bands. The
>
>> bands represent the data dimension and thus there is one of those
>
>> (2D+1).
>
>  
>
> Bands do not represent a dimension. In OGC SIS
>
> ( http://docs.opengeospatial.org/is/09-146r6/09-146r6.html ) or WCS,
> bands are called fields of a rangeType.
>
> Bands are different from the above mentionned dimensions in which they
> do not really consider an axis along which you have some consistency.
> Along the X/Y axis, you can mark ticks every n meters. Along the "band
> axis", you will have a temperature field in Kelvin, a wind direction in
> degrees, etc.
>

I think band is a dimension in the implementation in GDAL because for
every (x,y,band) we give access to a pixel; if it were not a dimension
but an attribute, we would for each pixel (x,y) give access to a record
with named fields e.g. { red, green, blue }.

In the case where bands reflect colors (or wavelength) they work well as
a dimension, since they can be uniquely ordered. Like elevation in
weather data they may be irregularly spaced; in practice, they refer to
wavelength intervals rather than point values.

In the case where there is no such ordering, e.g. where bands reflect
raster layers with abundance of different plant species, we can still
use it as a dimension to solve a problem, but we can't do meaningful
operations on it that assume ordering: you can smooth a spectral curve
or depth profile to reduce noise, but doing this over an arbitrarily
ordered set of species doesn't make sense.


>  
>
>> Would we like to have more data dimensions? What Even sketches
>
>> below is ND, but would we really want to have 2D + ND?
>
>  
>
> My scetch of ND intended to capture the current 2D case, as a particular
> case.
>
>  
>
>>
>
>> Also now there is no unit for the data dimension and the value is
>
>> strictly increasing positive integer,
>
>  
>
>> would we like to have an explicit
>
>> unit
>
>  
>
> Was the GDALDataset::char* apszAxisUnit[nAxisCount] or
> GDALGridAxis::pszUnit in my scetch
>
>  
>
>> and arbitrary value (double, date, time, ...)?
>
>  
>
> In my scetch, I had indeed some hesitation about the appropriate type
> for adfAxisCoordinates array. I think that for the cases mentionned
> above (4D spatio-temporal) we can always represent the coordinates along
> each axis as a double value. The case where that's the less obvious is
> for date/time, but using some reference like the Unix Epoch.
>
>  
>
>  
>
> Even
>
>  
>
> --
>
> Spatialys - Geospatial professional services
>
> http://www.spatialys.com
>
>
>
> _______________________________________________
> gdal-dev mailing list
> [hidden email]
> https://lists.osgeo.org/mailman/listinfo/gdal-dev
>

--
Edzer Pebesma
Institute for Geoinformatics
Heisenbergstrasse 2, 48151 Muenster, Germany
Phone: +49 251 8333081
_______________________________________________
gdal-dev mailing list
[hidden email]
https://lists.osgeo.org/mailman/listinfo/gdal-dev
Reply | Threaded
Open this post in threaded view
|

Re: Multidimensional raster support in GDAL

Edzer Pebesma-2
In reply to this post by Ari Jolma-2


On 10/26/2017 08:05 AM, Ari Jolma wrote:

> Even Rouault kirjoitti 25.10.2017 klo 22:22:
>
>> On mercredi 25 octobre 2017 14:42:54 CEST Ari Jolma wrote:
>>
>> > I'd like to first know / decide what would we mean by a
>>
>> > "multidimensional raster"?
>>
>>  
>>
>> The current rasters handled by GDAL are already multidimensional
>>
>
> I was aiming at the 'multidimensional raster' to a case where the
> location (raster cell) data is multidimensional. You get
> multidimensional data whenever you have an axis along which you can do
> measurements (time, depth, electromagnetic spectrum, simulation number,
> etc). I'd take the location as the basis that probably shouldn't need
> extending for GDAL. Or do we want to support taking X,Y,T data from a
> format that supports it and slice it to X,T for another format? Maybe
> that makes sense for some visualization and analysis purposes, but is
> that what GDAL should support at some point? Are there compelling use cases?

I'm thinking of handling and analysing multi-dimensional spatio-temporal
imagery-like array data in general, of which there are many examples.

Two big use case groups: detecting change from remote sensing image time
series, and analysing climate model predictions.

>
> Would it be possible to keep X,Y as it is and add support for
> multidimensional data as an extension?

I think yes, and Even's proposal of yesterday sketches the path for this
(I still need more time to understand it).

More in general, we have multidimensional arrays where horizontal space
is not an x y image (GDAL raster band) but for instance a sequence of
feature geometries (e.g., in situ sensor data, or demographic data by
administrative region, time, and age class). We may want to keep those
use cases in mind when working on this from the GDAL perspective, since
pretty soon you will want to cut out a point or region, or aggregate
grid values over a polygon area while keeping all other dimensions.

>
> Ari
>
>
>> , with the number of dimensions being fixed to 2, and being 2
>> horizontal dimensions (X/Y, lon/lat). The idea here would be to at
>> least address the 3D (X,Y,Z / X,Y,T) and 4D case (X,Y,Z,T), and when
>> you need to do that, arbitrary number of dimensions is at hand.
>>
>>  
>>
>> The dimensions/ones here are mostly spatio-temporal ones, along which
>> you can measure a physical phenomenon (a temperature field measures at
>> different position, elevation, time)
>>
>>  
>>
>> >
>>
>> > What we now have as a raster is a dataset with one or more bands. The
>>
>> > bands represent the data dimension and thus there is one of those
>>
>> > (2D+1).
>>
>>  
>>
>> Bands do not represent a dimension. In OGC SIS
>>
>> ( http://docs.opengeospatial.org/is/09-146r6/09-146r6.html ) or WCS,
>> bands are called fields of a rangeType.
>>
>> Bands are different from the above mentionned dimensions in which they
>> do not really consider an axis along which you have some consistency.
>> Along the X/Y axis, you can mark ticks every n meters. Along the "band
>> axis", you will have a temperature field in Kelvin, a wind direction
>> in degrees, etc.
>>
>>  
>>
>> > Would we like to have more data dimensions? What Even sketches
>>
>> > below is ND, but would we really want to have 2D + ND?
>>
>>  
>>
>> My scetch of ND intended to capture the current 2D case, as a
>> particular case.
>>
>>  
>>
>> >
>>
>> > Also now there is no unit for the data dimension and the value is
>>
>> > strictly increasing positive integer,
>>
>>  
>>
>> > would we like to have an explicit
>>
>> > unit
>>
>>  
>>
>> Was the GDALDataset::char* apszAxisUnit[nAxisCount] or
>> GDALGridAxis::pszUnit in my scetch
>>
>>  
>>
>> > and arbitrary value (double, date, time, ...)?
>>
>>  
>>
>> In my scetch, I had indeed some hesitation about the appropriate type
>> for adfAxisCoordinates array. I think that for the cases mentionned
>> above (4D spatio-temporal) we can always represent the coordinates
>> along each axis as a double value. The case where that's the less
>> obvious is for date/time, but using some reference like the Unix Epoch.
>>
>>  
>>
>>  
>>
>> Even
>>
>>  
>>
>> --
>>
>> Spatialys - Geospatial professional services
>>
>> http://www.spatialys.com
>>
>
>
>
> _______________________________________________
> gdal-dev mailing list
> [hidden email]
> https://lists.osgeo.org/mailman/listinfo/gdal-dev
>

--
Edzer Pebesma
Institute for Geoinformatics
Heisenbergstrasse 2, 48151 Muenster, Germany
Phone: +49 251 8333081
_______________________________________________
gdal-dev mailing list
[hidden email]
https://lists.osgeo.org/mailman/listinfo/gdal-dev
Reply | Threaded
Open this post in threaded view
|

Re: Multidimensional raster support in GDAL

Ari Jolma-2
Edzer Pebesma kirjoitti 26.10.2017 klo 10:26:
> More in general, we have multidimensional arrays where horizontal space
> is not an x y image (GDAL raster band) but for instance a sequence of
> feature geometries (e.g., in situ sensor data, or demographic data by
> administrative region, time, and age class). We may want to keep those
> use cases in mind when working on this from the GDAL perspective, since
> pretty soon you will want to cut out a point or region, or aggregate
> grid values over a polygon area while keeping all other dimensions.

With vector data the case is different, yes. You may have 'a thing'
(say, a province), whose spatial representation changes significantly in
time along with its characteristics (say, population). Or you may have a
moving sensor.

But with raster data, I'd stick very strongly to the notion that the
spatial representation is immutable. The only thing that may be
important(? I may disregard something here) is that when you do
reprojections, then the area that the cell represents changes. Is this
too strong statement?

Ari

_______________________________________________
gdal-dev mailing list
[hidden email]
https://lists.osgeo.org/mailman/listinfo/gdal-dev
Reply | Threaded
Open this post in threaded view
|

Re: Multidimensional raster support in GDAL

Edzer Pebesma-2


On 10/26/2017 09:39 AM, Ari Jolma wrote:

> Edzer Pebesma kirjoitti 26.10.2017 klo 10:26:
>> More in general, we have multidimensional arrays where horizontal space
>> is not an x y image (GDAL raster band) but for instance a sequence of
>> feature geometries (e.g., in situ sensor data, or demographic data by
>> administrative region, time, and age class). We may want to keep those
>> use cases in mind when working on this from the GDAL perspective, since
>> pretty soon you will want to cut out a point or region, or aggregate
>> grid values over a polygon area while keeping all other dimensions.
>
> With vector data the case is different, yes. You may have 'a thing'
> (say, a province), whose spatial representation changes significantly in
> time along with its characteristics (say, population). Or you may have a
> moving sensor.

For me, the static vs dynamic space discretization choice does not align
with the vector vs. raster choice, but for the case of representing
spatiotemporal phenomena with dense arrays, which is what we discuss
here, we'd better stick to static space discretization.

>
> But with raster data, I'd stick very strongly to the notion that the
> spatial representation is immutable. The only thing that may be

Yes.

> important(? I may disregard something here) is that when you do
> reprojections, then the area that the cell represents changes. Is this
> too strong statement?

That's a different discussion, really. But an interesting one!

>
> Ari
>
> _______________________________________________
> gdal-dev mailing list
> [hidden email]
> https://lists.osgeo.org/mailman/listinfo/gdal-dev

--
Edzer Pebesma
Institute for Geoinformatics
Heisenbergstrasse 2, 48151 Muenster, Germany
Phone: +49 251 8333081
_______________________________________________
gdal-dev mailing list
[hidden email]
https://lists.osgeo.org/mailman/listinfo/gdal-dev
Reply | Threaded
Open this post in threaded view
|

Re: Multidimensional raster support in GDAL

Even Rouault-2
In reply to this post by Edzer Pebesma-2

>

> I think band is a dimension in the implementation in GDAL

 

I disagree with that. There's clearly a GDALRasterBand object to model that.

 

> because for

> every (x,y,band) we give access to a pixel; if it were not a dimension

> but an attribute, we would for each pixel (x,y) give access to a record

> with named fields e.g. { red, green, blue }.

 

This could be done, but wouldn't be really practical given that bands can have different data types. But fundamentally band/field/variables is a concept that is different from the dimensions that index it. This is really clear in the related standards (WCS, CIS), and in the format implementations (netCDF), so I think we shouldn't "merge" the band concept as an extra dimension.

 

If you have variable[x][y][z][t], then whatever the value of x, y, z, t the units, data type and semantics of variable[][][][] is the same

 

whereas dataset[x][y][band] doesn't have that property when you change band.

 

>

> In the case where bands reflect colors (or wavelength) they work well as

> a dimension, since they can be uniquely ordered. Like elevation in

> weather data they may be irregularly spaced; in practice, they refer to

> wavelength intervals rather than point values.

>

> In the case where there is no such ordering, e.g. where bands reflect

> raster layers with abundance of different plant species, we can still

> use it as a dimension to solve a problem

 

I see your point, but that's more a particular case, than something on which we can base an abstraction on. Note also that in some datasets the bands are not ordered in a logical fashion. For example in a GeoTIFF file, it is not uncommon to find R, G, B, Near-Infrared in that order.

And if you look at Sentinel-2 dataset, all bands are not sampled at the same resolutions, so band couldn't be used as an index through the whole band set.

 

Even

 

--

Spatialys - Geospatial professional services

http://www.spatialys.com


_______________________________________________
gdal-dev mailing list
[hidden email]
https://lists.osgeo.org/mailman/listinfo/gdal-dev
Reply | Threaded
Open this post in threaded view
|

Re: Multidimensional raster support in GDAL

Even Rouault-2
In reply to this post by Ari Jolma-2

On jeudi 26 octobre 2017 09:05:27 CEST Ari Jolma wrote:

> Even Rouault kirjoitti 25.10.2017 klo 22:22:

> > On mercredi 25 octobre 2017 14:42:54 CEST Ari Jolma wrote:

> > > I'd like to first know / decide what would we mean by a

> > >

> > > "multidimensional raster"?

> >

> > The current rasters handled by GDAL are already multidimensional

>

> I was aiming at the 'multidimensional raster' to a case where the

> location (raster cell) data is multidimensional. You get

> multidimensional data whenever you have an axis along which you can do

> measurements (time, depth, electromagnetic spectrum, simulation number,

> etc). I'd take the location as the basis that probably shouldn't need

> extending for GDAL.

 

> Or do we want to support taking X,Y,T data from a

> format that supports it and slice it to X,T for another format? Maybe

> that makes sense for some visualization and analysis purposes, but is

> that what GDAL should support at some point? Are there compelling use cases?

 

I don't have an opinon about slicing X,Y,T to X,T

 

>

> Would it be possible to keep X,Y as it is and add support for

> multidimensional data as an extension?

 

That could be indeed an option if we are sure that the 2 first dimensions are X and Y, and we want only to slice higher dimensionnal variables to X,Y.

 

So instead of having a general purpose indexing mechanism as I scetched initially an option would be to expose the extra dimensions/axis, their bounds. The user could set the values at which to set the values at which those extra dimensions must be considered, so as to get a fully constrainted X,Y indexed field.

The main advantage of this is that it would have presumably no impact on 2D-only drivers.

 

Even

 

--

Spatialys - Geospatial professional services

http://www.spatialys.com


_______________________________________________
gdal-dev mailing list
[hidden email]
https://lists.osgeo.org/mailman/listinfo/gdal-dev
Reply | Threaded
Open this post in threaded view
|

Re: Multidimensional raster support in GDAL

Even Rouault-2
In reply to this post by Edzer Pebesma-2

> More in general, we have multidimensional arrays where horizontal space

> is not an x y image (GDAL raster band) but for instance a sequence of

> feature geometries (e.g., in situ sensor data, or demographic data by

> administrative region, time, and age class).

 

I didn't grasp what you meant there. This doesn't sound like gridded data, does it ? Can you point to such datasets ?

 

--

Spatialys - Geospatial professional services

http://www.spatialys.com


_______________________________________________
gdal-dev mailing list
[hidden email]
https://lists.osgeo.org/mailman/listinfo/gdal-dev
Reply | Threaded
Open this post in threaded view
|

Re: Multidimensional raster support in GDAL

Ari Jolma-2
In reply to this post by Even Rouault-2
Even Rouault kirjoitti 26.10.2017 klo 20:17:
 

>

> Would it be possible to keep X,Y as it is and add support for

> multidimensional data as an extension?

 

That could be indeed an option if we are sure that the 2 first dimensions are X and Y, and we want only to slice higher dimensionnal variables to X,Y.


Assuming we have a X,Y,Z,T dataset of sea temperatures (Z is depth), we may want to warp it from projection A to B and then select a location and look a Z,T image from that location. Warping would work on X,Y and compute the Z,T as an extension what it now does for the pixel values (possibly multiple bands). I could imagine that the user would want to use GDAL for slicing the Z,T into an image - GDAL can write image files, so why not?

So perhaps the assumption for only slicing to X,Y is not ok.

 

So instead of having a general purpose indexing mechanism as I scetched initially an option would be to expose the extra dimensions/axis, their bounds. The user could set the values at which to set the values at which those extra dimensions must be considered, so as to get a fully constrainted X,Y indexed field.

The main advantage of this is that it would have presumably no impact on 2D-only drivers.


I assume that would be very good from the point of view of actual feasibility of the whole project.

Ari


_______________________________________________
gdal-dev mailing list
[hidden email]
https://lists.osgeo.org/mailman/listinfo/gdal-dev
Reply | Threaded
Open this post in threaded view
|

Re: Multidimensional raster support in GDAL

Even Rouault-2

> > The main advantage of this is that it would have presumably no impact

> > on 2D-only drivers.

>

> I assume that would be very good from the point of view of actual

> feasibility of the whole project.

 

Yes, although the generic approach could probably be more doable by adding a few conveniency methods to make the transition easier for 2D-only drivers.

 

For example, currently you'll find in a typical Open() method

 

poDS->nRasterXSize = x

poDS->nRasterYSize = y

 

In a generic multidimensional approach, this could be replaced by a

 

poDS->SetXYSize( x, y )

 

that would take care of setting nAxisCount = 2 and initializing the 2 axis.

 

Even

 

--

Spatialys - Geospatial professional services

http://www.spatialys.com


_______________________________________________
gdal-dev mailing list
[hidden email]
https://lists.osgeo.org/mailman/listinfo/gdal-dev
Reply | Threaded
Open this post in threaded view
|

Re: Multidimensional raster support in GDAL

Edzer Pebesma-2
In reply to this post by Even Rouault-2


On 10/26/2017 07:25 PM, Even Rouault wrote:

>> More in general, we have multidimensional arrays where horizontal space
>
>> is not an x y image (GDAL raster band) but for instance a sequence of
>
>> feature geometries (e.g., in situ sensor data, or demographic data by
>
>> administrative region, time, and age class).
>
>  
>
> I didn't grasp what you meant there. This doesn't sound like gridded
> data, does it ? Can you point to such datasets ?

Not gridded in space, but gridded in spacetime. Examples would be, for
point geometries the EEA air quality database [1], where static station
properties come in one file, and measured air quality measurements (time
series) come in other files, for many parameters; an example where
geometries are regions would be any socio-economic data; concrete
example [2] for each county you'd have a time series of population
counts by age class (2-dimensional array for each county).

[1]
https://www.eea.europa.eu/data-and-maps/data/airbase-the-european-air-quality-database-7
[2] https://www.census.gov/data/tables/2016/demo/popest/counties-total.html

Also look for OLAP cube; the concept is quite old, and not restricted to
spatially gridded data. But I'm happy to restrict the discussion here
(first) to spatially gridded data.

--
Edzer Pebesma
Institute for Geoinformatics
Heisenbergstrasse 2, 48151 Muenster, Germany
Phone: +49 251 8333081
_______________________________________________
gdal-dev mailing list
[hidden email]
https://lists.osgeo.org/mailman/listinfo/gdal-dev