# [gdal-dev] Raster statistics

4 messages

## [gdal-dev] Raster statistics

I have a drone raster file which I want to use for some calculations. Before the calculation, I need to drop some extreme values. I want to do something like a percentile calculation where you get all values, order them, and drop the top 10%. For this, I need to get all values first, which can be slow with a large file.

I looked at the statistics (band.GetStatistics) but that doesn't work well. I thought I could use the mean plus two standard deviations to cover roughly 97%. But with these statistics:

    STATISTICS_MAXIMUM=33.186080932617
    STATISTICS_MEAN=24.840205979603
    STATISTICS_MINIMUM=1.5951598882675
    STATISTICS_STDDEV=4.7285348016053

mean + 2 * stddev is larger than the maximum.

So I moved to the histogram. It is also very fast, but I'm not sure how to use it. I have this:

    256 buckets from 1.53322 to 33.248:
    410 77 66 66 65 58 56 45 42 87 57 72 61 65 68 70 73 82 93 ...

Does "bucket 1 = 410" mean that I have 410 pixels of value 1.53322, and does the second bucket mean I have 77 pixels between 1.53322 and 1.657?

    1.657 = 1.53322 + ((33.248 - 1.53322) / 256)

Is this a good approach? Or can/should I use a different one?

Thanks,
Paul

_______________________________________________
gdal-dev mailing list
[hidden email]
https://lists.osgeo.org/mailman/listinfo/gdal-dev
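The bucket arithmetic above can be checked directly. A minimal sketch, assuming GDAL's equal-width buckets spanning the reported min/max: each bucket covers range/256, so the first count of 410 covers the whole interval [1.53322, 1.65711), not the single value 1.53322.

```python
# Equal-width histogram buckets: width = (max - min) / n_buckets
vmin, vmax, n_buckets = 1.53322, 33.248, 256
width = (vmax - vmin) / n_buckets          # ~0.12389 per bucket

# Bucket i covers [vmin + i*width, vmin + (i+1)*width)
first_bucket = (vmin, vmin + width)        # the 410 pixels fall in this range
second_bucket = (vmin + width, vmin + 2 * width)

print(round(width, 5))            # 0.12389
print(round(first_bucket[1], 5))  # 1.65711
```

So the 1.657 in the question is the upper edge of the *first* bucket, and the 77 pixels of the second bucket lie between 1.65711 and 1.78099.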

## Re: Raster statistics

I would not use gdal for this particular task. I presume you have the band data in a 2D numpy array. Then I'd get the 80th percentile, for example, with np.percentile() and use a boolean expression to generate a mask for the array (droneraster > perc80value).

Chris

--
Christine (Chris) Waigl - [hidden email] - +1-907-474-5483 - Skype: cwaigl_work
Geophysical Institute, UAF, 903 Koyukuk Drive, Fairbanks, AK 99775-7320, USA

On Aug 3, 2017, at 5:43 AM, Paul Meems <[hidden email]> wrote:
> [quoted message snipped]
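Chris's suggestion can be sketched as follows. The `droneraster` array here is a stand-in with made-up values; in practice it would come from reading the band into a 2D numpy array (e.g. via the GDAL Python bindings' `band.ReadAsArray()`).

```python
import numpy as np

# Stand-in for the band data: 100 pixel values from 1 to 100
droneraster = np.arange(1.0, 101.0).reshape(10, 10)

perc90 = np.percentile(droneraster, 90)   # value below which ~90% of pixels fall
mask = droneraster > perc90               # True where pixels are in the top 10%
trimmed = droneraster[~mask]              # pixel values with the extremes dropped

print(trimmed.size)  # 90
```

The mask can also be kept and applied later, e.g. to zero out or nodata-fill the extreme pixels instead of discarding them.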

## Re: Raster statistics

Thanks Chris for your reply. I forgot to mention I'm not using GDAL with Python. I use it with C++ and/or C#.

Paul

Paul Meems
Release manager, configuration manager and forum moderator of MapWindow GIS.
www.mapwindow.org
Owner of MapWindow.nl - Support for Dutch speaking users.
www.mapwindow.nl

The MapWindow GIS project has moved to GitHub! Download the latest MapWindow 5 open source desktop application.

2017-08-03 20:05 GMT+02:00 Chris Waigl:
> [quoted message snipped]
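Since numpy isn't available from C++/C#, the same cutoff can also be approximated from the histogram alone, without reading every pixel: accumulate bucket counts until the target fraction of pixels is reached and take that bucket's upper edge. This is a language-neutral sketch (shown in Python, but the logic ports directly to C# with the counts from GetHistogram); `histogram_percentile` is a hypothetical helper name, and the result is only accurate to one bucket width.

```python
def histogram_percentile(counts, vmin, vmax, pct):
    """Approximate the pct-th percentile from equal-width histogram counts.

    Returns the upper edge of the bucket in which the cumulative count
    first reaches pct percent of all pixels (accurate to one bucket width).
    """
    total = sum(counts)
    width = (vmax - vmin) / len(counts)
    target = total * pct / 100.0
    cumulative = 0
    for i, count in enumerate(counts):
        cumulative += count
        if cumulative >= target:
            return vmin + (i + 1) * width
    return vmax

# Toy example: 4 equal buckets over [0, 4) with 10 pixels each;
# half of the pixels lie at or below the end of the second bucket.
print(histogram_percentile([10, 10, 10, 10], 0.0, 4.0, 50))  # 2.0
```

With 256 buckets over the ~31.7-value range above, the bucket width is about 0.124, which is likely precise enough for clipping the top 10%; if not, a second, finer histogram over just the top buckets' range would narrow it down.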