error while using i.segment.stats

Veronica Andreo
Hi,

We are trying to run i.segment.stats on a map with 800000+ segments, and on two different laptops we get:

bands=`g.list rast pat=IGUAZU_IMG_* sep=,`
RASTER_STATS=(min,max,range,mean,stddev,median,first_quart,third_quart,perc_90)
AREA_STATS=(area,perimeter,compact_circle,compact_square,fd)

i.segment.stats -rc \
                 map=segments_full_region \
                 rasters=$bands \
                 raster_statistics=$RASTER_STATS \
                 area_measures=$AREA_STATS \
                 vectormap=segs_stats_map \
                 processes=4
Calculating geometry statistics...
Calculating statistics for raster maps...
Exception in thread Thread-3:
Traceback (most recent call last):
  File "/usr/lib/python3.6/threading.py", line 916, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.6/threading.py", line 864, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/lib/python3.6/multiprocessing/pool.py", line 463, in _handle_results
    task = get()
  File "/usr/lib/python3.6/multiprocessing/connection.py", line 251, in recv
    return _ForkingPickler.loads(buf.getbuffer())
TypeError: __init__() missing 3 required positional arguments: 'module', 'code', and 'returncode'

Does it have to do with memory? I used the module a month ago with 500000+ segments and it worked just fine...

Any hints are more than welcome

Best,
Vero

_______________________________________________
grass-user mailing list
[hidden email]
https://lists.osgeo.org/mailman/listinfo/grass-user

Re: error while using i.segment.stats

Moritz Lennert
Hi Vero,

On Mon, 2 Dec 2019 13:43:30 +0100,
Veronica Andreo <[hidden email]> wrote:

> Does it have to do with memory? I used the module a month ago with
> 500000+ segments and it worked just fine...

I don't think memory is the issue, but I find the error message
pretty cryptic, so I wouldn't exclude it altogether. Could it be some
difference between Python 2 and 3 in the multiprocessing module? Would
it be possible for you to try running it in Python 2?

Maybe you could also try to run it on a smaller subset of the
segmentation result?
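
One way a TypeError of this shape can arise: multiprocessing sends a worker's exception back to the parent process by pickling it, and an exception is rebuilt by calling its class with the contents of its args tuple. If the class requires positional arguments that never reached args, the rebuild fails in the parent and the TypeError masks whatever really went wrong in the worker (which could still be a memory problem). The sketch below reproduces that mechanism with a hypothetical ModuleFailure class, a stand-in and not GRASS code; whether this is what happened inside i.segment.stats is an assumption.

```python
class ModuleFailure(Exception):
    """Hypothetical stand-in for an error with three required arguments."""

    def __init__(self, module, code, returncode):
        # Nothing is passed on to Exception.__init__, so self.args stays
        # empty even though the three values are kept as attributes.
        super().__init__()
        self.module = module
        self.code = code
        self.returncode = returncode


def rebuild(exc):
    """Recreate an exception the way unpickling does: cls(*args)."""
    cls, args = exc.__reduce__()[:2]
    return cls(*args)


original = ModuleFailure("r.univar", "r.univar -e -t", 1)
message = None
try:
    rebuild(original)
except TypeError as err:
    # The rebuild fails before the original error can be reported.
    message = str(err)

print(message)
```

If this is the cause, running the failing step with a single process, or looking for an earlier error message from the module that was called, may reveal the underlying failure.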

Moritz


--
Département Géosciences, Environnement et Société
Université Libre de Bruxelles
Bureau: S.DB.6.138
CP 130/03
Av. F.D. Roosevelt 50
1050 Bruxelles
Belgique

tél. + 32 2 650.68.12 / 68.11 (secr.)
fax  + 32 2 650.68.30

Re: error while using i.segment.stats

Veronica Andreo
Hi Moritz,

Thanks for your answer :)

On smaller subsets (one per cutline tile), it works just fine.
I will try to run it with grass76 later today or tomorrow.

Best,
Vero

On Tue, 3 Dec 2019 at 12:41, Moritz Lennert (<[hidden email]>) wrote:
> Maybe you could also try to run it on a smaller subset of the
> segmentation result?


Re: error while using i.segment.stats

Moritz Lennert
On 3/12/19 13:26, Veronica Andreo wrote:
> Hi Moritz,
>
> Thanks for your answer :)
>
> In smaller subsets (per cutlines tiles), it works just fine.

With grass78 using Python 3? Then it's probably not a Python version issue.

What is the region size? When calculating the raster statistics,
i.segment.stats uses r.univar with extended statistics (-e flag). If the
region is too big, you may actually be right that this leads to memory
errors when it tries to run 4 instances in parallel.

Unfortunately, r.univar does not have a memory option to limit
memory usage. Maybe i.segment.stats should check the region size and
available memory and bail out if there's not enough memory. I'm not sure
how to calculate the needed memory size, though (especially since
r.univar's -t flag is also set).

You could also try to reduce parallelization, i.e. run it with two
processes only. It will obviously be slower.
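
The memory check suggested above, combined with lowering the number of processes, might look roughly like this. The helper name max_safe_processes and the 8-bytes-per-cell figure are assumptions for illustration: the sketch supposes that each r.univar -e instance may need to hold every cell of the region as a double, which is a pessimistic guess and not documented behaviour.

```python
def max_safe_processes(cells, available_bytes, requested, bytes_per_cell=8):
    """Return how many parallel instances fit in memory (0 means bail out)."""
    per_process = cells * bytes_per_cell  # assumed worst case per instance
    fitting = available_bytes // per_process
    return max(0, min(requested, fitting))


GIB = 2**30
# Hypothetical example: a region of 100 million cells on a 2 GiB machine.
print(max_safe_processes(100_000_000, 2 * GIB, requested=4))
```

A result of 0 would be the signal to stop early with a clear message instead of failing halfway through.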

Moritz



Re: error while using i.segment.stats

Veronica Andreo
Hi Moritz,

Thanks for coming back to this topic :)

Here are the original region settings:

g.region -p
projection: 1 (UTM)
zone:       -21
datum:      wgs84
ellipsoid:  wgs84
north:      7168182
south:      7157384
west:       737450
east:       750236
nsres:      0.5
ewres:      0.5
rows:       21596
cols:       25572
cells:      552252912

Because of the error we reduced it by ~half, but still got the same error...

[..]
rows:       20478
cols:       13553
cells:      277538334
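
To put these cell counts in perspective, here is a back-of-the-envelope estimate. It assumes, pessimistically, that each of the 4 parallel r.univar -e instances keeps every cell of the region in memory as an 8-byte double. Real usage depends on the raster type and on how r.univar allocates memory, so treat the numbers as an order of magnitude only.

```python
def estimate_gib(rows, cols, processes=4, bytes_per_cell=8):
    """Rough upper bound in GiB under the stated assumption."""
    return rows * cols * bytes_per_cell * processes / 2**30


full = estimate_gib(21596, 25572)     # original region
reduced = estimate_gib(20478, 13553)  # region reduced by about half

print(f"full region:    {full:.1f} GiB")
print(f"reduced region: {reduced:.1f} GiB")
```

Under that assumption the estimate is roughly 16 GiB for the full region and 8 GiB for the reduced one, which would be consistent with the error persisting on a laptop even after halving the region.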

In the end, we solved it by extracting the stats per tile (irregular tiles defined by cutlines) and then patching the results together.
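
The per-tile workaround amounts to computing statistics for each tile separately and then combining the results. A minimal sketch of the combining step follows, assuming each tile's statistics are available as a mapping from segment id to a dictionary of values and that segment ids are unique across tiles. The function name and data layout are hypothetical, and the thread does not say which tool was actually used for patching.

```python
def patch_tile_stats(tile_tables):
    """Merge per-tile {segment_id: stats} tables into a single table."""
    merged = {}
    for tile_name, table in tile_tables.items():
        for segment_id, stats in table.items():
            if segment_id in merged:
                # A segment split across tiles would need its statistics
                # recomputed, not just copied, so refuse to guess.
                raise ValueError(
                    f"segment {segment_id} appears in more than one tile "
                    f"(seen again in {tile_name})"
                )
            merged[segment_id] = stats
    return merged


# Made-up values for two tiles.
tiles = {
    "tile_1": {1: {"mean": 10.5}, 2: {"mean": 7.0}},
    "tile_2": {3: {"mean": 12.25}},
}
merged = patch_tile_stats(tiles)
print(sorted(merged))
```

Because each tile covers far fewer cells than the full region, every individual run stays small, at the cost of an extra merge step.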

best,
Vero


On Thu, 5 Dec 2019 at 9:35, Moritz Lennert (<[hidden email]>) wrote:
> What is the region size? When calculating the raster statistics,
> i.segment.stats uses r.univar with extended statistics (-e flag). If the
> region is too big, you may actually be right that this leads to memory
> errors when it tries to run 4 instances in parallel.
