[PROJ] Performance of proj_create_crs_to_crs()

Paul Ramsey
So, having gotten all the axis swapping tap dancing working, I went to
run some of my favourite transforms around BC, finishing up with one
of my favourites...

 st_transform('SRID=3005;POINT(1000000 0)',4267)

This takes a point from a NAD83 projected system (EPSG:3005) to a
NAD27 geodetic system (EPSG:4267).

Here's the crazy part: this transformation takes 400 ms, and the time
is all spent in PROJ, getting the PJ.

I ran 20-30 of them in a row and captured the workload in Instruments
in case these function calls ring any bells WRT overhead, screenshot
attached.

Fortunately, for bulk conversion PostGIS already caches the projection
object (in fact, most of my work this week was in renovating that part
of the code), but older versions of PROJ are much, much faster at
resolving projections from projection strings.
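
To illustrate the kind of caching described above, here is a minimal sketch of a lookup keyed on the source and destination SRIDs. It is not PostGIS's actual code: `xform_cache`, `cache_init`, `cache_get`, the `create_fn` callback and `stub_create` are all hypothetical names, and the callback only stands in for an expensive call such as proj_create_crs_to_crs().

```c
#include <assert.h>
#include <stddef.h>

#define CACHE_SLOTS 8

/* Hypothetical factory: stands in for an expensive call such as
 * proj_create_crs_to_crs(); returns an opaque transformation handle. */
typedef void *(*create_fn)(int src_srid, int dst_srid);

typedef struct {
    int src_srid;
    int dst_srid;
    void *handle;              /* NULL means the slot is empty */
} cache_entry;

typedef struct {
    cache_entry slots[CACHE_SLOTS];
    size_t next;               /* next slot to overwrite (round-robin) */
    create_fn create;
} xform_cache;

static void cache_init(xform_cache *c, create_fn create)
{
    assert(c != NULL && create != NULL);
    for (size_t i = 0; i < CACHE_SLOTS; i++) {
        c->slots[i].src_srid = 0;
        c->slots[i].dst_srid = 0;
        c->slots[i].handle = NULL;
    }
    c->next = 0;
    c->create = create;
}

/* Return the cached handle for (src, dst), creating it on a miss. */
static void *cache_get(xform_cache *c, int src_srid, int dst_srid)
{
    assert(c != NULL);
    for (size_t i = 0; i < CACHE_SLOTS; i++) {
        cache_entry *e = &c->slots[i];
        if (e->handle != NULL && e->src_srid == src_srid &&
            e->dst_srid == dst_srid)
            return e->handle;  /* hit: the expensive creation is skipped */
    }
    void *handle = c->create(src_srid, dst_srid);
    if (handle == NULL)
        return NULL;           /* creation failed: nothing is cached */
    cache_entry *victim = &c->slots[c->next];
    /* A real cache would release victim->handle here before reuse. */
    victim->src_srid = src_srid;
    victim->dst_srid = dst_srid;
    victim->handle = handle;
    c->next = (c->next + 1) % CACHE_SLOTS;
    return handle;
}

/* Stub factory for demonstration only: counts how often it is called. */
static int stub_create_calls = 0;

static void *stub_create(int src_srid, int dst_srid)
{
    static int dummy_handle;
    (void)src_srid;
    (void)dst_srid;
    stub_create_calls++;
    return &dummy_handle;
}
```

With this sketch, two successive `cache_get(&cache, 3005, 4267)` calls invoke the factory only once, which is why the creation cost matters mostly for one-off transforms.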

Thoughts?

P.
_______________________________________________
PROJ mailing list
[hidden email]
https://lists.osgeo.org/mailman/listinfo/proj

Re: Performance of proj_create_crs_to_crs()

Even Rouault-2
Paul,

This is a very interesting use case, and probably one with the worst
runtime.

On my system, the following runs in 16.9 s, so about 170 ms per instantiation:

        for(int i = 0; i < 100; i++)
        {
            PJ *pj = proj_create_crs_to_crs(NULL, "EPSG:3005", "EPSG:4267",
                                            NULL);
            proj_destroy(pj);
        }

so not as dramatic as 400 ms.
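
For reference, the per-call figure is just the total loop time divided by the iteration count (16.9 s / 100 = 169 ms). The following is a minimal sketch of such a measurement, not the harness actually used here: `per_call_ms`, `time_loop_seconds` and `placeholder_work` are hypothetical names, and the placeholder stands in for the create/destroy pair. Note that `clock()` reports CPU time, which may differ from wall-clock time.

```c
#include <assert.h>
#include <time.h>

/* Convert a total loop time in seconds to milliseconds per call. */
static double per_call_ms(double total_seconds, int iterations)
{
    assert(iterations > 0);
    return 1000.0 * total_seconds / iterations;
}

/* Run fn() `iterations` times and return the CPU time used, in seconds. */
static double time_loop_seconds(void (*fn)(void), int iterations)
{
    assert(fn != NULL && iterations > 0);
    clock_t start = clock();
    for (int i = 0; i < iterations; i++)
        fn();
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}

/* Placeholder workload; a real test would create and destroy a PJ here. */
static void placeholder_work(void)
{
    volatile double x = 0.0;
    for (int i = 0; i < 1000; i++)
        x += i * 0.5;
}
```

For example, `per_call_ms(time_loop_seconds(placeholder_work, 100), 100)` gives the average cost of one call to the placeholder.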

The reason is that the NAD27 <-> NAD83 conversion has several dozen
potential paths (one per US state), and PROJ explores them. But, after
analysis, the main reason for the slowness was that it also tried by default
to find an intermediate CRS in order to explore better alternate
transformation paths, and WGS84 was a very prolific one in that case. But
actually going through WGS84 doesn't give better pipelines in the end.

I've just issued PR https://github.com/OSGeo/proj.4/pull/1276 which disables
by default the search for an intermediate CRS when there is at least one
direct transformation. This should be good enough for most use cases. With
this change, in your use case, the above loop runs in 1.56 s, so about 16 ms
per instantiation.

Even


--
Spatialys - Geospatial professional services
http://www.spatialys.com

Re: Performance of proj_create_crs_to_crs()

Paul Ramsey
Even,
Thanks so much, I can confirm that your PR makes my test run
(literally) 100x faster. (Takes about 1.5ms now)
Wonderful, wonderful!
P.

On Sun, Feb 17, 2019 at 10:46 AM Even Rouault
<[hidden email]> wrote:

> I've just issued PR https://github.com/OSGeo/proj.4/pull/1276 which disables
> by default the search for an intermediate CRS when there is at least one
> direct transformation. This should be good enough for most use cases. With
> this change, in your use case, the above loop runs in 1.56 s, so about 16 ms
> per instantiation.

Re: Performance of proj_create_crs_to_crs()

Greg Troxel-2
Even Rouault <[hidden email]> writes:

> The reason is that the NAD27 <-> NAD83 conversion has several dozen
> potential paths (one per US state), and PROJ explores them.

(understood about not using WGS84 as an intermediate being the real
issue)

I am curious about the "one per state" comment.  I realize NAD27 is
complicated, but if there are NAD27-NAD83 transforms that are valid only
for a state (because of a local model, or maybe because some NAD83 HARN
is specific to that state, though I'd think that's a different CRS
codepoint), I would expect some process that uses them only in their
region of validity.  But the basic pipeline request is just SRIDs and
not a region (which is as I expect it).

Can you elaborate?  Or is it just that there are multiple intermediate
datums, but really these all end up being the same?

Re: Performance of proj_create_crs_to_crs()

Even Rouault-2
On Monday 18 February 2019 19:57:10 CET Greg Troxel wrote:

> Can you elaborate?  Or is it just that there are multiple intermediate
> datums, but really these all end up being the same?

OK, I was indeed a bit imprecise in my answer.

There are "just" 6 transformations registered in the EPSG database between
NAD27 and NAD83, using different grids (conus, ntv1, ntv2, etc.).

But before my change, the code that lists datum transformations also did a
self-join to find every CRS X for which we have NAD27 -> X and X -> NAD83
transformations. For X == WGS84, there are 85 NAD27 -> WGS84 transformations.
Many of them are actually registered in the EPSG database as concatenated
transformations from NAD27 to NAD83 and NAD83 to WGS84 (the NAD83 to WGS84
transformations are actually duplicates of the same operations doing NAD83 to
NAD83(HARN), as you rightly mentioned). So in the end, you had
(NAD27 -> NAD83 -> X=WGS84) -> NAD83. The code eventually realized that there
was a useless NAD83 -> WGS84 -> NAD83 round trip and simplified all the
results, but of course all those lookups and that filtering took significant
processing time.
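
The self-join and the effect of skipping it can be sketched as follows. This is only an illustration under assumptions, not PROJ's implementation or its SQL: the `transformation` struct and the three functions are hypothetical, and any table passed in is toy data rather than EPSG registry content.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical registry row: one registered transformation src -> dst. */
typedef struct {
    const char *src;
    const char *dst;
} transformation;

/* Count rows going directly from src to dst. */
static size_t count_direct(const transformation *t, size_t n,
                           const char *src, const char *dst)
{
    assert(t != NULL);
    size_t count = 0;
    for (size_t i = 0; i < n; i++)
        if (strcmp(t[i].src, src) == 0 && strcmp(t[i].dst, dst) == 0)
            count++;
    return count;
}

/* Self-join: count pairs (a, b) with a = src -> X and b = X -> dst,
 * where the intermediate X differs from both src and dst. */
static size_t count_via_intermediate(const transformation *t, size_t n,
                                     const char *src, const char *dst)
{
    assert(t != NULL);
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        if (strcmp(t[i].src, src) != 0 || strcmp(t[i].dst, src) == 0 ||
            strcmp(t[i].dst, dst) == 0)
            continue;
        for (size_t j = 0; j < n; j++)
            if (strcmp(t[j].src, t[i].dst) == 0 &&
                strcmp(t[j].dst, dst) == 0)
                count++;
    }
    return count;
}

/* Candidate count: when at least one direct transformation exists, the
 * intermediate search is skipped unless explicitly requested. */
static size_t count_candidates(const transformation *t, size_t n,
                               const char *src, const char *dst,
                               int always_use_intermediate)
{
    size_t direct = count_direct(t, n, src, dst);
    if (direct > 0 && !always_use_intermediate)
        return direct;
    return direct + count_via_intermediate(t, n, src, dst);
}
```

The pair count grows as the product of the two sides of the join, so a prolific intermediate such as WGS84 quickly dominates the candidate list.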

In the case of EPSG:3005, which is a projected CRS with a "Canada - British
Columbia" area of use, the search space for datum transformations can be
restricted, but currently the spatial testing is done quite late in the
process, in code, and not at the SQL level.
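
As a rough illustration of what such spatial testing amounts to, here is a bounding-box filter over candidate transformations. It is a sketch under assumptions, not PROJ's code: the `bbox` and `candidate` types and both functions are hypothetical, and it ignores areas that cross the antimeridian.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical area of use, in degrees (west <= east is assumed). */
typedef struct {
    double west, south, east, north;
} bbox;

/* Hypothetical candidate transformation with its area of use. */
typedef struct {
    const char *name;
    bbox area;
} candidate;

/* True when the two boxes overlap or touch. */
static int bbox_intersects(const bbox *a, const bbox *b)
{
    return a->west <= b->east && b->west <= a->east &&
           a->south <= b->north && b->south <= a->north;
}

/* Keep only candidates whose area intersects the area of interest.
 * Compacts the array in place and returns the new count. */
static size_t filter_by_area(candidate *c, size_t n, const bbox *aoi)
{
    assert(c != NULL && aoi != NULL);
    size_t kept = 0;
    for (size_t i = 0; i < n; i++)
        if (bbox_intersects(&c[i].area, aoi))
            c[kept++] = c[i];
    return kept;
}
```

Applying such a filter before the pairwise exploration, rather than after it, would shrink the set of candidates that later steps have to examine.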

Even
