what does it really mean for one geometry to be equal to another

classic Classic list List threaded Threaded
6 messages Options
Reply | Threaded
Open this post in threaded view
|

what does it really mean for one geometry to be equal to another

Rhys A.D. Stewart
Greetings all,

I maintain a medium size table of customer locations, which, for business purposes now needs to not have any coincident points. Table definition follows:
=====================================================================
service.location
(
  premises text NOT NULL,
  matchtype text,
  matchdate date,
  connectedtransformer text,
  g geometry(Point,3448),
  CONSTRAINT servicelocation_pkey PRIMARY KEY (premises),
  CONSTRAINT servicelocation_premisesnumber_check CHECK (char_length(premises) = 6 OR char_length(premises) = 7),
  CONSTRAINT servicelocation_premisesnumber_is_a_number_check CHECK (premises !~* '[A-z]+'::text)
)
=====================================================================

There are approximately 866k rows, and a gist index on g. I update the table so that no geometries are coincident ( see https://gist.github.com/rhysallister/bcb4bb07a99d69938fff88f150883bee for the sql to remove the coincident geoms) I ran the sql in the gist until it said 0 rows affected.

To prevent one from inserting or updating a coincident geometry I try to create a unique index on g. Since gist doesn't support unique indices I do:

=====================================================================
CREATE unique INDEX unique_g ON service.location (st_astext(g) );
---------------------------------------------------------------------
ERROR:  could not create unique index "unique_g"
DETAIL:  Key (st_astext(g))=(POINT(727895.4 663599.3)) is duplicated.
=====================================================================
This makes me slightly flummoxed. I'm pretty sure the query in the gist returned 0 affected rows. But, maybe I missed a step. I try to find the offending rows with:

=====================================================================
select premises, st_astext(g), g from service.location 
where st_equals(g, 'SRID=3448;POINT(727895.4 663599.3)'::geometry)
---------------------------------------------------------------------
premises st_astext g
267077 POINT(727895.4 663599.3) 0101000020780D0000CDCCCCCCAE3626419A9999995E402441
=====================================================================
Strange. I now move to being slightly perturbed. I'm very sure the previous error message made mention of duplicity. I then run 

=====================================================================
select premises, st_astext(g), g from service.location 
where st_astext(g) = 'POINT(727895.4 663599.3)'
---------------------------------------------------------------------
premises st_astext g
267077 POINT(727895.4 663599.3) 0101000020780D0000CDCCCCCCAE3626419A9999995E402441
267053 POINT(727895.4 663599.3) 0101000020780D0000CDCCCCCCAE362641999999995E402441
=====================================================================

Now I'm just confused, the 2 premises have the same st_astext, but different wkb representations and as such are not being caught in the st_equals call.


Is there some gotcha that I don't know about, maybe something in the docs that I missed or is this not supposed to happen?

Rhys
Peace & Love|Live Long & Prosper

_______________________________________________
postgis-users mailing list
[hidden email]
http://lists.osgeo.org/mailman/listinfo/postgis-users
Reply | Threaded
Open this post in threaded view
|

Re: what does it really mean for one geometry to be equal to another

Sandro Santilli-4
On Thu, Sep 15, 2016 at 08:53:25PM -0500, Rhys A.D. Stewart wrote:

> Now I'm just confused, the 2 premises have the same st_astext, but
> different wkb representations and as such are not being caught in the
> st_equals call.
>
> Is there some gotcha that I don't know about, maybe something in the docs
> that I missed or is this not supposed to happen?

What you describe is expected:
the text representation is approximated/truncated.

You could create your index on ST_AsBinary(geometry) if you wanted
to check binary-level equality, but as your subject asks: what's
your concept of equality ? Do you really want those practically
identical points in your db ? Or you could create an index on
ST_SnapToGrid(geom, <tolerance>), for points to be no closer
than <tolerance>. Or (for lines) you could use ST_HausdorffDistance.

Many ways to interpret equality, which is why the equality operator
is currently just checking for approximated minimum bounding box equality
(probably still equal for your two almost-identical points).

--strk;

  ()   Free GIS & Flash consultant/developer
  /\   https://strk.kbt.io/services.html
_______________________________________________
postgis-users mailing list
[hidden email]
http://lists.osgeo.org/mailman/listinfo/postgis-users
Reply | Threaded
Open this post in threaded view
|

Re: what does it really mean for one geometry to be equal to another

Paul Norman
In reply to this post by Rhys A.D. Stewart
On 9/15/2016 6:53 PM, Rhys A.D. Stewart wrote:
=====================================================================
select premises, st_astext(g), g from service.location 
where st_astext(g) = 'POINT(727895.4 663599.3)'
---------------------------------------------------------------------
premises st_astext g
267077 POINT(727895.4 663599.3) 0101000020780D0000CDCCCCCCAE3626419A9999995E402441
267053 POINT(727895.4 663599.3) 0101000020780D0000CDCCCCCCAE362641999999995E402441
=====================================================================

Now I'm just confused, the 2 premises have the same st_astext, but different wkb representations and as such are not being caught in the st_equals call.


Is there some gotcha that I don't know about, maybe something in the docs that I missed or is this not supposed to happen?

A safe bet is that you're hitting floating point issues. EWKB is the canonical format for geometries, but the conversion to a text representation could lose some precision. If you did want to require unique geometries, you could do it with a btree index on the geometry, not st_astext of the geometry.

As a general rule, comparing two floating point numbers for equality is tricky. What you probably want is an exclusion constraint which prevents two points from being within a small distance of each other.

I don't know of a great way to do this, but a bad way that might work is EXCLUDE USING GIST (ST_Buffer(geom, 0.1) WITH &&). See https://www.postgresql.org/docs/current/static/rangetypes.html#RANGETYPES-CONSTRAINT and the links from there. I haven't tested this.

_______________________________________________
postgis-users mailing list
[hidden email]
http://lists.osgeo.org/mailman/listinfo/postgis-users
Reply | Threaded
Open this post in threaded view
|

Re: what does it really mean for one geometry to be equal to another

Paul Norman
On 9/15/2016 11:19 PM, Paul Norman wrote:
> I don't know of a great way to do this, but a bad way that might work
> is EXCLUDE USING GIST (ST_Buffer(geom, 0.1) WITH &&). See
> https://www.postgresql.org/docs/current/static/rangetypes.html#RANGETYPES-CONSTRAINT 
> and the links from there. I haven't tested this.

After further though, this is doing a bounding box comparison so it's
not quite 0.1 projected units and you can use ST_Expand instead of
ST_Buffer.
_______________________________________________
postgis-users mailing list
[hidden email]
http://lists.osgeo.org/mailman/listinfo/postgis-users
Reply | Threaded
Open this post in threaded view
|

Re: what does it really mean for one geometry to be equal to another

Paul Norman
In reply to this post by Sandro Santilli-4
On 9/15/2016 10:11 PM, Sandro Santilli wrote:
> you could create an index on
> ST_SnapToGrid(geom, <tolerance>), for points to be no closer
> than <tolerance>.

This won't check that points are no closer than <tolerance>, nor will it
solve floating point issues. It's the same as rounding. If two points
are on either side of the line where they round to different grid
points, even if they are arbitrarily close, up to the limit of precision.

It does mean that you will see fewer values where you get floating point
equality issues, but the difference between the rounded floating points
will increase.

_______________________________________________
postgis-users mailing list
[hidden email]
http://lists.osgeo.org/mailman/listinfo/postgis-users
Reply | Threaded
Open this post in threaded view
|

Re: what does it really mean for one geometry to be equal to another

Sandro Santilli-4
In reply to this post by Paul Norman
On Thu, Sep 15, 2016 at 11:19:49PM -0700, Paul Norman wrote:

> If you did want to require
> unique geometries, you could do it with a btree index on the
> geometry, not st_astext of the geometry.

The btree opclass for geometry only checks bounding box equality,
not geometry equality (see lwgeom_eq, aka OPERATOR=).

--strk;

  ()   Free GIS & Flash consultant/developer
  /\   https://strk.kbt.io/services.html
_______________________________________________
postgis-users mailing list
[hidden email]
http://lists.osgeo.org/mailman/listinfo/postgis-users