[postgis] Indexing....ack!

classic Classic list List threaded Threaded
15 messages Options
Reply | Threaded
Open this post in threaded view
|

[postgis] Indexing....ack!

kelly harmon
Before I start complaining, let me lead in with this:  The postgis
extensions that you've done are very impressive!   Good job!



Okay, now I complain:

I have a table with about 90,000 rows in it.  The table has a
geometry column that always contains a 2D POINT.  I created a GIST
index on this column.  

I do a query against the table to select the records inside a certain
area, and it will NOT use the index, even though if I force it to, by
turning off the table scans, the query executes MUCH faster!  Is
there anything that can be done about this?  I would greatly
appreciate any help.


Here is my table:

create table pole (bec_record_id numeric(8), ipid numeric(20),
featuresymgeometries geometry);




Here is my index:

CREATE INDEX POLE_NDX_1 ON Pole USING GIST ( featuresymgeometries
GIST_GEOMETRY_OPS ) WITH ( ISLOSSY );




Here is my select:

explain select count(*) from  Pole where  Pole.featuresymgeometries
&& 'BOX3D(-69.9392089498987 18.47719733442078,-69.9368322996733
18.478711603947218)'::box3d   and on_image is null




Results with seqscan enabled:

Aggregate  (cost=8699.73..8699.73 rows=1 width=0)
  ->  Seq Scan on pole  (cost=0.00..8677.16 rows=9025 width=0)



Results with seqscan off:

set enable_seqscan=off;

explain select count(*) from  Pole where  Pole.featuresymgeometries
&& 'BOX3D(-69.9392089498987 18.47719733442078,-69.9368322996733
18.478711603947218)'::box3d   and on_image is null

Aggregate  (cost=22441.74..22441.74 rows=1 width=0)
  ->  Index Scan using pole_ndx_1 on pole  (cost=0.00..22419.18
rows=9025 width=0)





To unsubscribe from this group, send an email to:
[hidden email]

 

Your use of Yahoo! Groups is subject to http://docs.yahoo.com/info/terms/ 



Reply | Threaded
Open this post in threaded view
|

Re: [postgis] Indexing....ack!

Paul Ramsey-2
This is kind of a "known problem" and is pointed out in the
documentation: in order to get the spatial indexes to work, you have to
force them on with 'set enable_seqscan = off'. So, does this suck? Oh
yeah, baby, yeah!

Dave has made noises in the past that it might be possible to get the
planner to do the "right thing" with respect to spatial indexes by
jimmying with the weighting and scores for our index types in the system
tables. Perhaps he could look into this... *cough*


[hidden email] wrote:

>
> Before I start complaining, let me lead in with this:  The postgis
> extensions that you've done are very impressive!   Good job!
>
> Okay, now I complain:
>
> I have a table with about 90,000 rows in it.  The table has a
> geometry column that always contains a 2D POINT.  I created a GIST
> index on this column.
>
> I do a query against the table to select the records inside a certain
> area, and it will NOT use the index, even though if I force it to, by
> turning off the table scans, the query executes MUCH faster!  Is
> there anything that can be done about this?  I would greatly
> appreciate any help.
>
> Here is my table:
>
> create table pole (bec_record_id numeric(8), ipid numeric(20),
> featuresymgeometries geometry);
>
> Here is my index:
>
> CREATE INDEX POLE_NDX_1 ON Pole USING GIST ( featuresymgeometries
> GIST_GEOMETRY_OPS ) WITH ( ISLOSSY );
>
> Here is my select:
>
> explain select count(*) from  Pole where  Pole.featuresymgeometries
> && 'BOX3D(-69.9392089498987 18.47719733442078,-69.9368322996733
> 18.478711603947218)'::box3d   and on_image is null
>
> Results with seqscan enabled:
>
> Aggregate  (cost=8699.73..8699.73 rows=1 width=0)
>   ->  Seq Scan on pole  (cost=0.00..8677.16 rows=9025 width=0)
>
> Results with seqscan off:
>
> set enable_seqscan=off;
>
> explain select count(*) from  Pole where  Pole.featuresymgeometries
> && 'BOX3D(-69.9392089498987 18.47719733442078,-69.9368322996733
> 18.478711603947218)'::box3d   and on_image is null
>
> Aggregate  (cost=22441.74..22441.74 rows=1 width=0)
>   ->  Index Scan using pole_ndx_1 on pole  (cost=0.00..22419.18
> rows=9025 width=0)
>
> To unsubscribe from this group, send an email to:
> [hidden email]
>
>
>
> Your use of Yahoo! Groups is subject to http://docs.yahoo.com/info/terms/

To unsubscribe from this group, send an email to:
[hidden email]

 

Your use of Yahoo! Groups is subject to http://docs.yahoo.com/info/terms/ 



Reply | Threaded
Open this post in threaded view
|

Re: [postgis] Indexing....ack!

Paul Ramsey-2
And a further note: the reason your EXPLAINS give the essentially the
same effort scores regardless of whether you have seqscans on or not is
a byproduct of the incorrect scoring being applied to the spatial
indexes. Since the scoring being applied by postgresql is wrong, the
indexing never kicks in. But you'll note, incorrect scores or not, when
you *force* the index, it actually *is* alot faster :)

Paul Ramsey wrote:

>
> This is kind of a "known problem" and is pointed out in the
> documentation: in order to get the spatial indexes to work, you have to
> force them on with 'set enable_seqscan = off'. So, does this suck? Oh
> yeah, baby, yeah!
>
> Dave has made noises in the past that it might be possible to get the
> planner to do the "right thing" with respect to spatial indexes by
> jimmying with the weighting and scores for our index types in the system
> tables. Perhaps he could look into this... *cough*
>
> [hidden email] wrote:
> >
> > Before I start complaining, let me lead in with this:  The postgis
> > extensions that you've done are very impressive!   Good job!
> >
> > Okay, now I complain:
> >
> > I have a table with about 90,000 rows in it.  The table has a
> > geometry column that always contains a 2D POINT.  I created a GIST
> > index on this column.
> >
> > I do a query against the table to select the records inside a certain
> > area, and it will NOT use the index, even though if I force it to, by
> > turning off the table scans, the query executes MUCH faster!  Is
> > there anything that can be done about this?  I would greatly
> > appreciate any help.
> >
> > Here is my table:
> >
> > create table pole (bec_record_id numeric(8), ipid numeric(20),
> > featuresymgeometries geometry);
> >
> > Here is my index:
> >
> > CREATE INDEX POLE_NDX_1 ON Pole USING GIST ( featuresymgeometries
> > GIST_GEOMETRY_OPS ) WITH ( ISLOSSY );
> >
> > Here is my select:
> >
> > explain select count(*) from  Pole where  Pole.featuresymgeometries
> > && 'BOX3D(-69.9392089498987 18.47719733442078,-69.9368322996733
> > 18.478711603947218)'::box3d   and on_image is null
> >
> > Results with seqscan enabled:
> >
> > Aggregate  (cost=8699.73..8699.73 rows=1 width=0)
> >   ->  Seq Scan on pole  (cost=0.00..8677.16 rows=9025 width=0)
> >
> > Results with seqscan off:
> >
> > set enable_seqscan=off;
> >
> > explain select count(*) from  Pole where  Pole.featuresymgeometries
> > && 'BOX3D(-69.9392089498987 18.47719733442078,-69.9368322996733
> > 18.478711603947218)'::box3d   and on_image is null
> >
> > Aggregate  (cost=22441.74..22441.74 rows=1 width=0)
> >   ->  Index Scan using pole_ndx_1 on pole  (cost=0.00..22419.18
> > rows=9025 width=0)
> >
> > To unsubscribe from this group, send an email to:
> > [hidden email]
> >
> >
> >
> > Your use of Yahoo! Groups is subject to http://docs.yahoo.com/info/terms/
>
> To unsubscribe from this group, send an email to:
> [hidden email]
>
>
>
> Your use of Yahoo! Groups is subject to http://docs.yahoo.com/info/terms/

To unsubscribe from this group, send an email to:
[hidden email]

 

Your use of Yahoo! Groups is subject to http://docs.yahoo.com/info/terms/ 



Reply | Threaded
Open this post in threaded view
|

Re: [postgis] Indexing....ack!

David Blasby-3
Okay, I've tricked postgresql to (pretty much) always use the GiST index
WITHOUT having to do a 'set enable_seqscan=off'.

Its available in the CVS version.

dave

To unsubscribe from this group, send an email to:
[hidden email]

 

Your use of Yahoo! Groups is subject to http://docs.yahoo.com/info/terms/ 



Reply | Threaded
Open this post in threaded view
|

Re: [postgis] Indexing....ack!

Paul Ramsey-2
And how did you *do* that, grasshopper :)
(And are we sure we want to *always* use the index?)

Dave Blasby wrote:

>
> Okay, I've tricked postgresql to (pretty much) always use the GiST index
> WITHOUT having to do a 'set enable_seqscan=off'.
>
> Its available in the CVS version.
>
> dave
>
> To unsubscribe from this group, send an email to:
> [hidden email]
>
>
>
> Your use of Yahoo! Groups is subject to http://docs.yahoo.com/info/terms/

--
      __
     /
     | Paul Ramsey
     | Refractions Research
     | Email: [hidden email]
     | Phone: (250) 885-0632
     \_

To unsubscribe from this group, send an email to:
[hidden email]

 

Your use of Yahoo! Groups is subject to http://docs.yahoo.com/info/terms/ 



Reply | Threaded
Open this post in threaded view
|

Re: [postgis] Indexing....ack!

kelly harmon
In reply to this post by David Blasby-3
Excellent!  Many thanks!

--- Dave Blasby <[hidden email]> wrote:
> Okay, I've tricked postgresql to (pretty much)
> always use the GiST index
> WITHOUT having to do a 'set enable_seqscan=off'.
>
> Its available in the CVS version.
>
> dave
>


__________________________________________________
Do You Yahoo!?
Make international calls for as low as $.04/minute with Yahoo! Messenger
http://phonecard.yahoo.com/

To unsubscribe from this group, send an email to:
[hidden email]

 

Your use of Yahoo! Groups is subject to http://docs.yahoo.com/info/terms/ 



Reply | Threaded
Open this post in threaded view
|

[postgis] Error in creating WKBPolygon by Postgis in Cygwin PostgreSQL

Hisaji ONO-3
 Hi.

  I've got the following errors for.Error in creating WKBPolygon from
WKTMultiPolygon, created by shp2pgml,  by PostGis 0.5.1 in Cygwin PostgreSQL
7.1.2 on WinNT Server 4.0 SP6a (Memory 256MBytes).

> #create table ku2 as select * from ku;
> #alter table ku2 add column geo2_val wkb;
> # update ku2 set geo2_val = asbinary(geo_value,'NDR');
>
> requested NDR
> TopMemoryContext: 24600 total in 3 blocks; 19016 free (30 chunks); 5584
> used
> TopTransactionContext: 8192 total in 1 blocks; 8176 free (0 chunks); 16
> used
> DeferredTriggerXact: 8192 total in 1 blocks; 8176 free (0 chunks); 16
> used
> TransactionCommandContext: 8192 total in 1 blocks; 2712 free (3 chunks);
> 5480 us
> ed
> JunkFilterContext: 3072 total in 2 blocks; 1480 free (0 chunks); 1592
> used
> PlanExprContext: 169071400 total in 3 blocks; 8152 free (8 chunks);
> 169063248 us
> ed
> QueryContext: 8192 total in 1 blocks; 6136 free (6 chunks); 2056 used
> DeferredTriggerSession: 8192 total in 1 blocks; 8176 free (0 chunks); 16
> used
> CacheMemoryContext: 516096 total in 6 blocks; 176952 free (145 chunks);
> 339144 u
> sed
> pg_user: 5120 total in 5 blocks; 912 free (0 chunks); 4208 used
> PortalMemory: 8192 total in 1 blocks; 8176 free (0 chunks); 16 used
> MdSmgr: 8192 total in 1 blocks; 6120 free (0 chunks); 2072 used
> DynaHash: 24576 total in 2 blocks; 6392 free (2 chunks); 18184 used
> ErrorContext: 8192 total in 1 blocks; 8176 free (0 chunks); 16 used
> ERROR:  Memory exhausted in AllocSetAlloc(169026186)
> ERROR:  Memory exhausted in AllocSetAlloc(169026186)
>

 After above result, nothing created in new wkb column.

And I also have troubles in creating WKBPoint from WKTPolygon. No error
message, but created WKB contained only 0x010180s (01 = little endian, 01 =
WKBPoint, 80=?).

 Please tell me how to solve these?




To unsubscribe from this group, send an email to:
[hidden email]

 

Your use of Yahoo! Groups is subject to http://docs.yahoo.com/info/terms/ 



Reply | Threaded
Open this post in threaded view
|

Re: [postgis] Error in creating WKBPolygon by Postgis in Cygwin PostgreSQL

Paul Ramsey-2
> I've got the following errors for.Error in creating WKBPolygon from
> WKTMultiPolygon, created by shp2pgml,  by PostGis 0.5.1 in Cygwin
> PostgreSQL 7.1.2 on WinNT Server 4.0 SP6a (Memory 256MBytes).

> #create table ku2 as select * from ku;
> #alter table ku2 add column geo2_val wkb;
> # update ku2 set geo2_val = asbinary(geo_value,'NDR');

OK, the first thing to remember is that our internal storage format is
not WKB, it is 'geometry' :)
WKB is a standard external representation, so is WKT. So your update
command should look like this:

  # update ku2 set geo2_val = geo_value;

That will make the second column be identical to the first. There is no
point in trying to change the endianness of the stored features: they
will always be stored in the endianness of the underlying server
platform. If you want a different endianness for your WKB, specify it
when *extracting* them, not when storing them.

> And I also have troubles in creating WKBPoint from WKTPolygon. No error
> message, but created WKB contained only 0x010180s (01 = little endian, 01 =
> WKBPoint, 80=?).

I'm not sure what you're trying to do here... but as an example, you
should be able to do

select AsBinary(PointN('POLYGON(0 0,1 0,1 1,0 1,0 0)'::geometry,1));

That would turn a text string into a postgis geometry object, turn the
polygon geometry into a point, then return the binary representation of
that point.

------------------------ Yahoo! Groups Sponsor ---------------------~-->
The Nissan Sentra
Everything but compact
http://NissanDriven.com
http://us.click.yahoo.com/3vsIKC/txlCAA/ySSFAA/PhFolB/TM
---------------------------------------------------------------------~->

To unsubscribe from this group, send an email to:
[hidden email]

 

Your use of Yahoo! Groups is subject to http://docs.yahoo.com/info/terms/ 



Reply | Threaded
Open this post in threaded view
|

Re: [postgis] Error in creating WKBPolygon by Postgis in Cygwin PostgreSQL

Paul Ramsey-2
In reply to this post by Hisaji ONO-3

Hisaji ONO wrote:

> > #create table ku2 as select * from ku;
> > #alter table ku2 add column geo2_val wkb;
> > # update ku2 set geo2_val = asbinary(geo_value,'NDR');

Actually, I should have read your example more closely, I did not see
that you had set the type of the destination column to wkb.

This is not "recommended" behavior: wkb is not supposed to be a storage
type, just a type used in transition on the way out of the system.
However, in theory your commands should work, so we'll have a look, if
only to figure out why they failed. :)

To unsubscribe from this group, send an email to:
[hidden email]

 

Your use of Yahoo! Groups is subject to http://docs.yahoo.com/info/terms/ 



Reply | Threaded
Open this post in threaded view
|

Re: [postgis] Indexing....ack!

David Blasby-3
In reply to this post by Paul Ramsey-2
Paul Ramsey wrote:
>
> And how did you *do* that, grasshopper :)
> (And are we sure we want to *always* use the index?)

I changed the definition of the '&&' (BBox overlap) operator;

CREATE OPERATOR && (
   LEFTARG = GEOMETRY, RIGHTARG = GEOMETRY, PROCEDURE =
geometry_overlap,
   COMMUTATOR = '&&',
   RESTRICT = postgis_gist_sel, JOIN = positionjoinsel
);

The change is in the RESTRICT function.  Its supposed to provide an
estimate of how many rows will be returned.  The old postgresql geometry
types used a function that estimated way too many rows. The planner
thought it would be more efficient to, therefore, not use the index.

The postgis_gist_sel() function estimates a much smaller percentage of
rows returned, so the planner is much more likely to use the index.

The default estimate was, I think, 0.5%.  The new one is, I think,
0.005%.  Its kinda of silly to use a constant number, but to actually
compute an actual estimate would probably take longer than to do the
actual query.

My bet is that an index scan will always outperform a sequence scan when
the resulting query returns less than 1/2 the rows in the table.
For cases where more than 1/2 the rows are returned, the index may
actually slow things down because postgresql will have to read both the
index from the disk as well as the actual geometries.

But, if any of the geometries are large they could be compressed or
TOASTed.  If thats the case, the index will greatly improve speed
because it can pull several BBoxes with one disk page access while the
sequence scan will have to pull the entire geometry (several disk ops
and CPU time) to check its bounding box.

A slightly more intellegent RESTRICT function would look at the total
extent of all the geometries compared to the size of the query box.
Unfortunately, i dont know how to tag more information into the actual
GiST index tree. It would take either a read of the entire geometry
column or the entire index to compute the total extent.

dave

To unsubscribe from this group, send an email to:
[hidden email]

 

Your use of Yahoo! Groups is subject to http://docs.yahoo.com/info/terms/ 



Reply | Threaded
Open this post in threaded view
|

Re: [postgis] Indexing....ack!

Paul Ramsey-2

> The default estimate was, I think, 0.5%.  The new one is, I think,
> 0.005%.  Its kinda of silly to use a constant number, but to actually
> compute an actual estimate would probably take longer than to do the
> actual query.

How do the BTree indexes do this? Do they not use some information
gathered by VACUUM to do a "guess" about the extent of the query? Would
it be possible for us to do similar things?

To unsubscribe from this group, send an email to:
[hidden email]

 

Your use of Yahoo! Groups is subject to http://docs.yahoo.com/info/terms/ 



Reply | Threaded
Open this post in threaded view
|

Re: [postgis] Error in creating WKBPolygon by Postgis in Cygwin PostgreSQL

David Blasby-3
In reply to this post by Hisaji ONO-3
Hisaji ONO wrote:
>   I've got the following errors for.Error in creating WKBPolygon from
> WKTMultiPolygon, created by shp2pgml,  by PostGis 0.5.1 in Cygwin PostgreSQL
> 7.1.2 on WinNT Server 4.0 SP6a (Memory 256MBytes).
>
> > #create table ku2 as select * from ku;
> > #alter table ku2 add column geo2_val wkb;
> > # update ku2 set geo2_val = asbinary(geo_value,'NDR');

> > ERROR:  Memory exhausted in AllocSetAlloc(169026186)
> > ERROR:  Memory exhausted in AllocSetAlloc(169026186)

That probably because you're not supposed to actually use the 'wkb' as a
type in columns.  I only created it to 'tag' the wkb so it could be
recognized.  Its basically the same as the TEXT type (4 byte size
header, then a chuck of data).  I'll look into the WKB type and see if I
can spruce it up for actual use in tables.
 
>  After above result, nothing created in new wkb column.
>
> And I also have troubles in creating WKBPoint from WKTPolygon. No error
> message, but created WKB contained only 0x010180s (01 = little endian, 01 =
> WKBPoint, 80=?).

I'm following Frank's WKB spec for 3d points.  The '80' means that its a
3d geometry, so you can expect 3 coords for each point.  If you want
openGIS spec WKB, you'll have to do a force_2d() on the geometry.

dave

To unsubscribe from this group, send an email to:
[hidden email]

 

Your use of Yahoo! Groups is subject to http://docs.yahoo.com/info/terms/ 



Reply | Threaded
Open this post in threaded view
|

Re: [postgis] Error in creating WKBPolygon by Postgis in Cygwin PostgreSQL

Frank Warmerdam
Dave Blasby wrote:
> I'm following Frank's WKB spec for 3d points.  The '80' means that its a
> 3d geometry, so you can expect 3 coords for each point.  If you want
> openGIS spec WKB, you'll have to do a force_2d() on the geometry.

Folks,

Just for reference, the 2.5D extension was designed by Adam Gawne-Cain
of Cadcorp.  It was formally submitted to the Simple Features for COM
revision working group, but I don't know what became of it after that.
I don't think the published specification was ever updated to include
it. However, PostGIS, OGR and FME have adopted it. :-)


Best regards,
--
---------------------------------------+--------------------------------------
I set the clouds in motion - turn up   | Frank Warmerdam,
[hidden email]
light and sound - activate the windows | http://pobox.com/~warmerdam
and watch the world go round - Rush    | Geospatial Programmer for Rent


To unsubscribe from this group, send an email to:
[hidden email]

 

Your use of Yahoo! Groups is subject to http://docs.yahoo.com/info/terms/ 



Reply | Threaded
Open this post in threaded view
|

Re: [postgis] Error in creating WKBPolygon by Postgis

David Blasby-3
In reply to this post by Hisaji ONO-3
I've added full[er] support for the WKB type.

You can now have columns of type WKB, and you can pg_dump them.

Their external representation is a hexidecimal string.  For example;

              astext              
|                                                      
mywkb                                                      
 
-------------------------------------+-------------------------------------------------------------------------------------------------------------------
-
 POINT(1 1)                          |
00000000013FF00000000000003FF0000000000000
 POINT(100 100 100)                  |
0000008001405900000000000040590000000000004059000000000000
 LINESTRING(100 100,100 100)         |
0000000002000000024059000000000000405900000000000040590000000000004059000000000000
 LINESTRING(100 100 100,100 100 100) |
000000800200000002405900000000000040590000000000004059000000000000405900000000000040590000000000004059000000000000
(4 rows)

------------------------ Yahoo! Groups Sponsor ---------------------~-->
Secure your servers with 128-bit SSL encryption! Grab your copy of VeriSign's FREE Guide: "Securing Your Web Site for Business." Get it Now!
http://us.click.yahoo.com/n7RbFC/zhwCAA/yigFAA/PhFolB/TM
---------------------------------------------------------------------~->

To unsubscribe from this group, send an email to:
[hidden email]

 

Your use of Yahoo! Groups is subject to http://docs.yahoo.com/info/terms/ 



Reply | Threaded
Open this post in threaded view
|

Re: [postgis] Error in creating WKBPolygon by Postgis

David Blasby-3
Hisaji ONO wrote:
>
>  Thank you very much, Dave.
>
> > I've added full[er] support for the WKB type.
> >
>
>  "full[er] support for the WKB type" includes creating spatial indexes for
> WKB-type items or not?

No, there are no operations you can perform on the WKB types.

If you want spatial indexing you have a few choices.

One of them is to just have a GEOMETRY column.  You can use the spatial
index on this column, and retrieve results using the asbinary() command.

The other is to create two columns, one with GEOMETRY and the other as
the WKB representation of the geometry.  You can use the spatial index
on the GEOMETRY column, but get information out of the WKB column.

For example,
The first method, queries would look like;
SELECT asbinary(<geometry column>) FROM <table> WHERE <geometry column>
&& <bounding box>

And the second method would look like;
SELECT <wkb column> FROM <table> WHERE <geometry column> && <bounding
box>

The major difference being that the 1st method requires translating the
internal GEOMETRY format to the WKB format for every query.

dave

------------------------ Yahoo! Groups Sponsor ---------------------~-->
Secure your servers with 128-bit SSL encryption! Grab your copy of VeriSign's FREE Guide: "Securing Your Web Site for Business." Get it Now!
http://us.click.yahoo.com/n7RbFC/zhwCAA/yigFAA/PhFolB/TM
---------------------------------------------------------------------~->

To unsubscribe from this group, send an email to:
[hidden email]

 

Your use of Yahoo! Groups is subject to http://docs.yahoo.com/info/terms/