Query plan improvement when identical st_intersects filter is repeated twice

classic Classic list List threaded Threaded
2 messages Options
Reply | Threaded
Open this post in threaded view
|

Query plan improvement when identical st_intersects filter is repeated twice

Trang Nguyen

Hi,

 

I am tuning query performance on the  following query and am seeing two completely different query plans for the following. The only difference is that the second query repeats the same filter “st_intersects(waypoints, zone_geom)” a second time, which somehow  seems to trigger an index scan on the gist indexed LINESTRING column “waypoints” that doesn’t occur in the first query.

 

I’ve run analyze on the table so DB stats are current.

 

 

View:

REATE OR REPLACE VIEW od1.v_trip_zone AS

 SELECT z.uuid AS zone_id,

    z.name AS zone_name,

    z.namespace AS zone_namespace,

    z.geom AS zone_geom,

        CASE

            WHEN st_intersects(t.startloc, z.geom) AND st_intersects(t.endloc, z.geom) THEN 2

            WHEN st_intersects(t.startloc, z.geom) THEN 0

            WHEN st_intersects(t.endloc, z.geom) THEN 1

            WHEN st_intersects(t.waypoints, z.geom) THEN 3

            ELSE (-1)

        END AS match_cond,

    t.pkey,

    t.trip_id,

    t.startts,

    t.endts,

    t.startloc,

    t.endloc,

    t.probe_id,

    t.provider_id,

    t.movement_type,

    t.mode,

    t.trip_dist_m,

    t.trip_mean_speed_metersph,

    t.trip_max_speed_metersph,

    t.is_start_home,

    t.is_end_home,

    t.waypoints,

    t.createdts

   FROM od1.trip_v1_partitioned t,

    od1.zone z;

 

 

Query 1:

 

select pkey, trip_id, startts, endts, startloc, endloc, probe_id, provider_id, movement_type, mode, trip_dist_m, trip_mean_speed_metersph, trip_max_speed_metersph, is_start_home, is_end_home, waypoints from od1.v_trip_zone  where zone_id in ('kansas_303','kansas_601','kansas_603','kansas_604','kansas_10','kansas_11','kansas_9','kansas_310','kansas_311','kansas_315','kansas_301','kansas_302','kansas_307','kansas_306','kansas_305','kansas_204','kansas_201','kansas_3','kansas_2','kansas_1')

and st_intersects(waypoints, zone_geom)

and startts>=TIMESTAMP '2015-01-16' and endts<TIMESTAMP '2015-01-31'   and startts<TIMESTAMP '2015-01-31'

order by pkey

limit 2

 

Plan:

"Limit  (cost=82.43..1774.34 rows=2 width=1257)"

"  ->  Nested Loop  (cost=82.43..120241818.53 rows=142137 width=1257)"

"        Join Filter: ((t.waypoints && z.geom) AND _st_intersects(t.waypoints, z.geom))"

"        ->  Merge Append  (cost=0.70..7242945.66 rows=21320513 width=1257)"

"              Sort Key: t.pkey"

"              ->  Index Scan using trip_partitioned_v1_pkey on trip_v1_partitioned t  (cost=0.12..8.15 rows=1 width=216)"

"                    Filter: ((startts >= '2015-01-16 00:00:00'::timestamp without time zone) AND (endts < '2015-01-31 00:00:00'::timestamp without time zone) AND (startts < '2015-01-31 00:00:00'::timestamp without time zone))"

"              ->  Index Scan using trip_v1_y2015m01_pkey on trip_v1_y2015m01 t_1  (cost=0.56..6976431.09 rows=21320512 width=1257)"

"                    Filter: ((startts >= '2015-01-16 00:00:00'::timestamp without time zone) AND (endts < '2015-01-31 00:00:00'::timestamp without time zone) AND (startts < '2015-01-31 00:00:00'::timestamp without time zone))"

"        ->  Materialize  (cost=81.73..154.02 rows=20 width=9864)"

"              ->  Bitmap Heap Scan on zone z  (cost=81.73..153.92 rows=20 width=9864)"

"                    Recheck Cond: ((uuid)::text = ANY ('{kansas_303,kansas_601,kansas_603,kansas_604,kansas_10,kansas_11,kansas_9,kansas_310,kansas_311,kansas_315,kansas_301,kansas_302,kansas_307,kansas_306,kansas_305,kansas_204,kansas_201,kansas_3,kansas_ (...)"

"                    ->  Bitmap Index Scan on idx_uuid  (cost=0.00..81.72 rows=20 width=0)"

"                          Index Cond: ((uuid)::text = ANY ('{kansas_303,kansas_601,kansas_603,kansas_604,kansas_10,kansas_11,kansas_9,kansas_310,kansas_311,kansas_315,kansas_301,kansas_302,kansas_307,kansas_306,kansas_305,kansas_204,kansas_201,kansas_3,kan (...)"

 

Query 2 (Much improved):

 

select pkey, trip_id, startts, endts, startloc, endloc, probe_id, provider_id, movement_type, mode, trip_dist_m, trip_mean_speed_metersph, trip_max_speed_metersph, is_start_home, is_end_home, waypoints from od1.v_trip_zone  where zone_id in ('kansas_303','kansas_601','kansas_603','kansas_604','kansas_10','kansas_11','kansas_9','kansas_310','kansas_311','kansas_315','kansas_301','kansas_302','kansas_307','kansas_306','kansas_305','kansas_204','kansas_201','kansas_3','kansas_2','kansas_1')

and st_intersects(waypoints, zone_geom) and st_intersects(waypoints, zone_geom)

and startts>=TIMESTAMP '2015-01-16' and endts<TIMESTAMP '2015-01-31'   and startts<TIMESTAMP '2015-01-31'

order by pkey

limit 2

 

Plan:

"Limit  (cost=333.74..333.75 rows=2 width=1257)"

"  ->  Sort  (cost=333.74..333.86 rows=47 width=1257)"

"        Sort Key: t.pkey"

"        ->  Nested Loop  (cost=81.73..333.27 rows=47 width=1257)"

"              ->  Bitmap Heap Scan on zone z  (cost=81.73..153.92 rows=20 width=9864)"

"                    Recheck Cond: ((uuid)::text = ANY ('{kansas_303,kansas_601,kansas_603,kansas_604,kansas_10,kansas_11,kansas_9,kansas_310,kansas_311,kansas_315,kansas_301,kansas_302,kansas_307,kansas_306,kansas_305,kansas_204,kansas_201,kansas_3,kansas_ (...)"

"                    ->  Bitmap Index Scan on idx_uuid  (cost=0.00..81.72 rows=20 width=0)"

"                          Index Cond: ((uuid)::text = ANY ('{kansas_303,kansas_601,kansas_603,kansas_604,kansas_10,kansas_11,kansas_9,kansas_310,kansas_311,kansas_315,kansas_301,kansas_302,kansas_307,kansas_306,kansas_305,kansas_204,kansas_201,kansas_3,kan (...)"

"              ->  Append  (cost=0.00..8.95 rows=2 width=736)"

"                    ->  Seq Scan on trip_v1_partitioned t  (cost=0.00..0.00 rows=1 width=216)"

"                          Filter: ((startts >= '2015-01-16 00:00:00'::timestamp without time zone) AND (endts < '2015-01-31 00:00:00'::timestamp without time zone) AND (startts < '2015-01-31 00:00:00'::timestamp without time zone) AND (waypoints && z.geom) (...)"

"                    ->  Index Scan using idx_trip_v1_y2015m01_waypoints on trip_v1_y2015m01 t_1  (cost=0.42..8.95 rows=1 width=1257)"

"                          Index Cond: ((waypoints && z.geom) AND (waypoints && z.geom))"

"                          Filter: ((startts >= '2015-01-16 00:00:00'::timestamp without time zone) AND (endts < '2015-01-31 00:00:00'::timestamp without time zone) AND (startts < '2015-01-31 00:00:00'::timestamp without time zone) AND _st_intersects(waypoi (...)"

 

 

Could someone shed some light on the difference in query planner results?


Thanks,
Trang

 


_______________________________________________
postgis-users mailing list
[hidden email]
http://lists.osgeo.org/cgi-bin/mailman/listinfo/postgis-users
Reply | Threaded
Open this post in threaded view
|

Re: Query plan improvement when identical st_intersects filter is repeated twice

Sandro Santilli-2
On Thu, May 21, 2015 at 12:53:00AM +0000, Trang Nguyen wrote:
> Hi,
>
> I am tuning query performance on the  following query and am seeing two completely different query plans for the following. The only difference is that the second query repeats the same filter “st_intersects(waypoints, zone_geom)” a second time, which somehow  seems to trigger an index scan on the gist indexed LINESTRING column “waypoints” that doesn’t occur in the first query.

I guess it depends on the cost of the ST_Intersects function, which is
likely run twice when specified twice. Someone explained it to me but I
keep not getting why an immutable function should be ever called twice...

You might want to report this to PostgreSQL to try at understanding the
gory details.

--strk;
_______________________________________________
postgis-users mailing list
[hidden email]
http://lists.osgeo.org/cgi-bin/mailman/listinfo/postgis-users