Discussion:
Function runtime increases after 5 calls in the same session.
Ilya I. Ashchepkov
2014-10-21 09:53:16 UTC
Permalink
Hi, all.

I wrote a function and during testing it I came across the strange
behaviour.
Function runtime is about 200ms first 5 times, 6th and futher calls takes
~22000 ms.
I simplified my schema, you can see it in attached file.
I've tested on 9.4beta3 and 9.3.5.

When I removed security_barrier from the view or alter column "time" type
to integer (bigint, I don't remember) all goes as expected.

It looks very strange for me.
I will be pleased if somebody checks this.
--
С уважеМОеЌ,
АщепкПв Илья ***@gmail.com
Marti Raudsepp
2014-10-21 10:20:28 UTC
Permalink
Hi
Post by Ilya I. Ashchepkov
I wrote a function and during testing it I came across the strange
behaviour.
Function runtime is about 200ms first 5 times, 6th and futher calls takes
~22000 ms.
I simplified my schema, you can see it in attached file.
I've tested on 9.4beta3 and 9.3.5.
I didn't look at the test case. But most likely the problem is that
after 5 executions, plancache decides that a generic plan is no more
expensive than a custom plan, and reverts to executing the generic
one. But in reality there is a big difference in execution time.

See this for how the logic works:
https://github.com/postgres/postgres/blob/master/src/backend/utils/cache/plancache.c#L1036

As a workaround you could use PL/pgSQL EXECUTE to run the query, which
isn't subject to generic plans:
http://www.postgresql.org/docs/current/static/plpgsql-statements.html#PLPGSQL-STATEMENTS-EXECUTING-DYN

Regards,
Marti
--
Sent via pgsql-general mailing list (pgsql-***@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
Ilya Ashchepkov
2014-10-22 02:43:00 UTC
Permalink
Post by Marti Raudsepp
Hi
Post by Ilya I. Ashchepkov
I wrote a function and during testing it I came across the strange
behaviour.
Function runtime is about 200ms first 5 times, 6th and futher calls takes
~22000 ms.
I simplified my schema, you can see it in attached file.
I've tested on 9.4beta3 and 9.3.5.
I didn't look at the test case. But most likely the problem is that
after 5 executions, plancache decides that a generic plan is no more
expensive than a custom plan, and reverts to executing the generic
one. But in reality there is a big difference in execution time.
https://github.com/postgres/postgres/blob/master/src/backend/utils/cache/plancache.c#L1036
As a workaround you could use PL/pgSQL EXECUTE to run the query, which
http://www.postgresql.org/docs/current/static/plpgsql-statements.html#PLPGSQL-STATEMENTS-EXECUTING-DYN
Regards,
Marti
--
http://www.postgresql.org/mailpref/pgsql-general
Thank you. I tried execute and it helps.
--
--
Sent via pgsql-general mailing list (pgsql-***@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
Tom Lane
2014-10-22 03:53:59 UTC
Permalink
Post by Marti Raudsepp
Post by Ilya I. Ashchepkov
I wrote a function and during testing it I came across the strange
behaviour.
Function runtime is about 200ms first 5 times, 6th and futher calls takes
~22000 ms.
I simplified my schema, you can see it in attached file.
I've tested on 9.4beta3 and 9.3.5.
I didn't look at the test case. But most likely the problem is that
after 5 executions, plancache decides that a generic plan is no more
expensive than a custom plan, and reverts to executing the generic
one. But in reality there is a big difference in execution time.
Yeah. Ilya didn't show the exact case he was testing, but I suppose
it was one where the timestamptz range covered the whole table (since
the test script created a table with a mighty tiny range of date values,
it would have taken some effort to do otherwise). Anyway I tried it with
this case:
select test.fun (1, 'yesterday', 'tomorrow');
and indeed reproduced a big slowdown in later executions.

You can examine what the planner is doing with prepared statements,
such as ones in plpgsql functions, via PREPARE/EXECUTE:

regression=# prepare foo(int, timestamptz, timestamptz) as
select sum(V1.data + V2.data)
from test.view V1
inner join test.view V2 using(id)
where
V1.object_id=$1
and V1.time between $2 and $3
and V2.time between $2 and $3;
PREPARE

The first time you try it you get:

regression=# explain execute foo (1, 'yesterday', 'tomorrow');
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=7037.25..7037.26 rows=1 width=8)
-> Merge Join (cost=5909.71..6759.53 rows=55544 width=8)
Merge Cond: (v1.id = v2.id)
-> Sort (cost=2967.35..2975.69 rows=3333 width=8)
Sort Key: v1.id
-> Subquery Scan on v1 (cost=0.00..2772.33 rows=3333 width=8)
-> Seq Scan on data (cost=0.00..2739.00 rows=3333 width=8)
Filter: (("time" >= '2014-10-20 00:00:00-04'::timestamp with time zone) AND ("time" <= '2014-10-22 00:00:00-04'::timestamp with time zone) AND (object_id = 1) AND test.can_read(object_id))
-> Sort (cost=2942.35..2950.69 rows=3333 width=8)
Sort Key: v2.id
-> Subquery Scan on v2 (cost=0.00..2747.33 rows=3333 width=8)
-> Seq Scan on data data_1 (cost=0.00..2714.00 rows=3333 width=8)
Filter: (("time" >= '2014-10-20 00:00:00-04'::timestamp with time zone) AND ("time" <= '2014-10-22 00:00:00-04'::timestamp with time zone) AND test.can_read(object_id))
(13 rows)

After repeating that a few times it switches to:

regression=# explain execute foo (1, 'yesterday', 'tomorrow');
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------
Aggregate (cost=428.63..428.64 rows=1 width=8)
-> Nested Loop (cost=5.20..428.54 rows=17 width=8)
Join Filter: (data.id = v2.id)
-> Bitmap Heap Scan on data (cost=4.91..81.29 rows=17 width=8)
Recheck Cond: ((object_id = $1) AND ("time" >= $2) AND ("time" <= $3))
Filter: test.can_read(object_id)
-> Bitmap Index Scan on data_object_id_time_idx (cost=0.00..4.91 rows=50 width=0)
Index Cond: ((object_id = $1) AND ("time" >= $2) AND ("time" <= $3))
-> Materialize (cost=0.29..342.79 rows=17 width=8)
-> Subquery Scan on v2 (cost=0.29..342.71 rows=17 width=8)
-> Index Scan using data_object_id_time_idx on data data_1 (cost=0.29..342.54 rows=17 width=8)
Index Cond: (("time" >= $2) AND ("time" <= $3))
Filter: test.can_read(object_id)
(13 rows)

Notice the plan now contains parameter markers $n instead of literal
constants; this is because this is a generic plan. The problem is
that the planner has estimated this case a lot cheaper than it did
when it saw the actual parameter values (and could determine that the
query was going to have to scan the entire table). Without the actual
parameter values, it has to fall back on rules of thumb to estimate
how much of the table will be scanned. Although it doesn't know the
comparison values, it does recognize the "between" constructs as being
range restrictions, and its rule of thumb is that those are fairly
selective --- about 1% IIRC. So it thinks the query will not need to
read very much of the table and chooses a plan that'd be appropriate
if that were true.

The custom-vs-generic-plan heuristic is designed for cases where the
generic plan is based on unduly pessimistic assumptions; which is where
most of the problems have been historically. Here we've got one where
the generic plan is based on unduly optimistic assumptions. Not sure
how we might fix that without breaking the more commonly complained-of
case.

regards, tom lane
--
Sent via pgsql-general mailing list (pgsql-***@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
Ilya Ashchepkov
2014-10-22 16:38:31 UTC
Permalink
Post by Tom Lane
Post by Marti Raudsepp
Post by Ilya I. Ashchepkov
I wrote a function and during testing it I came across the strange
behaviour.
Function runtime is about 200ms first 5 times, 6th and futher calls takes
~22000 ms.
I simplified my schema, you can see it in attached file.
I've tested on 9.4beta3 and 9.3.5.
I didn't look at the test case. But most likely the problem is that
after 5 executions, plancache decides that a generic plan is no more
expensive than a custom plan, and reverts to executing the generic
one. But in reality there is a big difference in execution time.
Yeah. Ilya didn't show the exact case he was testing, but I suppose
it was one where the timestamptz range covered the whole table (since
the test script created a table with a mighty tiny range of date values,
it would have taken some effort to do otherwise). Anyway I tried it with
select test.fun (1, 'yesterday', 'tomorrow');
and indeed reproduced a big slowdown in later executions.
You can examine what the planner is doing with prepared statements,
regression=# prepare foo(int, timestamptz, timestamptz) as
select sum(V1.data + V2.data)
from test.view V1
inner join test.view V2 using(id)
where
V1.object_id=$1
and V1.time between $2 and $3
and V2.time between $2 and $3;
PREPARE
regression=# explain execute foo (1, 'yesterday', 'tomorrow');
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=7037.25..7037.26 rows=1 width=8)
-> Merge Join (cost=5909.71..6759.53 rows=55544 width=8)
Merge Cond: (v1.id = v2.id)
-> Sort (cost=2967.35..2975.69 rows=3333 width=8)
Sort Key: v1.id
-> Subquery Scan on v1 (cost=0.00..2772.33 rows=3333 width=8)
-> Seq Scan on data (cost=0.00..2739.00 rows=3333 width=8)
Filter: (("time" >= '2014-10-20 00:00:00-04'::timestamp with time zone) AND ("time" <= '2014-10-22 00:00:00-04'::timestamp with time zone) AND (object_id = 1) AND test.can_read(object_id))
-> Sort (cost=2942.35..2950.69 rows=3333 width=8)
Sort Key: v2.id
-> Subquery Scan on v2 (cost=0.00..2747.33 rows=3333 width=8)
-> Seq Scan on data data_1 (cost=0.00..2714.00 rows=3333 width=8)
Filter: (("time" >= '2014-10-20 00:00:00-04'::timestamp with time zone) AND ("time" <= '2014-10-22 00:00:00-04'::timestamp with time zone) AND test.can_read(object_id))
(13 rows)
regression=# explain execute foo (1, 'yesterday', 'tomorrow');
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------
Aggregate (cost=428.63..428.64 rows=1 width=8)
-> Nested Loop (cost=5.20..428.54 rows=17 width=8)
Join Filter: (data.id = v2.id)
-> Bitmap Heap Scan on data (cost=4.91..81.29 rows=17 width=8)
Recheck Cond: ((object_id = $1) AND ("time" >= $2) AND ("time" <= $3))
Filter: test.can_read(object_id)
-> Bitmap Index Scan on data_object_id_time_idx (cost=0.00..4.91 rows=50 width=0)
Index Cond: ((object_id = $1) AND ("time" >= $2) AND ("time" <= $3))
-> Materialize (cost=0.29..342.79 rows=17 width=8)
-> Subquery Scan on v2 (cost=0.29..342.71 rows=17 width=8)
-> Index Scan using data_object_id_time_idx on data data_1 (cost=0.29..342.54 rows=17 width=8)
Index Cond: (("time" >= $2) AND ("time" <= $3))
Filter: test.can_read(object_id)
(13 rows)
Notice the plan now contains parameter markers $n instead of literal
constants; this is because this is a generic plan. The problem is
that the planner has estimated this case a lot cheaper than it did
when it saw the actual parameter values (and could determine that the
query was going to have to scan the entire table). Without the actual
parameter values, it has to fall back on rules of thumb to estimate
how much of the table will be scanned. Although it doesn't know the
comparison values, it does recognize the "between" constructs as being
range restrictions, and its rule of thumb is that those are fairly
selective --- about 1% IIRC. So it thinks the query will not need to
read very much of the table and chooses a plan that'd be appropriate
if that were true.
The custom-vs-generic-plan heuristic is designed for cases where the
generic plan is based on unduly pessimistic assumptions; which is where
most of the problems have been historically. Here we've got one where
the generic plan is based on unduly optimistic assumptions. Not sure
how we might fix that without breaking the more commonly complained-of
case.
regards, tom lane
--
http://www.postgresql.org/mailpref/pgsql-general
Thank you for detailed explanation.
You guessed right my test, I forgot to add it to the first message.
I'll add more data and try more realistic test
--
Ilya
--
Sent via pgsql-general mailing list (pgsql-***@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
Loading...