Cal Heldenbrand
2014-09-05 17:31:30 UTC
Hi everyone,
I'm trying to run a select query from a span of child partitions, separated
out in daily tables, in Postgres 9.1.5. The parent looks like this:
# \d logins
Table "public.logins"
Column | Type | Modifiers
-------------+-----------------------------+-----------
username | character varying(50) |
event | character varying(20) |
time | timestamp without time zone |
host | character varying(18) |
hash | character varying(32) |
Triggers:
logins_log_trigger BEFORE INSERT ON logins FOR EACH ROW EXECUTE
PROCEDURE logins_insert_trigger()
Number of child tables: 1581 (Use \d+ to list them.)
And an example child table:
# \d logins_20140904
Table "public.logins_20140904"
Column | Type | Modifiers
-------------+-----------------------------+-----------
username | character varying(50) |
event | character varying(20) |
time | timestamp without time zone |
host | character varying(18) |
hash | character varying(32) |
Indexes:
"logins_20140904_event" hash (event)
"logins_20140904_event_time" btree (event, "time")
"logins_20140904_username" hash (username)
"logins_20140904_username_time" btree (username, "time")
Check constraints:
"logins_20140904_time_check" CHECK ("time" >= '2014-09-04
00:00:00'::timestamp without time zone AND "time" <= '2014-09-04
23:59:59.99'::timestamp without time zone)
Inherits: logins
I'm attempting to run a query that looks something like this:
explain analyze select time,event from logins
where username='bob' and hash='1234' and time > current_date - interval
'1 week';
Result (cost=0.00..765.11 rows=1582 width=14)
-> Append (cost=0.00..765.11 rows=1582 width=14)
-> Seq Scan on logins (cost=0.00..0.00 rows=1 width=66)
Filter: (((username)::text = 'bob'::text) AND ((hash)::text =
'1234'::text) AND ("time" > (('now'::text)::date - '7 days'::interval)))
-> Index Scan using logins_20100501_username_time on
logins_20100501 logins (cost=0.01..0.48 rows=1 width=14)
...
This shows that it's attempting to run the query against all of my 1500
child tables.
If I change it to manually specify dates with an IN clause, it selects the
appropriate tables:
explain analyze select time,event from logins
where username='bob' and hash='1234' and time in ('2014-09-04',
'2014-09-05', '2014-09-03');
Result (cost=0.00..2.41 rows=3 width=31) (actual time=0.060..0.060 rows=0
loops=1)
-> Append (cost=0.00..2.41 rows=3 width=31) (actual time=0.060..0.060
rows=0 loops=1)
-> Seq Scan on logins (cost=0.00..0.00 rows=1 width=66) (actual
time=0.001..0.001 rows=0 loops=1)
Filter: (((username)::text = 'bob'::text) AND ((hash)::text
= '1234'::text) AND ("time" = ANY ('{"2014-09-04 00:00:00","2014-09-05
00:00:00","2014-09-03 00:00:00"}'::timestamp without time zone[])))
-> Bitmap Heap Scan on logins_20140903 logins (cost=1.09..1.20
rows=1 width=14) (actual time=0.039..0.039 rows=0 loops=1)
...
I know I could construct a query manually in my client by walking back
through the calendar dates, but for my own curiosity I'd like to find out a
more elegant way to run this query.
So far, I've been messing around with generating a series, then collecting
it back into an array, but nothing I've tried seems to work.
A few examples:
explain analyze select time,event from logins
where username='bob' and hash=1234' and time in (
generate_series(current_date - interval '3 days', current_date,
interval '1 day')
);
ERROR: argument of IN must not return a set
explain analyze select time,event from logins
where username='bob' and hash='1234' and time in (
select array_agg(series)
from generate_series(current_date - interval '3 days', current_date,
interval '1 day')
as u(series)
);
ERROR: operator does not exist: timestamp without time zone = timestamp
without time zone[]
explain analyze select time,event from logins
where username='bob' and hash=1234 and time in (
select unnest(array_agg(date_trunc('day',series))) from
generate_series(current_date - interval '3 days', current_date,
interval '1 day') as u(series)
)
-- No errors, but still queries all child tables.
Thank you!
--Cal
I'm trying to run a select query from a span of child partitions, separated
out in daily tables, in Postgres 9.1.5. The parent looks like this:
# \d logins
Table "public.logins"
Column | Type | Modifiers
-------------+-----------------------------+-----------
username | character varying(50) |
event | character varying(20) |
time | timestamp without time zone |
host | character varying(18) |
hash | character varying(32) |
Triggers:
logins_log_trigger BEFORE INSERT ON logins FOR EACH ROW EXECUTE
PROCEDURE logins_insert_trigger()
Number of child tables: 1581 (Use \d+ to list them.)
And an example child table:
# \d logins_20140904
Table "public.logins_20140904"
Column | Type | Modifiers
-------------+-----------------------------+-----------
username | character varying(50) |
event | character varying(20) |
time | timestamp without time zone |
host | character varying(18) |
hash | character varying(32) |
Indexes:
"logins_20140904_event" hash (event)
"logins_20140904_event_time" btree (event, "time")
"logins_20140904_username" hash (username)
"logins_20140904_username_time" btree (username, "time")
Check constraints:
"logins_20140904_time_check" CHECK ("time" >= '2014-09-04
00:00:00'::timestamp without time zone AND "time" <= '2014-09-04
23:59:59.99'::timestamp without time zone)
Inherits: logins
I'm attempting to run a query that looks something like this:
explain analyze select time,event from logins
where username='bob' and hash='1234' and time > current_date - interval
'1 week';
Result (cost=0.00..765.11 rows=1582 width=14)
-> Append (cost=0.00..765.11 rows=1582 width=14)
-> Seq Scan on logins (cost=0.00..0.00 rows=1 width=66)
Filter: (((username)::text = 'bob'::text) AND ((hash)::text =
'1234'::text) AND ("time" > (('now'::text)::date - '7 days'::interval)))
-> Index Scan using logins_20100501_username_time on
logins_20100501 logins (cost=0.01..0.48 rows=1 width=14)
...
This shows that it's attempting to run the query against all of my 1500
child tables.
If I change it to manually specify dates with an IN clause, it selects the
appropriate tables:
explain analyze select time,event from logins
where username='bob' and hash='1234' and time in ('2014-09-04',
'2014-09-05', '2014-09-03');
Result (cost=0.00..2.41 rows=3 width=31) (actual time=0.060..0.060 rows=0
loops=1)
-> Append (cost=0.00..2.41 rows=3 width=31) (actual time=0.060..0.060
rows=0 loops=1)
-> Seq Scan on logins (cost=0.00..0.00 rows=1 width=66) (actual
time=0.001..0.001 rows=0 loops=1)
Filter: (((username)::text = 'bob'::text) AND ((hash)::text
= '1234'::text) AND ("time" = ANY ('{"2014-09-04 00:00:00","2014-09-05
00:00:00","2014-09-03 00:00:00"}'::timestamp without time zone[])))
-> Bitmap Heap Scan on logins_20140903 logins (cost=1.09..1.20
rows=1 width=14) (actual time=0.039..0.039 rows=0 loops=1)
...
I know I could construct a query manually in my client by walking back
through the calendar dates, but for my own curiosity I'd like to find out a
more elegant way to run this query.
So far, I've been messing around with generating a series, then collecting
it back into an array, but nothing I've tried seems to work.
A few examples:
explain analyze select time,event from logins
where username='bob' and hash=1234' and time in (
generate_series(current_date - interval '3 days', current_date,
interval '1 day')
);
ERROR: argument of IN must not return a set
explain analyze select time,event from logins
where username='bob' and hash='1234' and time in (
select array_agg(series)
from generate_series(current_date - interval '3 days', current_date,
interval '1 day')
as u(series)
);
ERROR: operator does not exist: timestamp without time zone = timestamp
without time zone[]
explain analyze select time,event from logins
where username='bob' and hash=1234 and time in (
select unnest(array_agg(date_trunc('day',series))) from
generate_series(current_date - interval '3 days', current_date,
interval '1 day') as u(series)
)
-- No errors, but still queries all child tables.
Thank you!
--Cal