Most recent entry, up to date, by category: optimization

Question

I have a table in a PostgreSQL database called feeds_up. It looks like this:

| feed_url | isup | hasproblems | observed (timestamp with tz)  | id (pk)|
|----------|------|-------------|-------------------------------|--------|
| http://b.| t    | f           | 2013-02-27 16:34:46.327401+11 | 15235  |
| http://f.| f    | t           | 2013-02-27 16:31:25.415126+11 | 15236  |

It has about 300k rows and grows by ~20 rows every five minutes. I have a query that runs very often (on every page load):

select distinct on (feed_url) feed_url, isUp, hasProblems
    from feeds_up
    where observed <= '2013-02-27T05:38:00.000Z'
    order by feed_url, observed desc;

The timestamp here is just an example; in practice it is a parameter. The EXPLAIN ANALYZE output is at explain.depesz.com. The query takes about 8 seconds. Crazy!
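For anyone wanting to reproduce the timing, a plan like the one pasted to explain.depesz.com is normally obtained by prefixing the statement with EXPLAIN ANALYZE. This is a sketch of that step using the same example timestamp, not the exact command that was run:

```sql
-- Runs the query for real and reports the chosen plan with actual timings.
EXPLAIN ANALYZE
SELECT DISTINCT ON (feed_url) feed_url, isup, hasproblems
FROM feeds_up
WHERE observed <= '2013-02-27T05:38:00.000Z'
ORDER BY feed_url, observed DESC;
```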

There are only about 20 distinct values of feed_url, so this seems really inefficient. I thought I'd do something dumb and try a FOR loop in a function.

CREATE OR REPLACE FUNCTION feedStatusAtDate(theTime timestamp with time zone) RETURNS SETOF feeds_up AS
$BODY$
DECLARE
    url feeds_list%rowtype;
BEGIN
FOR url IN SELECT * FROM feeds_list 
LOOP
    RETURN QUERY SELECT * FROM feeds_up
    WHERE observed <= theTime
    AND feed_url = url.feed_url
    ORDER BY observed DESC LIMIT 1;
END LOOP;
END;
$BODY$ language plpgsql;

select * from feedStatusAtDate('2013-02-27T05:38:00.000Z');

It only takes 307 ms!

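FOR loops in SQL are generally considered a bad idea, so is there a better way to write this query? Or is the FOR loop actually the right approach here?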

ETA

Postgres: PostgreSQL 9.1.5 i686-pc-linux-gnu, gcc (SUSE Linux) 4.3.4 [gcc-4_3-branch revision 152973], 32-

Indexes on feeds_up:

CREATE INDEX feeds_up_url
  ON feeds_up
  USING btree
  (feed_url COLLATE pg_catalog."default");

CREATE INDEX feeds_up_url_observed
  ON feeds_up
  USING btree
  (feed_url COLLATE pg_catalog."default", observed DESC);

CREATE INDEX feeds_up_observed
  ON public.feeds_up
  USING btree
  (observed DESC);

sql for-loop postgresql

Cathy 15 . '13 0:12

, , .

, ""

"feed_url, "

, "feed_url", , , , . , , .

partition "feed_url" ( , )? "" ()?

RGPT 15 . '13 3:09

marcj · Accepted Answer · 2013-04-15T03:52:04+0000

If "id" always increases along with the observed time, you can take the MAX (id) for each feed_url and join back on it, like this:

SELECT fu.feed_url, fu.isup, fu.hasproblems, fu.observed
FROM feeds_up fu
JOIN
(
  SELECT feed_url, max(id)  AS id FROM feeds_up
  WHERE observed <= '2013-03-27T05:38:00.000Z'
  GROUP BY feed_url
) AS q USING (id)
ORDER BY fu.feed_url, fu.observed desc;
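One caveat before relying on max(id): the query only returns the latest observation per feed if id grows in the same order as observed. A rough sanity check, assuming the feeds_up columns shown in the question, counts rows where a higher id carries an earlier observed value within the same feed:

```sql
-- Walk each feed's rows in id order and compare every row's timestamp
-- with the previous row's; any row that goes backwards in time is counted.
SELECT count(*) AS out_of_order
FROM (
  SELECT observed,
         lag(observed) OVER (PARTITION BY feed_url ORDER BY id) AS prev_observed
  FROM feeds_up
) AS s
WHERE prev_observed > observed;
```

A result of 0 suggests the two orderings agree on the current data; it says nothing about how future rows will be inserted.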

, , "".

UPDATE:

If "id" cannot be relied on to follow the observed order, use max(observed) instead:

SELECT DISTINCT ON (fu.feed_url) fu.feed_url, fu.isup, fu.hasproblems, fu.observed
FROM feeds_up fu
JOIN
(
  SELECT feed_url, max(observed) as observed FROM feeds_up
  WHERE observed <= '2013-03-27T05:38:00.000Z'
  GROUP BY feed_url
) AS q USING (feed_url, observed)
ORDER BY fu.feed_url, fu.observed desc;

"". YMMV
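For comparison, the per-feed lookup from the question's FOR loop can also be written as a single statement. This is only a sketch and has not been measured against the 300k-row table. It assumes feeds_list has one row per feed with a feed_url column (as the function in the question implies) and that id is the primary key of feeds_up. Each subquery is the same ORDER BY observed DESC LIMIT 1 probe the loop runs, so the planner has a chance to use the feeds_up_url_observed index. It avoids LATERAL, which PostgreSQL 9.1 does not have.

```sql
SELECT fu.feed_url, fu.isup, fu.hasproblems, fu.observed
FROM feeds_list AS fl
JOIN feeds_up AS fu
  ON fu.id = (
       -- Newest row for this feed at or before the cutoff.
       -- Yields NULL (so no joined row) when the feed has no such observation.
       SELECT i.id
       FROM feeds_up AS i
       WHERE i.feed_url = fl.feed_url
         AND i.observed <= '2013-02-27T05:38:00.000Z'
       ORDER BY i.observed DESC
       LIMIT 1
     )
ORDER BY fu.feed_url;
```

Running it under EXPLAIN ANALYZE next to the function call would show whether it actually matches the loop's timing on a given installation.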
