PostgreSQL

How to order distinct tuples in a PostgreSQL query

The leftmost ORDER BY items cannot disagree with the items of the DISTINCT clause. To quote the manual on DISTINCT: "The DISTINCT ON expression(s) must match the leftmost ORDER BY expression(s)."

Try:

SELECT *
FROM  (
    SELECT DISTINCT ON (c.cluster_id, feed_id) 
           c.cluster_id, num_docs, feed_id, url_time 
    FROM   url_info u
    JOIN   cluster_info c ON (c.cluster_id = u.cluster_id) 
    WHERE  feed_id IN (SELECT pot_seeder FROM potentials) 
    AND    num_docs > 5
    AND    url_time > '2012-04-16'
    ORDER  BY c.cluster_id, feed_id, num_docs, url_time
           -- first columns match DISTINCT
           -- the rest to pick certain values for dupes
           -- or did you want to pick random values for dupes?
    ) x
ORDER  BY num_docs DESC;
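
To see the rule in isolation, here is a minimal sketch on a made-up table t(grp, val). The table and its rows are assumptions for illustration and are not part of the question's schema.

CREATE TEMP TABLE t (grp int, val int);   -- hypothetical demo table
INSERT INTO t VALUES (1, 10), (1, 20), (2, 5), (2, 30);

-- Rejected: the leftmost ORDER BY item (val) is not the DISTINCT ON item (grp)
-- SELECT DISTINCT ON (grp) grp, val FROM t ORDER BY val DESC;
-- ERROR:  SELECT DISTINCT ON expressions must match initial ORDER BY expressions

-- Accepted: dedupe in a subquery, then sort the result freely outside
SELECT *
FROM  (
    SELECT DISTINCT ON (grp) grp, val
    FROM   t
    ORDER  BY grp, val      -- keeps the row with the smallest val per grp
    ) x
ORDER  BY val DESC;

This is why the query above sorts by num_docs only in the outer SELECT.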

Or use GROUP BY:

SELECT c.cluster_id
     , num_docs
     , feed_id
     , url_time 
FROM   url_info u
JOIN   cluster_info c ON (c.cluster_id = u.cluster_id) 
WHERE  feed_id IN (SELECT pot_seeder FROM potentials) 
AND    num_docs > 5
AND    url_time > '2012-04-16'
GROUP  BY c.cluster_id, feed_id 
ORDER  BY num_docs DESC;

If c.cluster_id, feed_id are the primary key columns of all tables (both, in this case) from which you include columns in the SELECT list, then this works as written, but only in PostgreSQL 9.1 or later.
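
The primary key case relies on functional dependency: once a table's primary key is in GROUP BY, its other columns may appear in the SELECT list unaggregated. A minimal sketch, using a hypothetical table cluster_demo rather than the question's schema:

CREATE TEMP TABLE cluster_demo (
    cluster_id int PRIMARY KEY            -- hypothetical demo table
  , num_docs   int
);
INSERT INTO cluster_demo VALUES (1, 7), (2, 12);

SELECT cluster_id, num_docs              -- num_docs is neither grouped nor aggregated
FROM   cluster_demo
GROUP  BY cluster_id                     -- allowed since 9.1: cluster_id is the primary key
ORDER  BY num_docs DESC;

Without the PRIMARY KEY constraint, PostgreSQL rejects the same SELECT with an error saying that num_docs must appear in the GROUP BY clause or be used in an aggregate function.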

Otherwise you need to GROUP BY the rest of the columns, aggregate them, or provide more information.
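
One way the aggregate route could look, as a sketch only: it assumes that the highest num_docs and the earliest url_time per (cluster_id, feed_id) pair are the values wanted, which the question does not state. The aliases max_docs and first_url_time are made up for illustration; the tables are the ones used in the queries above.

SELECT c.cluster_id
     , feed_id
     , max(num_docs) AS max_docs        -- assumption: keep the highest count per group
     , min(url_time) AS first_url_time  -- assumption: keep the earliest time per group
FROM   url_info u
JOIN   cluster_info c ON (c.cluster_id = u.cluster_id)
WHERE  feed_id IN (SELECT pot_seeder FROM potentials)
AND    num_docs > 5
AND    url_time > '2012-04-16'
GROUP  BY c.cluster_id, feed_id
ORDER  BY max_docs DESC;

Unlike DISTINCT ON, the two aggregates are computed independently, so max_docs and first_url_time may come from different rows of the same group.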