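The leftmost ORDER BY items cannot disagree with the items of the DISTINCT ON clause. The PostgreSQL manual says as much in its section on DISTINCT: the DISTINCT ON expressions must match the leftmost ORDER BY expressions, and any further ORDER BY expressions decide which row is kept within each group of duplicates.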
Try:
SELECT *
FROM  (
   SELECT DISTINCT ON (c.cluster_id, feed_id)
          c.cluster_id, num_docs, feed_id, url_time
   FROM   url_info u
   JOIN   cluster_info c ON (c.cluster_id = u.cluster_id)
   WHERE  feed_id IN (SELECT pot_seeder FROM potentials)
   AND    num_docs > 5
   AND    url_time > '2012-04-16'
   ORDER  BY c.cluster_id, feed_id, num_docs, url_time
          -- first columns match DISTINCT
          -- the rest to pick certain values for dupes
          -- or did you want to pick random values for dupes?
   ) x
ORDER  BY num_docs DESC;
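To see how the trailing ORDER BY items decide which row survives per group, here is a minimal sketch on made-up inline data. The names t, grp and val are hypothetical and not part of your schema:

```sql
-- DISTINCT ON keeps the first row of each grp group, in ORDER BY order.
-- The leading ORDER BY item (grp) matches the DISTINCT ON expression;
-- the trailing item (val DESC) picks the row with the highest val.
SELECT DISTINCT ON (grp) grp, val
FROM  (VALUES (1, 10), (1, 20), (2, 5)) AS t(grp, val)
ORDER  BY grp, val DESC;
-- returns (1, 20) and (2, 5)
```

Changing the trailing item to plain val would keep (1, 10) instead. Leaving it out entirely makes the pick within each group unpredictable.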
Or use GROUP BY:
SELECT c.cluster_id
     , num_docs
     , feed_id
     , url_time
FROM   url_info u
JOIN   cluster_info c ON (c.cluster_id = u.cluster_id)
WHERE  feed_id IN (SELECT pot_seeder FROM potentials)
AND    num_docs > 5
AND    url_time > '2012-04-16'
GROUP  BY c.cluster_id, feed_id
ORDER  BY num_docs DESC;
If c.cluster_id, feed_id are the primary key columns of all tables (both, in this case) from which you include columns in the SELECT list, then this works as is in PostgreSQL 9.1 or later, which recognizes columns that are functionally dependent on a grouped primary key.
Otherwise you need to GROUP BY the rest of the columns, aggregate them, or provide more information.
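If the primary key condition does not hold, one possible shape for the aggregate variant is sketched below. Using max() for both columns is only an assumption about which value you want per group, and the aliases max_num_docs and latest_url_time are made up for illustration:

```sql
-- Every selected column is either listed in GROUP BY or wrapped in an
-- aggregate, so this does not rely on primary key dependencies.
SELECT c.cluster_id
     , feed_id
     , max(num_docs) AS max_num_docs     -- assumed: highest count per group
     , max(url_time) AS latest_url_time  -- assumed: most recent time per group
FROM   url_info u
JOIN   cluster_info c ON (c.cluster_id = u.cluster_id)
WHERE  feed_id IN (SELECT pot_seeder FROM potentials)
AND    num_docs > 5
AND    url_time > '2012-04-16'
GROUP  BY c.cluster_id, feed_id
ORDER  BY max_num_docs DESC;
```

Note that the two aggregates are computed independently, so max_num_docs and latest_url_time may come from different rows of the same group. If you need values from one and the same row, the DISTINCT ON query is the better fit.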