An Overview of Just-in-Time Compilation (JIT) for PostgreSQL

Historically, PostgreSQL has provided compilation features in the form of ahead-of-time compilation for PL/pgSQL functions, and version 10 introduced expression compilation. Neither of these generates machine code, though.

JIT for SQL was discussed many years ago, and for PostgreSQL the feature is the result of a substantial code change.

To check whether the PostgreSQL binary was built with LLVM support, use the pg_config command to display the compile flags and look for --with-llvm in the output. Example for the PGDG RPM distribution:

omiday ~ $ /usr/pgsql-11/bin/pg_config --configure
'--enable-rpath' '--prefix=/usr/pgsql-11' '--includedir=/usr/pgsql-11/include' '--mandir=/usr/pgsql-11/share/man' '--datadir=/usr/pgsql-11/share' '--enable-tap-tests' '--with-icu' '--with-llvm' '--with-perl' '--with-python' '--with-tcl' '--with-tclconfig=/usr/lib64' '--with-openssl' '--with-pam' '--with-gssapi' '--with-includes=/usr/include' '--with-libraries=/usr/lib64' '--enable-nls' '--enable-dtrace' '--with-uuid=e2fs' '--with-libxml' '--with-libxslt' '--with-ldap' '--with-selinux' '--with-systemd' '--with-system-tzdata=/usr/share/zoneinfo' '--sysconfdir=/etc/sysconfig/pgsql' '--docdir=/usr/pgsql-11/doc' '--htmldir=/usr/pgsql-11/doc/html' 'CFLAGS=-O2 -g -pipe -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -Wp,-D_GLIBCXX_ASSERTIONS -fexceptions -fstack-protector-strong -grecord-gcc-switches -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection' 'PKG_CONFIG_PATH=:/usr/lib64/pkgconfig:/usr/share/pkgconfig'
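
Scanning that long line by eye is error-prone, so the check can also be scripted. The following is a minimal sketch, not part of the original test setup: the helper name built_with_llvm is hypothetical, and the sample string is an abbreviated copy of the output above.

```python
import shlex


def built_with_llvm(configure_output: str) -> bool:
    """Return True if the `pg_config --configure` output lists --with-llvm."""
    # pg_config wraps every configure argument in single quotes; shlex.split
    # strips the quotes and keeps quoted values such as CFLAGS in one piece,
    # so only a standalone '--with-llvm' argument matches.
    return "--with-llvm" in shlex.split(configure_output)


# Abbreviated sample taken from the output shown above.
sample = "'--enable-rpath' '--prefix=/usr/pgsql-11' '--with-icu' '--with-llvm' 'CFLAGS=-O2 -g -pipe'"
print(built_with_llvm(sample))
```

On a real system the input string could be obtained with subprocess.run(["pg_config", "--configure"], capture_output=True, text=True).stdout, assuming pg_config is on the PATH.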

Why LLVM JIT?

It all started about two years ago, as explained in Andres Freund's post, when expression evaluation and tuple deforming proved to be the bottlenecks in speeding up large queries. After adding the JIT implementation, "expression evaluation itself is more than ten times faster than before", in Andres' words. Further, the Q&A section ending his post explains the choice of LLVM over other implementations.

Although LLVM was the chosen provider, the jit_provider GUC parameter can be used to point to another JIT provider. Note, however, that inlining support is available only when using the LLVM provider, because of the way the build process works.

When to JIT?

The documentation is clear: long-running, CPU-bound queries will benefit from JIT compilation. In addition, the mailing list discussions referenced throughout this blog point out that JIT is too expensive for queries that are executed only once.

Compared with programming languages, PostgreSQL has the advantage of "knowing" when to JIT, by relying on the query planner. To that end, a number of GUC parameters were introduced. To protect users from negative surprises when enabling JIT, the cost-related parameters are intentionally set to reasonably high values. Note that setting the JIT cost parameters to "0" forces every query to be JIT-compiled, so even trivial queries pay the compilation overhead and run slower as a result.
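
To make the role of the cost parameters concrete, here is a minimal sketch of the decision in Python. It is a simplified model rather than the planner's actual code: the JitSettings class and jit_decisions function are hypothetical names, the threshold defaults are the documented ones (100000, 500000, 500000), and a value of -1 is treated as "disabled", as the documentation describes.

```python
from dataclasses import dataclass


@dataclass
class JitSettings:
    jit: bool                                # the jit GUC (on/off)
    jit_above_cost: float = 100000           # documented default
    jit_inline_above_cost: float = 500000    # documented default
    jit_optimize_above_cost: float = 500000  # documented default


def jit_decisions(plan_cost: float, s: JitSettings) -> dict:
    """Model which JIT steps apply to a plan with the given total cost."""

    def over(threshold: float) -> bool:
        # A negative threshold disables the step; otherwise the estimated
        # plan cost has to exceed the threshold.
        return threshold >= 0 and plan_cost > threshold

    compile_ = s.jit and over(s.jit_above_cost)
    return {
        "compile": compile_,
        # Inlining and expensive optimization only matter once JIT is used.
        "inline": compile_ and over(s.jit_inline_above_cost),
        "optimize": compile_ and over(s.jit_optimize_above_cost),
    }


# A plan with an estimated total cost of 97331.44 and the default thresholds:
print(jit_decisions(97331.44, JitSettings(jit=True)))
# The same plan with all three thresholds lowered to 10:
print(jit_decisions(97331.44, JitSettings(True, 10, 10, 10)))
```

With the default thresholds the first call reports no JIT step at all, while lowering all three thresholds to 10 turns on compilation, inlining and optimization for the same plan.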

While JIT can be generally beneficial, there are cases where having it enabled can be detrimental, as discussed in commit b9f2d4d3.

How to JIT?

As mentioned above, the RPM binary packages are LLVM-enabled. However, a few additional steps are required to get JIT compilation working.

To illustrate:

[email protected][local]:54311 test# show server_version;
server_version
----------------
11.1
(1 row)

[email protected][local]:54311 test# show port;
port
-------
54311
(1 row)

[email protected][local]:54311 test# create table t1 (id serial);
CREATE TABLE
[email protected][local]:54311 test# insert INTO t1 (id) select * from generate_series(1, 10000000);
INSERT 0 10000000
[email protected][local]:54311 test# set jit = 'on';
SET
[email protected][local]:54311 test# set jit_above_cost = 10; set jit_inline_above_cost = 10; set jit_optimize_above_cost = 10;
SET
SET
SET
[email protected][local]:54311 test# explain analyze select count(*) from t1;
                                                               QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------
Finalize Aggregate  (cost=97331.43..97331.44 rows=1 width=8) (actual time=647.585..647.585 rows=1 loops=1)
   ->  Gather  (cost=97331.21..97331.42 rows=2 width=8) (actual time=647.484..649.059 rows=3 loops=1)
         Workers Planned: 2
         Workers Launched: 2
         ->  Partial Aggregate  (cost=96331.21..96331.22 rows=1 width=8) (actual time=640.995..640.995 rows=1 loops=3)
               ->  Parallel Seq Scan on t1  (cost=0.00..85914.57 rows=4166657 width=0) (actual time=0.060..397.121 rows=3333333 loops=3)
Planning Time: 0.182 ms
Execution Time: 649.170 ms
(8 rows)

Note that I did enable JIT (which is disabled by default following the pgsql-hackers discussion referenced in commit b9f2d4d3) and adjusted the JIT cost parameters as suggested in the documentation, yet the plan above contains no JIT section.

The first hint is found in the src/backend/jit/README file referenced in the JIT documentation:

Which shared library is loaded is determined by the jit_provider GUC, defaulting to "llvmjit".

Since the RPM package does not pull in the JIT dependency automatically, as was decided after intensive discussions (see the full thread), we need to install it manually:

[[email protected] ~]# dnf install postgresql11-llvmjit

Once the installation is complete, we can test right away:

[email protected][local]:54311 test# explain analyze select count(*) from t1;
                                                               QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------
Finalize Aggregate  (cost=97331.43..97331.44 rows=1 width=8) (actual time=794.998..794.998 rows=1 loops=1)
   ->  Gather  (cost=97331.21..97331.42 rows=2 width=8) (actual time=794.870..803.680 rows=3 loops=1)
         Workers Planned: 2
         Workers Launched: 2
         ->  Partial Aggregate  (cost=96331.21..96331.22 rows=1 width=8) (actual time=689.124..689.125 rows=1 loops=3)
               ->  Parallel Seq Scan on t1  (cost=0.00..85914.57 rows=4166657 width=0) (actual time=0.062..385.278 rows=3333333 loops=3)
Planning Time: 0.150 ms
JIT:
   Functions: 4
   Options: Inlining true, Optimization true, Expressions true, Deforming true
   Timing: Generation 2.146 ms, Inlining 117.725 ms, Optimization 47.928 ms, Emission 69.454 ms, Total 237.252 ms
Execution Time: 803.789 ms
(12 rows)

We can also display the JIT details per worker:

[email protected][local]:54311 test# explain (analyze, verbose, buffers) select count(*) from t1;
                                                                  QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------
Finalize Aggregate  (cost=97331.43..97331.44 rows=1 width=8) (actual time=974.352..974.352 rows=1 loops=1)
   Output: count(*)
   Buffers: shared hit=2592 read=41656
   ->  Gather  (cost=97331.21..97331.42 rows=2 width=8) (actual time=974.166..980.942 rows=3 loops=1)
         Output: (PARTIAL count(*))
         Workers Planned: 2
         Workers Launched: 2
         JIT for worker 0:
         Functions: 2
         Options: Inlining true, Optimization true, Expressions true, Deforming true
         Timing: Generation 0.378 ms, Inlining 74.033 ms, Optimization 11.979 ms, Emission 9.470 ms, Total 95.861 ms
         JIT for worker 1:
         Functions: 2
         Options: Inlining true, Optimization true, Expressions true, Deforming true
         Timing: Generation 0.319 ms, Inlining 68.198 ms, Optimization 8.827 ms, Emission 9.580 ms, Total 86.924 ms
         Buffers: shared hit=2592 read=41656
         ->  Partial Aggregate  (cost=96331.21..96331.22 rows=1 width=8) (actual time=924.936..924.936 rows=1 loops=3)
               Output: PARTIAL count(*)
               Buffers: shared hit=2592 read=41656
               Worker 0: actual time=900.612..900.613 rows=1 loops=1
               Buffers: shared hit=668 read=11419
               Worker 1: actual time=900.763..900.763 rows=1 loops=1
               Buffers: shared hit=679 read=11608
               ->  Parallel Seq Scan on public.t1  (cost=0.00..85914.57 rows=4166657 width=0) (actual time=0.311..558.192 rows=3333333 loops=3)
                     Output: id
                     Buffers: shared hit=2592 read=41656
                     Worker 0: actual time=0.389..539.796 rows=2731662 loops=1
                     Buffers: shared hit=668 read=11419
                     Worker 1: actual time=0.082..548.518 rows=2776862 loops=1
                     Buffers: shared hit=679 read=11608
Planning Time: 0.207 ms
JIT:
   Functions: 9
   Options: Inlining true, Optimization true, Expressions true, Deforming true
   Timing: Generation 8.818 ms, Inlining 153.087 ms, Optimization 77.999 ms, Emission 64.884 ms, Total 304.787 ms
Execution Time: 989.360 ms
(36 rows)

The JIT implementation can also take advantage of the parallel query execution feature. To illustrate, let's first disable parallelization:

[email protected][local]:54311 test# set max_parallel_workers_per_gather = 0;
SET
[email protected][local]:54311 test# explain analyze select count(*) from t1;
                                                      QUERY PLAN
----------------------------------------------------------------------------------------------------------------------
Aggregate  (cost=169247.71..169247.72 rows=1 width=8) (actual time=1447.315..1447.315 rows=1 loops=1)
   ->  Seq Scan on t1  (cost=0.00..144247.77 rows=9999977 width=0) (actual time=0.064..957.563 rows=10000000 loops=1)
Planning Time: 0.053 ms
JIT:
   Functions: 2
   Options: Inlining true, Optimization true, Expressions true, Deforming true
   Timing: Generation 0.388 ms, Inlining 1.359 ms, Optimization 7.626 ms, Emission 7.963 ms, Total 17.335 ms
Execution Time: 1447.783 ms
(8 rows)

The same command with parallel query enabled completes in half the time:

[email protected][local]:54311 test# reset max_parallel_workers_per_gather ;
RESET
[email protected][local]:54311 test# explain analyze select count(*) from t1;
                                                               QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------
Finalize Aggregate  (cost=97331.43..97331.44 rows=1 width=8) (actual time=707.126..707.126 rows=1 loops=1)
   ->  Gather  (cost=97331.21..97331.42 rows=2 width=8) (actual time=706.971..712.199 rows=3 loops=1)
         Workers Planned: 2
         Workers Launched: 2
         ->  Partial Aggregate  (cost=96331.21..96331.22 rows=1 width=8) (actual time=656.102..656.103 rows=1 loops=3)
               ->  Parallel Seq Scan on t1  (cost=0.00..85914.57 rows=4166657 width=0) (actual time=0.067..384.207 rows=3333333 loops=3)
Planning Time: 0.158 ms
JIT:
   Functions: 9
   Options: Inlining true, Optimization true, Expressions true, Deforming true
   Timing: Generation 3.709 ms, Inlining 142.150 ms, Optimization 50.983 ms, Emission 33.792 ms, Total 230.634 ms
Execution Time: 715.226 ms
(12 rows)

I found it interesting to compare the results of the tests discussed in this post, run during the initial stages of the JIT implementation, with those of the final version. First make sure the conditions of the original test are met, i.e. the database must fit in memory:

[email protected][local]:54311 test# \l+
postgres  | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 |                       | 8027 kB | pg_default | default administrative connection database
template0 | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =c/postgres          +| 7889 kB | pg_default | unmodifiable empty database
          |          |          |             |             | postgres=CTc/postgres |         |            |
template1 | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =c/postgres          +| 7889 kB | pg_default | default template for new databases
          |          |          |             |             | postgres=CTc/postgres |         |            |
test      | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 |                       | 2763 MB | pg_default |


[email protected][local]:54311 test# show shared_buffers ;
3GB

Time: 0.485 ms

Run the tests with JIT disabled:

[email protected][local]:54311 test# set jit = off;
SET
Time: 0.483 ms

[email protected][local]:54311 test# select sum(c8) from t1;
   0

Time: 1036.231 ms (00:01.036)

[email protected][local]:54311 test# select sum(c2), sum(c3), sum(c4), sum(c5),
   sum(c6), sum(c7), sum(c8) from t1;
   0 |   0 |   0 |   0 |   0 |   0 |   0

Time: 1793.502 ms (00:01.794)

Next, run the tests with JIT enabled:

[email protected][local]:54311 test# set jit = on; set jit_above_cost = 10; set
jit_inline_above_cost = 10; set jit_optimize_above_cost = 10;
SET
Time: 0.473 ms
SET
Time: 0.267 ms
SET
Time: 0.204 ms
SET
Time: 0.162 ms
[email protected][local]:54311 test# select sum(c8) from t1;
   0

Time: 795.746 ms

[email protected][local]:54311 test# select sum(c2), sum(c3), sum(c4), sum(c5),
   sum(c6), sum(c7), sum(c8) from t1;
   0 |   0 |   0 |   0 |   0 |   0 |   0

Time: 1080.446 ms (00:01.080)

With JIT enabled, execution time drops by about 23% for the first test case and about 40% for the second!
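
The percentages can be checked directly from the timings reported above. The short sketch below only does that arithmetic; the function name time_reduction_pct is a hypothetical helper, and the inputs are the four "Time:" values from the two test runs.

```python
def time_reduction_pct(before_ms: float, after_ms: float) -> float:
    """Percentage by which execution time dropped between two runs."""
    return (before_ms - after_ms) / before_ms * 100


# Timings copied from the runs above (JIT off, then JIT on).
single_sum = time_reduction_pct(1036.231, 795.746)
seven_sums = time_reduction_pct(1793.502, 1080.446)
print(f"sum(c8): {single_sum:.1f}%  seven sums: {seven_sums:.1f}%")
# prints "sum(c8): 23.2%  seven sums: 39.8%"
```

These are single measurements from one machine, so they are best read as an indication rather than a benchmark.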

Lastly, it is important to remember that for prepared statements, JIT compilation is performed when the function is first executed.

Conclusion

In PostgreSQL 11, JIT compilation is disabled by default, and on RPM-based systems the installer gives no hint that the JIT package, which provides the bitcode for the default LLVM provider, needs to be installed.

When building from source, pay attention to the compile flags in order to avoid performance problems, for example when LLVM assertions are enabled.

As discussed on the pgsql-hackers list, the impact of JIT on costing is not yet fully understood, so careful planning is required before enabling the feature cluster-wide: enabling it for everything can slow down queries that would not benefit from compilation. However, JIT can be enabled on a per-query basis.
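
One way to scope JIT to a single query is to wrap it in a transaction and use SET LOCAL, which reverts the setting when the transaction ends. The sketch below merely builds such a script as a string; the helper name with_jit is hypothetical, and it assumes the caller passes exactly one SQL statement.

```python
def with_jit(query: str, enabled: bool = True) -> str:
    """Wrap one SQL statement so that the jit setting applies only to it."""
    value = "on" if enabled else "off"
    # Drop surrounding whitespace and a trailing semicolon so that exactly
    # one terminator is emitted after the statement.
    statement = query.strip().rstrip(";")
    # SET LOCAL lasts until the end of the current transaction.
    return f"BEGIN;\nSET LOCAL jit = {value};\n{statement};\nCOMMIT;"


print(with_jit("select count(*) from t1;"))
```

The resulting script can be pasted into psql or sent through any client; the session's jit setting is back to its previous value after COMMIT.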

For in-depth information on the implementation of JIT compilation, review the project's Git logs, the Commitfests, and the pgsql-hackers mail thread.

Happy JITing!