© 2016 NetCracker Technology Corporation Confidential
PostgreSQL and JDBC: striving for top performance
Vladimir Sitnikov, PgConf 2016
About me
• Vladimir Sitnikov, @VladimirSitnikv
• Performance architect at NetCracker
• 10 years of experience with Java/SQL
• PgJDBC committer
Explain (analyze, buffers) PostgreSQL and JDBC
• Data fetch
• Data upload
• Performance
• Pitfalls
Intro
Fetching a single row via a primary key lookup takes 20 ms. The database is on localhost and fully cached.
A. Just fine
B. It should be 1 sec
C. Kidding? Aim is 1 ms!
D. 100 us
Lots of small queries are a problem
If a single query takes 10 ms, then 100 of them take a whole second *
* Your Captain
PostgreSQL frontend-backend protocol
• Simple query
  • 'Q' + length + query_text
• Extended query
  • Parse, Bind, Execute commands
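As a rough illustration of the simple query layout above, the sketch below assembles a 'Q' message by hand. It assumes a UTF-8 client encoding, and the names SimpleQueryMessage and encode are made up for this example; they are not PgJDBC classes. In the protocol, the length field counts itself and the NUL-terminated query text, but not the leading type byte.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

/** Illustrative only: builds the bytes of a simple query ('Q') message. */
final class SimpleQueryMessage {
    static byte[] encode(String queryText) {
        // Assumption: the session uses UTF-8 as client encoding.
        byte[] sql = queryText.getBytes(StandardCharsets.UTF_8);
        // The length covers the 4-byte length field, the text, and the trailing NUL.
        int length = 4 + sql.length + 1;
        ByteBuffer buf = ByteBuffer.allocate(1 + length); // big-endian by default
        buf.put((byte) 'Q');
        buf.putInt(length);
        buf.put(sql);
        buf.put((byte) 0);
        return buf.array();
    }
}
```

For the text select 1 this produces 14 bytes: the type byte, a length of 13, eight bytes of query text, and the terminating zero.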
PostgreSQL frontend-backend protocol
Super extended query
https://github.com/pgjdbc/pgjdbc/pull/478
backend protocol wanted features
PostgreSQL frontend-backend protocol
Simple query
• Works well for one-time queries
• Does not support binary transfer
PostgreSQL frontend-backend protocol
Extended query
• Eliminates planning time
• Supports binary transfer
PreparedStatement
Connection con = ...;
PreparedStatement ps = con.prepareStatement("SELECT...");
...
ps.close();
Smoker’s approach to PostgreSQL
PARSE S_1 as ...; // con.prepareStatement
BIND/EXEC
DEALLOCATE // ps.close()
PARSE S_2 as ...;
BIND/EXEC
DEALLOCATE // ps.close()
Healthy approach to PostgreSQL
PARSE S_1 as ...; // once in a lifetime
BIND/EXEC // REST call
BIND/EXEC
BIND/EXEC // one more REST call
BIND/EXEC
BIND/EXEC
...
DEALLOCATE // “never” is the best
Happiness closes no statements
Conclusion №1: to get top performance, you should not close statements
ps = con.prepareStatement(...);
ps.executeQuery();
ps = con.prepareStatement(...);
ps.executeQuery();
...
Unclosed statements in practice
@Benchmark
public Statement leakStatement() {
  return con.createStatement();
}
pgjdbc < 9.4.1202, -Xmx128m, OracleJDK 1.8u40
# Warmup Iteration 1: 1147,070 ns/op
# Warmup Iteration 2: 12101,537 ns/op
# Warmup Iteration 3: 90825,971 ns/op
# Warmup Iteration 4: <failure>
java.lang.OutOfMemoryError: GC overhead limit exceeded
Unclosed statements in practice
@Benchmark
public Statement leakStatement() {
  return con.createStatement();
}
pgjdbc >= 9.4.1202, -Xmx128m, OracleJDK 1.8u40
# Warmup Iteration 1: 30 ns/op
# Warmup Iteration 2: 27 ns/op
# Warmup Iteration 3: 30 ns/op
...
Statements in practice
• In practice, applications always close their statements
• PostgreSQL has no shared query cache
• Nobody wants to spend excessive time on planning
Server-prepared statements
What can we do about it?
• Wrap all the queries in PL/pgSQL
  • It helps, but we had 100500 SQL queries
• Teach JDBC to cache queries
Query cache in PgJDBC
• The query cache was implemented in 9.4.1202 (2015-08-27), see https://github.com/pgjdbc/pgjdbc/pull/319
• It is transparent to the application
• We did not bother considering PL/pgSQL again
• Server-prepare is activated after 5 executions (prepareThreshold)
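To make the idea concrete, here is a toy model of a per-connection cache keyed by SQL text with a prepare threshold. It is only a sketch and not PgJDBC's actual implementation: the class ToyStatementCache, the LRU eviction policy, and the exact point at which the threshold trips are assumptions made for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Toy model of a statement cache; NOT PgJDBC's real code. */
final class ToyStatementCache {
    /** Hypothetical cache entry: a server-side statement name plus a use counter. */
    static final class Cached {
        final String serverName;
        int executions;
        Cached(String serverName) { this.serverName = serverName; }
    }

    private final int prepareThreshold;
    private int nextId = 1;
    private final Map<String, Cached> bySql;

    ToyStatementCache(int maxEntries, int prepareThreshold) {
        this.prepareThreshold = prepareThreshold;
        // An access-ordered LinkedHashMap evicts the least recently used SQL text.
        this.bySql = new LinkedHashMap<String, Cached>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, Cached> eldest) {
                return size() > maxEntries;
            }
        };
    }

    /** Records one execution; returns true if it would run as a server-prepared statement. */
    boolean execute(String sql) {
        Cached c = bySql.computeIfAbsent(sql, k -> new Cached("S_" + nextId++));
        c.executions++;
        // Assumption: the statement counts as server-prepared from the threshold-th execution on.
        return c.executions >= prepareThreshold;
    }

    int size() { return bySql.size(); }
}
```

Because the key is the SQL text, a statement that the application closes and prepares again still finds its counter, which is what makes the cache transparent. An evicted entry starts counting from zero again.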
Where are the numbers?
• Of course, planning time depends on the query complexity
• We observed 20+ ms planning time for OLTP queries: 10 KiB query, 170 lines of EXPLAIN output
• Result: ~0 ms
Overheads
Generated queries are bad
• If a query is generated
  • It results in a brand new java.lang.String object
  • Thus you have to recompute its hashCode
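A small sketch of the point above: java.lang.String caches its hash code per instance, so a constant that is reused across calls is hashed once, while every freshly built string has to be hashed again before a cache lookup. The class and method names here are illustrative only.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative only: constant SQL versus SQL assembled at run time. */
final class GeneratedSqlDemo {
    // One String instance for the whole application; its hash is computed once and cached.
    static final String CONSTANT_SQL = "select * from t where id = ?";

    /** Builds an equal query text, but as a brand new String object on every call. */
    static String generatedSql(String table) {
        return new StringBuilder("select * from ")
                .append(table)
                .append(" where id = ?")
                .toString();
    }

    /** Looks up a query in a cache keyed by SQL text; hashing a fresh key costs O(length). */
    static boolean isCached(Map<String, Integer> cache, String sql) {
        return cache.containsKey(sql);
    }

    static Map<String, Integer> cacheWithConstant() {
        Map<String, Integer> cache = new HashMap<>();
        cache.put(CONSTANT_SQL, 1);
        return cache;
    }
}
```

Both forms hit the same cache entry; the generated one just pays for hashing the text on each lookup, which adds up for multi-kilobyte queries.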
Parameter types
If the type of a bind value changes, the server-prepared statement has to be recreated
ps.setInt(1, 42);
...
ps.setNull(1, Types.VARCHAR);
This leads to DEALLOCATE and a new PREPARE
Keep data type the same
Conclusion №1
• Even NULL values should be properly typed
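One way to follow this advice is a small helper that binds a nullable integer with the same SQL type in both branches. This is a minimal sketch; the helper name setNullableInt is hypothetical and not part of JDBC or PgJDBC.

```java
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.sql.Types;

/** Illustrative helper: keeps the bind type stable whether or not the value is null. */
final class TypedNulls {
    static void setNullableInt(PreparedStatement ps, int index, Integer value) throws SQLException {
        if (value == null) {
            // Same SQL type as setInt, so the bind type does not change.
            ps.setNull(index, Types.INTEGER);
        } else {
            ps.setInt(index, value);
        }
    }
}
```

Calling setNullableInt(ps, 1, 42) and later setNullableInt(ps, 1, null) binds an integer both times, unlike the setInt / setNull(1, Types.VARCHAR) pair shown earlier.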
Unexpected degradation
With prepared statements, the response time gets 5,000 times slower. How is that possible?
A. Bug
B. Feature
C. Feature
D. Bug
Unexpected degradation
https://gist.github.com/vlsi -> 01_plan_flipper.sql
select *
  from plan_flipper -- <- table
 where skewed = 0 -- 1M rows
   and non_skewed = 42 -- 20 rows
Unexpected degradation
https://gist.github.com/vlsi -> 01_plan_flipper.sql
0.1 ms 1st execution
0.05 ms 2nd execution
0.05 ms 3rd execution
0.05 ms 4th execution
0.05 ms 5th execution
250 ms 6th execution
Unexpected degradation
• Who is to blame?
  • PostgreSQL may switch to a generic plan after 5 executions of a server-prepared statement
• What can we do about it?
  • Add +0, OFFSET 0, and so on
  • Pay attention to plan validation
  • Discuss the phenomenon on pgsql-hackers
Unexpected degradation
https://gist.github.com/vlsi -> 01_plan_flipper.sql
We just use +0 to forbid the index on a bad column
select *
  from plan_flipper
 where skewed+0 = 0 -- ~ /*+no_index*/
   and non_skewed = 42
Explain explain explain explain
The rule of 6 explains:
prepare x(numeric) as select ...;
explain analyze execute x(42); -- 1ms
explain analyze execute x(42); -- 1ms
explain analyze execute x(42); -- 1ms
explain analyze execute x(42); -- 1ms
explain analyze execute x(42); -- 1ms
explain analyze execute x(42); -- 10 sec
Bugs everywhere
Decision problem
There’s a schema A with table X, and a schema B with table X. What is the result of select * from X?
A. A.X
B. B.X
C. Error
D. All of the above
Search_path
There’s a schema A with table X, and a schema B with table X. What is the result of select * from X?
• search_path determines the schema used
• Server-prepared statements are not prepared for search_path changes: crazy things might happen
Search_path can go wrong
• 9.1 will just use old OIDs and execute the “previous” query
• 9.2-9.5 might fail with the “cached plan must not change result type” error
Search_path
What can we do about it?
• Keep search_path constant
• Discuss it on pgsql-hackers
• Set search_path + server-prepared statements = cached plan must not change result type
• PL/pgSQL has exactly the same issue
To fetch or not to fetch
You need to fetch 1M rows of 1 KiB each with -Xmx128m
while (resultSet.next())
  resultSet.getString(1);
A. No problem
B. OutOfMemory
C. Must use LIMIT/OFFSET
D. autoCommit(false)
To fetch or not to fetch
• PgJDBC fetches all rows by default
• To fetch in batches, you need Statement.setFetchSize and connection.setAutoCommit(false)
• The default value is configurable via defaultRowFetchSize (9.4.1202+)
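The two settings named above can be combined as in the following sketch. It uses only standard JDBC calls; the class BatchedFetch, the method countRows, and the choice to restore the previous auto-commit mode are assumptions made for this example.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

/** Illustrative only: reads a large result set in chunks instead of all at once. */
final class BatchedFetch {
    /** Reads column 1 of every row, asking for fetchSize rows per round trip; returns the row count. */
    static long countRows(Connection con, String sql, int fetchSize) throws SQLException {
        boolean oldAutoCommit = con.getAutoCommit();
        con.setAutoCommit(false); // batched fetching needs an open transaction
        try (PreparedStatement ps = con.prepareStatement(sql)) {
            ps.setFetchSize(fetchSize); // without this, all rows are loaded into memory
            long rows = 0;
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    rs.getString(1);
                    rows++;
                }
            }
            con.commit();
            return rows;
        } catch (SQLException e) {
            con.rollback();
            throw e;
        } finally {
            con.setAutoCommit(oldAutoCommit);
        }
    }
}
```

A call such as BatchedFetch.countRows(con, "select ...", 100) keeps at most about one batch of rows in the driver at a time, which is what makes the 1M-row case fit in a small heap.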
fetchSize vs fetch time
Time to fetch 2000 rows of select int4, int4, int4, int4 (ms, lower is faster):
fetchSize  10    50    100   1000  2000
time, ms   6.48  2.28  1.76  1.04  0.97
FetchSize is good for stability
Conclusion №2:
• For stability & performance reasons set defaultRowFetchSize >= 100
Data upload
For data uploads, use
• INSERT() VALUES()
• INSERT() SELECT ?, ?, ?
• INSERT() VALUES() executeBatch
• INSERT() VALUES(), (), () executeBatch
• COPY
Healthy batch INSERT
PARSE S_1 as ...;
BIND/EXEC
BIND/EXEC
BIND/EXEC
BIND/EXEC
BIND/EXEC
...
DEALLOCATE
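On the Java side, the sequence above corresponds roughly to one PreparedStatement with addBatch per row and a single executeBatch. The sketch below uses the batch_perf_test(a, b, c) table from the benchmark slides with int4, varchar, int4 columns; the Row holder and the BatchInsert class are hypothetical.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

/** Illustrative only: one statement, many rows, one executeBatch. */
final class BatchInsert {
    /** Hypothetical row holder matching batch_perf_test(a int4, b varchar, c int4). */
    static final class Row {
        final int a;
        final String b;
        final int c;
        Row(int a, String b, int c) { this.a = a; this.b = b; this.c = c; }
    }

    static void insertAll(Connection con, List<Row> rows) throws SQLException {
        String sql = "insert into batch_perf_test(a, b, c) values(?, ?, ?)";
        try (PreparedStatement ps = con.prepareStatement(sql)) {
            for (Row r : rows) {
                ps.setInt(1, r.a);
                ps.setString(2, r.b);
                ps.setInt(3, r.c);
                ps.addBatch(); // queue the row on the client side
            }
            ps.executeBatch(); // send the queued rows to the server
        }
    }
}
```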
TCP strikes back
JDBC is busy sending queries, so it has not started fetching responses yet
DB cannot fetch more queries since it is busy sending responses
Batch INSERT in real life
PARSE S_1 as ...;
BIND/EXEC
BIND/EXEC
SYNC // flush & wait for the response
BIND/EXEC
BIND/EXEC
SYNC // flush & wait for the response
...
TCP deadlock avoidance
• PgJDBC adds SYNC to your nice batch operations
• The more SYNCs, the slower it performs
Horror stories
A single line patch makes insert batch 10 times faster:
https://github.com/pgjdbc/pgjdbc/pull/380
- static int QUERY_FORCE_DESCRIBE_PORTAL = 128;
+ static int QUERY_FORCE_DESCRIBE_PORTAL = 512;
...
// 128 has already been used
static int QUERY_DISALLOW_BATCHING = 128;
Trust but always measure
• Java 1.8u40+
• Core i7 2.6 GHz
• Java Microbenchmark Harness (JMH)
• PostgreSQL 9.5
Queries under test: INSERT
pgjdbc/ubenchmark/InsertBatch.java
insert into batch_perf_test(a, b, c) values(?, ?, ?)
Queries under test: INSERT
pgjdbc/ubenchmark/InsertBatch.java
insert into batch_perf_test(a, b, c) values (?, ?, ?), (?, ?, ?), (?, ?, ?), (?, ?, ?), (?, ?, ?), (?, ?, ?), (?, ?, ?), (?, ?, ?), (?, ?, ?), ...;
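The multi-row statement text can be generated for any row count. The sketch below does this for the benchmark table; the class MultiValuesSql and its build method are names invented for the example.

```java
/** Illustrative only: builds "insert ... values (?, ?, ?), (?, ?, ?), ..." for N rows. */
final class MultiValuesSql {
    static String build(int rows) {
        if (rows < 1) {
            throw new IllegalArgumentException("rows must be positive");
        }
        StringBuilder sb = new StringBuilder("insert into batch_perf_test(a, b, c) values ");
        for (int i = 0; i < rows; i++) {
            if (i > 0) {
                sb.append(", ");
            }
            sb.append("(?, ?, ?)"); // one placeholder group per inserted row
        }
        return sb.toString();
    }
}
```

Each distinct row count yields a distinct SQL text, so a fixed set of group sizes keeps the number of statements to prepare and cache small.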
Queries under test: COPY
pgjdbc/ubenchmark/InsertBatch.java
COPY batch_perf_test FROM STDIN
1 s1 1
2 s2 2
3 s3 3
...
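The payload after COPY ... FROM STDIN uses PostgreSQL's default text format: columns separated by tabs, one row per line, \N for NULL, and backslash escapes for special characters. A minimal sketch of an encoder follows; the class CopyText is hypothetical and covers only these common cases. The resulting text could then be streamed through a COPY API such as PgJDBC's CopyManager.

```java
/** Illustrative only: encodes rows for COPY ... FROM STDIN in the default text format. */
final class CopyText {
    /** Escapes one column value; null becomes the NULL marker \N. */
    static String escape(String value) {
        if (value == null) {
            return "\\N";
        }
        StringBuilder sb = new StringBuilder(value.length());
        for (char c : value.toCharArray()) {
            switch (c) {
                case '\\': sb.append("\\\\"); break;
                case '\t': sb.append("\\t"); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                default: sb.append(c);
            }
        }
        return sb.toString();
    }

    /** Joins escaped columns with tabs and terminates the row with a newline. */
    static String row(String... columns) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < columns.length; i++) {
            if (i > 0) {
                sb.append('\t');
            }
            sb.append(escape(columns[i]));
        }
        return sb.append('\n').toString();
    }
}
```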
Queries under test: hand-made structs
pgjdbc/ubenchmark/InsertBatch.java
insert into batch_perf_test
select * from unnest('{"(1,s1,1)","(2,s2,2)","(3,s3,3)"}'::batch_perf_test[])
You’d better use batch, your C.O.
[Chart: time in ms (lower is faster) vs the number of inserted rows (16, 128, 1024) for Insert, Batch, Struct, and Copy; columns int4, varchar, int4; labeled values: 216, 128]
COPY is good
[Chart: time in ms (lower is faster, axis 0 to 2.5) vs the number of inserted rows (16, 128, 1024) for Batch, Struct, and Copy; columns int4, varchar, int4]
COPY is bad for small batches
[Chart: time in ms (lower is faster, axis 0 to 25) to insert 1024 rows vs batch size in rows (1, 4, 8, 16, 128) for Batch, Struct, and Copy]
Final thoughts
• PreparedStatement is our hero
• Remember to EXPLAIN ANALYZE at least six times, a blue moon is a plus
• Don’t forget +0 and OFFSET 0
Questions?
Vladimir Sitnikov, PgConf 2016