1 / 43

PostgreSQL Query Plans

CSI 3130 - 2009 - Lab 5. PostgreSQL Query Plans. Daniel Antwi www.site.uottawa.ca/~dantw005/csi3130. Outline. Review: building PostgreSQL Query Plans. Building PostgreSQL. Download the source code postgresql-8.4.1.tar.gz http://www.postgresql.org/ftp/source/v8.4.1/ Decompress

Télécharger la présentation

PostgreSQL Query Plans

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. CSI 3130 - 2009 - Lab 5 PostgreSQL Query Plans Daniel Antwi www.site.uottawa.ca/~dantw005/csi3130

  2. Outline • Review: building PostgreSQL • Query Plans

  3. Building PostgreSQL • Download the source code • postgresql-8.4.1.tar.gz • http://www.postgresql.org/ftp/source/v8.4.1/ • Decompress • Configure • Make install • Initialize a data directory • Start the sever and clients

  4. Commands $ tar xvf postgresql-8.4.1.tar.gz $ pwd $ cd postgresql-8.4.1 $ ./configure --prefix=/home/rwchen/pginstall --enable-debug --enable-cassert --enable-depend $ make install $ cd ../pginstall/bin $ ./initdb -D../data $ ./postgres -D ../data -p5555(Server) $ ./psql postgres -p 5555(Client)

  5. Query Plans • A query plan is a set of steps used to access or modify information in a DBMS. • Given a query, the query optimizer will evaluate different query plans and choose the best one (it considers). • Indexes • Join methods

  6. Query Plan Choice • Scan • Sequential scan • Index scan (btree, hash ...) • Join • Merge join • Index jon • Nested loop join

  7. Example: test.sql • We will try different queries and plans over the example • Two tables: • cust(cid, name, bno); • branch (bno, name, region); create table cust (cid int primary key, name char(40), bno int); create sequence seq_cust; create index cust_bno on cust(bno); create table branch (bno int primary key, name char(40), region int);

  8. Intialize the Database • Insert tuples: insert into cust values(nextval('seq_cust'), 'aaaaaaaaaaaaaaaaa', (random()*50)::int); insert into cust select nextval('seq_cust'), name, (random()*50)::int from cust; • Analyse • Refresh statistics information about the tables • analyse cust;

  9. Explain Query Plans - Scan • Explain will generate the best plan for a query, but not execute it postgres=# explain select * from cust where cid = 102; QUERY PLAN ----------------------------------------------------------------------- Index Scan using cust_pkey on cust (cost=0.00..8.29 rows=1 width=49) Index Cond: (cid = 102) (2 rows) postgres=# explain select * from cust where cid < 102; QUERY PLAN ------------------------------------------------------------------------ Index Scan using cust_pkey on cust (cost=0.00..8.73 rows=26 width=49) Index Cond: (cid < 102) (2 rows) postgres=# explain select * from cust where cid > 102; QUERY PLAN ------------------------------------------------------------- Seq Scan on cust (cost=0.00..5847.80 rows=262118 width=49) Filter: (cid > 102) (2 rows)

  10. Planner Configurations • In <postgresql.conf> (under the data directory) #------------------------------------------------------------------------------ # QUERY TUNING #------------------------------------------------------------------------------ # - Planner Method Configuration - #enable_bitmapscan = on #enable_hashagg = on #enable_hashjoin = on #enable_indexscan = on #enable_mergejoin = on #enable_nestloop = on #enable_seqscan = on #enable_sort = on #enable_tidscan = on

  11. Explain Query Plans - Scan • Try the following set enable_indexscan = off; explain select * from cust where cid = 102; explain select * from cust where cid < 102; explain select * from cust where cid > 102;

  12. Explain Query Plans - Join • Join cust and branch on bno, and sort the results on branch.region postgres=# explain select * from cust, branch where cust.bno = branch.bno order by branch.region; QUERY PLAN ------------------------------------------------------------------------- Sort (cost=60170.76..60826.12 rows=262144 width=98) Sort Key: branch.region -> Hash Join (cost=4.88..8801.80 rows=262144 width=98) Hash Cond: (cust.bno = branch.bno) -> Seq Scan on cust (cost=0.00..5192.44 rows=262144 width=49) -> Hash (cost=3.28..3.28 rows=128 width=49) -> Seq Scan on branch (cost=0.00..3.28 rows=128 width=49) (7 rows) • A hash-join plan is generated by default.

  13. postgres=# set enable_hashjoin = off; SET postgres=# explain select * from cust, branch where cust.bno = branch.bno order by branch.region; QUERY PLAN ---------------------------------------------------------------------- Sort (cost=71170.19..71825.55 rows=262144 width=98) Sort Key: branch.region -> Merge Join (cost=326.38..19801.23 rows=262144 width=98) Merge Cond: (branch.bno = cust.bno) -> Index Scan using branch_pkey on branch (cost=0.00..15.17 rows=128 width=49) -> Index Scan using cust_bno on cust (cost=0.00..16518.38 rows=262144 width=49) (6 rows) • Hash-join disabled • A Merge join plan based on index scan

  14. postgres=# set enable_mergejoin = off; SET postgres=# explain select * from cust, branch where cust.bno = branch.bno order by branch.region; QUERY PLAN ------------------------------------------------------------------------------ Sort (cost=83740.80..84396.16 rows=262144 width=98) Sort Key: branch.region -> Nested Loop (cost=0.00..32371.84 rows=262144 width=98) -> Seq Scan on branch (cost=0.00..3.28 rows=128 width=49) -> Index Scan using cust_bno on cust (cost=0.00..188.63 rows=5140 width=49) Index Cond: (cust.bno = branch.bno) (6 rows) • Merge join and Hash-join disabled • A nested Loop join plan with • Index scan on cust • Sequential scan on branch

  15. postgres=# set enable_indexscan = off; SET postgres=# explain select * from cust, branch where cust.bno = branch.bno order by branch.region; QUERY PLAN ------------------------------------------------------------------------------ Sort (cost=83644.48..84299.84 rows=262144 width=98) Sort Key: branch.region -> Nested Loop (cost=58.17..32275.52 rows=262144 width=98) -> Seq Scan on branch (cost=0.00..3.28 rows=128 width=49) -> Bitmap Heap Scan on cust (cost=58.17..187.88 rows=5140 width=49) Recheck Cond: (cust.bno = branch.bno) -> Bitmap Index Scan on cust_bno (cost=0.00..56.89 rows=5140 width=0) Index Cond: (cust.bno = branch.bno) (8 rows) • Merge join, hash-join, and index-scan disabled • A nested Loop join plan with • Bitmap index scan on cust • Bitmap sequential scan on branch

  16. postgres=# set enable_bitmapscan = off; SET postgres=# set enable_mergejoin = on; SET postgres=# explain select * from cust, branch where cust.bno = branch.bno order by branch.region; QUERY PLAN ------------------------------------------------------------------------------------- Sort (cost=101118.53..101773.89 rows=262144 width=98) Sort Key: branch.region -> Merge Join (cost=45842.07..49749.57 rows=262144 width=98) Merge Cond: (branch.bno = cust.bno) -> Sort (cost=7.76..8.08 rows=128 width=49) Sort Key: branch.bno -> Seq Scan on branch (cost=0.00..3.28 rows=128 width=49) -> Materialize (cost=45809.40..49086.20 rows=262144 width=49) -> Sort (cost=45809.40..46464.76 rows=262144 width=49) Sort Key: cust.bno -> Seq Scan on cust (cost=0.00..5192.44 rows=262144 width=49) (11 rows) • Disable bitmapscan (also indexscan and hash-join) and enable mergejoin • A sort merge-join plan is generated

  17. Compare Query Plans • We have 5 plans for the same query • select * from cust, branch where cust.bno = branch.bno order by branch.region; • The costs are estimated. But they are not actually costs. • You can try other combinations.

  18. Real Execution • Use EXPLAIN ANALYSE • Or, setup timing from psql: • \timing

  19. Self-Study • PostgreSQL Document Chap. 14 • http://www.postgresql.org/docs/8.4/interactive/using-explain.html

  20. CSI 3130 – 2012 – Lab 6 Query Planner Daniel Antwi http://www.site.uottawa.ca/~dantw005/csi3130/

  21. Outline • Query Planner: The Big Picture • Modify Source Codes

  22. PostgreSQL multiuser database server • postmaster is the PostgreSQL multiuser database server. • In order for a client application to access a database it connects to a running postmaster. • The postmaster then starts a separate server process postgres to handle the connection. • The postmaster also manages the communication among server processes.

  23. Planner/Optimizer • The task of the planner/optimizer is to create an optimal execution plan. • Scan methods • seqscan, index scan ... • Join methods • nested loop join, merge join, hash join • Join order • for more than two relations

  24. Modify Planner Source Codes • We will modify PostgreSQL source codes • query planner • Under: • src/backend/optimizer • src/backend/optimizer/plan • src/backend/optimizer/path

  25. Modify the Code: Example • Add the debug maro • Edit "src/backend/optimizer/plan/planner.c“ and “src/backend/optimizer/path/allpaths.c” • Add “#define OPTIMIZER_DEBUG” • Edit “/src/backend/optimizer/util/pathnode.c” • After line: 405, add one line • pathnode->total_cost=0; • Make install again • Start the server and try • explain select * from cust where cid=100;

  26. Debug using GDB • Start GDB • gdb ./postgres • Set breakpoints • break pathnode.c:402 • Start the server in single-user mode • start --single -D../data postgres • continue • Execute a SQL • explain select * from cust where cid=100; • (break at the breakpoint in pathnode.c) • Track • backtrace (see the calling stack) • step 1 (track step by step) • continue

  27. Related Files • src/backend/optimizer/plan/planmain.c • src/backend/optimizer/path/allpaths.c • set_plain_rel_pathlist • src/backend/optimizer/path/costsize.c • src/backend/optimizer/path/pathkeys.c • src/backend/optimizer/plan/createplan.c • src/backend/optimizer/README • Optimizer Data Structure • PathKeys

  28. Related Source Codes • src/backend/optimizer/plan/planmain.c • Line 85: query_planner() • Generate a plan • src/backend/optimizer/plan/allpaths.c • Line 86: make_one_relation() • Find all paths to generate a result relation • Line 166: set_rel_pathlist() • Line 214: set_plain_rel_pathlist() • Build paths for a base relation • /src/backend/optimizer/util/pathnode.c • Line 397: create_seqscan_path()

  29. Get Details of Plans • In PSQL use EXPLAIN VERBOSE • From server: • use “--log-error-verbosity=verbose -d5” • Enable OPTIMIZER_DEBUG • Add the following line at the beginning of "src/backend/optimizer/plan/planner.c“ and “src/backend/optimizer/path/allpaths.c” • #define OPTIMIZER_DEBUG • And then make install again • Try the queries from the last lab • example

  30. List of Commands - Installation cd postgresql-9.0.2 ./configure --enable-debug --enable-cassert --enable-depend --prefix=/home/rwchen/pginstall • Make • Build PostgreSQL • $gmake • Wait for this message: • All of PostgreSQL successfully made. Ready to install. • Install • Install PostgreSQL to $HOME/pginsalldirectory • $make install • Wait for this message: • PostgreSQLinstallation complete. Change to your own directory (use pwd to get current path)

  31. List of Commands - GDB cd ../pginstall/bin ./initdb –D../data ./postgres –D../data –p 5678 ./psql postgres –p 5678 ./postgres --single –D../data postgres (use Ctrl+D to exit) Change to a random port number to avoid conflicts gdb ./postgres break pathnode.c:402 start --single -D../data postgres continue backtrace step 1 Single-user mode

  32. GDB reference card • http://users.ece.utexas.edu/~adnan/gdb-refcard.pdf • vi reference card • http://www.digilife.be/quickreferences/QRC/Vi%20Reference%20Card.pdf

  33. Reference • Chap. 43.5. Planner/Optimizer • http://www.postgresql.org/docs/8.4/interactive/planner-optimizer.html • Chap 17.3. Starting the Database Server • http://www.postgresql.org/docs/8.4/interactive/server-start.html • PostgreSQL Internals Presentation • http://www.postgresql.org/files/developer/internalpics.pdf

More Related