Full Throttle Database: PostgreSQL 9.2

Speaker:Josh Berkus
Date:2013-02-22

New Features

  • Latest: 9.2.2
  • Read Scalability (350,000+ QPS?!)
  • Write Performance
  • Parallel Bulk Load - FASTEST POSTGRES EVER
    • 10-15 paralell streams (depending on CPU)
    • ~3x load speed
  • Index-Only Scans
    • Problem: “count(*)” is slow in PG
    • Looks at “Visibility Map”, skip table lookup
    • Default: sequential scan, index-only: ~50% faster FOR FREE
  • Cascading Replication
    • Traditional replication puts heavy load on master’s network interface
    • You may now replicate from a replica (“replica master”)
    • p2p replication!
    • It’s easy:
      • % pg_basebackup -h replica-master -P -x -D .
      • primary_conninfo=
      • from replica-master: “select client_addr from pg_stat_replication;”
      • from master-master: “select client_addr from pg_stat_replication;”
  • Replication Improvements
    • recv vs. write modes for synchronous replication
    • satandby-only backup
  • JSON
    • JSON data type: STORE JSON TOO!!
    • Array_to_json, row_to_json functions
    • Get query results as JSON!
    • “select row_to_json(books.*) from books;”
    • Cut out the middle-man transcoding between JSON!
  • PL/v8 - Pluggable JavaScript engine
    • Create PL plugins using... JS
    • Create indexes based on JS functions!
    • “create index bibrec_author on json_val(‘author’)...”
    • PL/coffee!
  • Indexing
    • 7.1: GiST - Generalized Search Tree (ranges, boxes)
    • 8.1: GIN - Generalized Inverted Index (default for full-text search, arrays)
    • 9.1: KNN - K-Neartest Neighbor (promixity, GIS, text similarity/trigrams)
    • 9.2: Space-GiST!
      • “Space-Partitioning Trees”
      • Faster to read/update than GiST!
      • “create index pt_gist_idx on geo using gist(point);”
      • “create index pt_gist_idx on geo using spgist(point);”
  • Range Types
    • Temporal range: [2012-04-10, 2012-04-12]
      • Types: tstzrange, timestamptz
      • “select * from copy_hostory where period @> timestamptz ‘2010-05-01’;”
    • Alpha index: [Abbe, Babel]
    • Linear distance: [375.453, 374.441]
  • DDL pit stop
    • ALTER IF EXISTS
    • DROP INDEX CONCURRENTLY
    • NOT VALID CHECK constraints
  • Instrumentation
    • Get good knowledge of what your db is doing and how it do
    • more autovacuum logging
    • “pg_stat_statements” extension (must be installed) - Real-time stats
    • “select * from pg_stat_statements order by total_time desc;”
      • Not as rich as parsing query logs, but instantly available!
  • Better EXPLAIN
    • report filtered-out rows
    • no-TIMING option
    • “explain (analyze on, buffers on) select * from pegbench_accounts;”
  • Other - lower CPU wakeups (power saving) - pg_hba.conf improvements (readability mostly) - XML “improvements”