This is an introduction to PostgreSQL that provides a brief overview of PostgreSQL's architecture, features and ecosystem. It was delivered at NYLUG on Nov 24, 2014. http://www.meetup.com/nylug-meetings/events/180533472/
Introduction to PostgreSQL
November, 2014
Creative Commons Attribution License
Who Are We?
● Jim Mlodgenski
– [email protected]
– @jim_mlodgenski
● Co-organizer of
– NYCPUG - www.nycpug.org
● Director, PgUS
– www.postgresql.us
● CTO, OpenSCG
– www.openscg.com
● Jonathan S. Katz
– [email protected]
– @jkatz05
● Co-organizer of
– NYCPUG - www.nycpug.org
● Director, PgUS
– www.postgresql.us
● CTO, VenueBook
– www.venuebook.com
History
● The world’s most advanced open source database
● Designed for extensibility and customization
● ANSI/ISO compliant SQL support
● Actively developed for almost 30 years
– University POSTGRES (1986-1993)
– Postgres95 (1994-1995)
– PostgreSQL (1996-2014)
Timeline
“Over the past few years, PostgreSQL has become the preferred open source relational database for many enterprise developers and start-ups, powering leading geospatial and mobile applications.” – Jeff Barr, Chief Evangelist, Amazon Web Services
Why PostgreSQL
Affordability
Technology
Security
Flexibility
Stability
Extensibility
Reliability
Predictability
Community
Auditability
Technology
● Full Featured Database
– Mature Server Side Programming Functionality
– Hot Standby High Availability
– Online Backups
– Point In Time Recovery
– Table Partitioning
– Spatial Functionality
– Full Text Search
Security
● Object Level Privileges assigned to “Roles & User”
● Row Level Security
● Many Authentication mechanisms
– Kerberos
– LDAP
– PAM
– GSSAPI
● Native SSL Support
● Data Level Encryption (AES, 3DES, etc)
● Ability to utilize 3rd party Key Stores in a full PKI infrastructure
Flexibility
● No Vendor Lock-in
– Compliant with the ANSI SQL standard
– Runs on all major platforms using all major languages and middleware
● “BSD-like” license – PostgreSQL License
– Allows businesses to retain the option of commercializing the final product with minimal legal issues
– No fear of “Open Source Viral Infection”
Predictability
● Predictable release cycles
– The average span between major releases over the last 10 years is 13 months
● Quick turnaround on patches
– The average span between minor releases over the last 5 years is 3 months
Version Release Date
7.3 Nov-02
7.4 Nov-03
8.0 Jan-05
8.1 Nov-05
8.2 Dec-06
8.3 Feb-08
8.4 Jul-09
9.0 Aug-10
9.1 Sep-11
9.2 Sep-12
9.3 Sep-13
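The 13-month figure can be checked against the release table. A rough sketch, assuming each release fell on the first day of its listed month (the table gives only month and year):

```sql
-- Assumption: every release date is treated as the 1st of its month.
-- 7.3 (Nov-02) through 9.3 (Sep-13) is 11 releases, so 10 intervals.
SELECT (extract(year  FROM age(date '2013-09-01', date '2002-11-01')) * 12
      + extract(month FROM age(date '2013-09-01', date '2002-11-01'))) / 10
       AS avg_months_between_majors;
-- age() gives 10 years 10 months = 130 months; 130 / 10 = 13
```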
Community
● Strong Open Source Community
● Independent & Thriving Development Community
– 10+ committers and ~200 reviewers
– 1,500 contributors and 10,000+ members
● Millions of downloads per year
● PostgreSQL is a meritocracy
– Influence comes through the merits (usually technical) of the contributor
Who's Using PostgreSQL
PostgreSQL Success Stories
“…With PostgreSQL we have been successful in growing the databases as the company has grown, both in number of users and in the complexity of services we offer…”
Hannu Krosing – Database Architect, Skype Technologies.
“We manage multiple terabytes of data in more than 50 unique production PostgreSQL databases.”
Cisco uses PostgreSQL as the embedded database in all its “Carrier Sensitive Routing” (CSR) products to store carrier details, rules, contacts, routes – to perform call routing.
“…Fujitsu is proud of its sponsorship of contributions to PostgreSQL and of its work with the PostgreSQL community. We are committed to helping make PostgreSQL the leading Database Management System…”
Takayuki Nakazawa – Director Database in Software Group.
Database 101
● A database stores data
● Clients ( people or applications ) input data into tables ( relations ) in the database and retrieve data from it
● Relational Database Management Systems are responsible for managing the safe storage of data
● RDBMSs are designed to store data in an A.C.I.D compliant way ( all or nothing )
– This is done via transactions
Database 101 - (ACID)
● Atomic – Store data in an 'all-or-nothing' approach
● Consistent – Give me a consistent picture of the data
● Isolated – Prevent concurrent data access from causing me woe
● Durable – When I 'COMMIT;' the data, make sure it is safe until I explicitly destroy it
Database 101 - (Transactions)
● All or nothing
● A transaction has
– A Beginning ( BEGIN; )
– Work ( multiple lines of SQL, e.g. INSERT / UPDATE / DELETE )
– An Ending ( END; ). You would expect one of two cases
● COMMIT; ( save everything )
● ROLLBACK; ( undo all changes, save nothing )
– Once the transaction has ended, it will have made either ALL of the changes between BEGIN; and COMMIT; or NONE of them ( if there is an error, for example )
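A minimal sketch of both endings. The accounts table and its values are hypothetical and serve only to illustrate the all-or-nothing behavior:

```sql
-- Hypothetical table for illustration; not part of the original slides.
CREATE TABLE accounts (
    id      integer PRIMARY KEY,
    balance numeric NOT NULL CHECK (balance >= 0)
);
INSERT INTO accounts (id, balance) VALUES (1, 100), (2, 50);

-- Case 1: both updates are saved together by COMMIT.
BEGIN;
UPDATE accounts SET balance = balance - 30 WHERE id = 1;
UPDATE accounts SET balance = balance + 30 WHERE id = 2;
COMMIT;

-- Case 2: the second update violates the CHECK constraint and raises an
-- error, so nothing from this transaction is kept, including the first update.
BEGIN;
UPDATE accounts SET balance = balance + 500 WHERE id = 2;
UPDATE accounts SET balance = balance - 500 WHERE id = 1;  -- fails: balance would go negative
ROLLBACK;

SELECT id, balance FROM accounts ORDER BY id;
-- balances are 70 and 80: only the committed transfer is visible
```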
PostgreSQL 101
● PostgreSQL meets all of the requirements to be a fully ACID-compliant, transactional database.
● PostgreSQL RDBMS serves a cluster aka an instance.
– An instance serves one ( and only one ) TCP/IP port
– Contains at least one database
– Has an associated data-directory
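These three properties can be inspected from any session connected to the instance. A small sketch using standard settings and the pg_database catalog:

```sql
SHOW port;            -- the single TCP/IP port this instance listens on
SHOW data_directory;  -- the data directory behind this instance (may require superuser)
SELECT datname FROM pg_database ORDER BY datname;  -- the databases contained in the cluster
```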
Major Features
● Full network client-server architecture
● ACID compliant
● Transactional ( uses WAL / REDO )
● Partitioning
● Tiered storage via tablespaces
● Multiversion Concurrency Control ( readers don't block writers )
● On-line maintenance operations
● Hot ( readonly ) and Warm ( quick-promote ) standby
● Log-based and trigger based replication
● SSL
● Full-text search
● Procedural languages
– PL/pgSQL plus other, custom languages
General Limitations
Limit Value
Maximum Database Size Unlimited
Maximum Table Size 32 TB
Maximum Row Size 1.6 TB
Maximum Field Size 1 GB
Maximum Rows / Table Unlimited
Maximum Columns / Table 250-1600
Maximum Indexes / Table Unlimited
Client Architecture
Server Overview
● PostgreSQL utilizes a multi-process architecture● Similar to Oracle's 'Dedicated Server' mode● Types of processes
– Primary ( postmaster )– Per-connection backend process– Utility ( maintenance processes )
Server Architecture
Process Components
Memory Components
On-Disk Components
Data Types
● Building blocks of a schema● Optimized on-disk format for a specific type of data● PostgreSQL provides:
– Wide array (no pun intended) of basic to complex data types– Functional interfaces for ease of manipulation– Ability to extend and create custom data types
Number Types
Name              Storage Size  Range
smallint          2 bytes       -32768 to +32767
integer           4 bytes       -2147483648 to +2147483647
bigint            8 bytes       -9223372036854775808 to +9223372036854775807
decimal           variable      up to 131072 digits before the decimal point; up to 16383 digits after the decimal point
numeric           variable      up to 131072 digits before the decimal point; up to 16383 digits after the decimal point
real              4 bytes       6 decimal digits precision
double precision  8 bytes       15 decimal digits precision
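A short sketch of what the ranges and precisions mean in practice. The values are chosen only to hit the boundaries listed in the table:

```sql
SELECT 32767::smallint;   -- accepted: the largest smallint
SELECT 32768::smallint;   -- ERROR: smallint out of range

-- numeric stores decimal digits exactly, so this comparison is true.
SELECT 0.1::numeric + 0.2::numeric = 0.3::numeric;

-- double precision is binary floating point, so the same comparison is false.
SELECT 0.1::double precision + 0.2::double precision = 0.3::double precision;
```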
Character Types
Name Description
varchar(n) variable-length with limit
char(n) fixed-length, blank padded
text variable unlimited length
Date/Time Types
Name                         Size      Range                                Resolution
timestamp without time zone  8 bytes   4713 BC to 294276 AD                 1 microsecond / 14 digits
timestamp with time zone     8 bytes   4713 BC to 294276 AD                 1 microsecond / 14 digits
date                         4 bytes   4713 BC to 5874897 AD                1 day
time without time zone       8 bytes   00:00:00 to 24:00:00                 1 microsecond / 14 digits
time with time zone          12 bytes  00:00:00+1459 to 24:00:00-1459       1 microsecond / 14 digits
interval                     12 bytes  -178000000 years to 178000000 years  1 microsecond / 14 digits
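A few hedged examples of how these types combine. The 18:30 time of day is made up for illustration; only the date comes from the talk:

```sql
-- Assumption: 18:30 is an invented start time.
SELECT timestamptz '2014-11-24 18:30 America/New_York' AT TIME ZONE 'UTC';
-- 2014-11-24 23:30:00 (New York is UTC-5 in late November)

SELECT date '2014-11-24' + interval '1 month';  -- 2014-12-24 00:00:00 (a timestamp)
SELECT date '2014-12-25' - date '2014-11-24';   -- 31 (an integer number of days)
```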
Specialized Types
Name         Storage Size                                  Range
boolean      1 byte                                        false to true
smallserial  2 bytes                                       1 to 32767
serial       4 bytes                                       1 to 2147483647
bigserial    8 bytes                                       1 to 9223372036854775807
bytea        1 or 4 bytes plus size of binary string       variable-length binary string
cidr         7 or 19 bytes                                 IPv4 or IPv6 networks
inet         7 or 19 bytes                                 IPv4 or IPv6 hosts or networks
macaddr      6 bytes                                       MAC addresses
uuid         16 bytes                                      Universally Unique Identifiers
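A sketch of several of these types in one table. The devices table and every value in it are hypothetical:

```sql
-- Hypothetical table for illustration.
CREATE TABLE devices (
    id      serial PRIMARY KEY,   -- integer filled from an automatically created sequence
    token   uuid NOT NULL,
    address inet,
    mac     macaddr,
    active  boolean DEFAULT true
);

INSERT INTO devices (token, address, mac)
VALUES ('a0eebc99-9c0b-4ef8-bb6d-6bb9bd380a11', '192.168.122.47', '08:00:2b:01:02:03');

-- << tests whether an address is contained within a network.
SELECT id, address << inet '192.168.122.0/24' AS in_subnet FROM devices;
-- id = 1, in_subnet = true
```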
“Schema-less” Types
Name Description
xml stores XML data and checks the input values for well-formedness
hstore stores sets of key/value pairs
json stores an exact copy of the input JSON document
jsonb stores a decomposed binary format of the input JSON document
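The difference between json and jsonb shows up when the same input is cast to each. A minimal sketch, assuming a server with jsonb available (9.4 or later):

```sql
-- json keeps the text as typed, including the duplicate key and extra spacing.
SELECT '{"b": 1,  "a": 2, "a": 3}'::json;

-- jsonb parses the document: the last duplicate wins and keys are reordered.
SELECT '{"b": 1,  "a": 2, "a": 3}'::jsonb;   -- {"a": 3, "b": 1}

-- jsonb also supports containment tests with @>.
SELECT '{"name": "postgresql", "tags": ["sql", "acid"]}'::jsonb
       @> '{"tags": ["acid"]}'::jsonb;        -- true
```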
Range Types
● Represents a range of an element type
– Integers
– Numerics
– Times
– Dates
– And more...
Range Types
CREATE TABLE travel_log (
    id serial PRIMARY KEY,
    name varchar(255),
    trip_range daterange,
    EXCLUDE USING gist (trip_range WITH &&)
);
INSERT INTO travel_log (name, trip_range) VALUES ('Chicago', daterange('2012-03-12', '2012-03-17'));
INSERT INTO travel_log (name, trip_range) VALUES ('Austin', daterange('2012-03-16', '2012-03-18'));
ERROR: conflicting key value violates exclusion constraint "travel_log_trip_range_excl"
DETAIL: Key (trip_range)=([2012-03-16,2012-03-18)) conflicts with existing key (trip_range)=([2012-03-12,2012-03-17)).
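The conflict comes from the && (overlaps) operator named in the exclusion constraint. A short sketch of how it behaves on the same dates, keeping in mind that daterange() builds a range with an inclusive lower bound and an exclusive upper bound by default:

```sql
-- The Chicago and Austin trips share March 16, so they overlap.
SELECT daterange('2012-03-12', '2012-03-17') && daterange('2012-03-16', '2012-03-18');  -- true

-- A trip starting on March 17 would not conflict: the upper bound is exclusive.
SELECT daterange('2012-03-12', '2012-03-17') && daterange('2012-03-17', '2012-03-18');  -- false

-- @> tests whether a range contains a single value.
SELECT daterange('2012-03-12', '2012-03-17') @> date '2012-03-16';  -- true
SELECT daterange('2012-03-12', '2012-03-17') @> date '2012-03-17';  -- false
```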
Indexes
● Enhances database performance
● Enforces some types of constraints
– Uniqueness
– Exclusion
Index Types
● B-Tree
● Generalized Inverted Index (GIN)
● Generalized Search Tree (GiST)
● Space-Partitioned Generalized Search Tree (SP-GiST)
Coming Soon...
● Block Range Index (BRIN)
● “VODKA”
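A sketch of how the index type is chosen at creation time. The articles table and index names are hypothetical:

```sql
-- Hypothetical table for illustration.
CREATE TABLE articles (
    id     serial PRIMARY KEY,   -- the primary key is backed by a unique B-tree index
    slug   text NOT NULL,
    body   text,
    period tsrange
);

-- B-tree is the default when no USING clause is given; UNIQUE enforces uniqueness.
CREATE UNIQUE INDEX articles_slug_idx ON articles (slug);

-- GIN over a tsvector expression, a common choice for full text search.
CREATE INDEX articles_body_fts_idx ON articles USING gin (to_tsvector('english', body));

-- GiST over a range column, which supports overlap and containment queries.
CREATE INDEX articles_period_idx ON articles USING gist (period);
```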
Procedural Languages
● Allows for user-defined functionality to be run within the database
– Used as functions or triggers
● Frequent use cases
– Enhance performance
– Increase security
– Centralize business logic
Procedural Language Types
● PL/pgSQL
● PL/Perl
● PL/TCL
● PL/Python
● More available through extensions...
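As an illustration of logic living inside the database, here is a minimal PL/pgSQL function. The name trip_days is hypothetical:

```sql
-- Hypothetical helper: number of days covered by a date range.
CREATE FUNCTION trip_days(trip daterange) RETURNS integer AS $$
BEGIN
    -- upper() and lower() return the range bounds; date minus date is an integer.
    RETURN upper(trip) - lower(trip);
END;
$$ LANGUAGE plpgsql IMMUTABLE;

SELECT trip_days(daterange('2012-03-12', '2012-03-17'));  -- 5
```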
Extensions
● Additional modules that can be plugged into PostgreSQL
● Can be used to add a ton of useful features
– Procedural Languages
– Data Types
– Administration Tools
– Foreign Data Wrappers
● Many found in contrib
● Also www.pgxn.org
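Installing an extension is a single statement per database. A sketch using hstore, assuming the contrib packages are present on the server and the role is allowed to create extensions:

```sql
-- List what the server can install and what is already installed.
SELECT name, default_version, installed_version
FROM pg_available_extensions
ORDER BY name;

-- Assumption: the contrib modules are installed on this server.
CREATE EXTENSION IF NOT EXISTS hstore;

SELECT 'a=>1, b=>2'::hstore -> 'b';  -- 2
```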
Procedural Language Extensions
● pl/Java
● pl/v8
● pl/R
● pl/Ruby
● pl/scheme
● pl/lolcode
● pl/sh
● pl/Proxy
● pl/psm
● pl/lua
● pl/php
Data Type Extensions
● Hstore
● Case Insensitive Text (citext)
● International Product Numbering Standards (ISN)
● PostGIS (geometry)
● BioPostgres
● SSN
● Email
PostGIS
● PostGIS adds Open Geospatial Consortium (OGC) compliant geometry data types and functions to PostgreSQL
● With PostGIS, PostgreSQL becomes a best-of-breed spatial and raster database
Administration Tool Extensions
● auto_explain
● pageinspect
● pg_buffercache
● pg_stat_statements
● Slony
● OmniPITR
● pg_monitoring
● pgaudit
● pg_partman
What are Foreign Data Wrappers?
● Used with SQL/MED
– New ANSI SQL:2003 extension
– Management of External Data
– Standard way of handling remote objects in SQL databases
● Wrappers used by SQL/MED to access remote data sources
● Makes external data sources look like a PostgreSQL table
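A sketch of the usual four steps with postgres_fdw, the wrapper for other PostgreSQL servers. The host, database, credentials and table names are all hypothetical:

```sql
CREATE EXTENSION IF NOT EXISTS postgres_fdw;

-- Hypothetical remote server.
CREATE SERVER reporting_server
    FOREIGN DATA WRAPPER postgres_fdw
    OPTIONS (host 'reporting.example.com', port '5432', dbname 'reports');

-- Hypothetical credentials used when the current user connects to it.
CREATE USER MAPPING FOR CURRENT_USER
    SERVER reporting_server
    OPTIONS (user 'report_reader', password 'secret');

-- Local definition that maps onto the remote table public.orders.
CREATE FOREIGN TABLE remote_orders (
    id       integer,
    customer text,
    total    numeric
)
SERVER reporting_server
OPTIONS (schema_name 'public', table_name 'orders');

-- From here on it is queried like any local table.
SELECT customer, sum(total) FROM remote_orders GROUP BY customer;
```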
FDW Extensions
● PostgreSQL
● Oracle
● MySQL
● Informix
● Firebird
● SQLite
● JDBC
● ODBC
● TDS (Sybase/SQL Server)
● S3
● WWW
● PG-Strom
● Column Store
● Delimited files
● Fixed length files
● JSON files
● Hadoop
● MongoDB
● CouchDB
● MonetDB
● Redis
● Neo4j
● Tycoon
● LDAP
MongoDB FDW
CREATE SERVER mongo_server FOREIGN DATA WRAPPER mongo_fdw
    OPTIONS (address '192.168.122.47', port '27017');
CREATE FOREIGN TABLE databases (
    _id NAME,
    name TEXT
)
SERVER mongo_server
OPTIONS (database 'mydb', collection 'pgData');
test=# select * from databases ;
_id | name
--------------------------+------------
52fd49bfba3ae4ea54afc459 | mongo
52fd49bfba3ae4ea54afc45a | postgresql
52fd49bfba3ae4ea54afc45b | oracle
52fd49bfba3ae4ea54afc45c | mysql
52fd49bfba3ae4ea54afc45d | redis
52fd49bfba3ae4ea54afc45e | db2
(6 rows)
WWW FDW
test=# SELECT * FROM www_fdw_geocoder_google
test-# WHERE address = '731 Lexington Ave, New York, NY';
-[ RECORD 1 ]-----+----------------------------------------------
address           |
type              | street_address
formatted_address | 731 Lexington Avenue, New York, NY 10022, USA
lat               | 40.7619363
lng               | -73.9681017
location_type     | ROOFTOP
PL/Proxy
● Developed by Skype
● Allows for scalability and parallelization
● Uses procedural languages and FDWs
PostgreSQL Replication
● Replicate to read-only databases using native streaming replication
● All writes go to a master server
● Load balance across the pool of servers
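Two built-in checks help when routing traffic in such a setup. A small sketch; which server each query runs on is noted in the comments:

```sql
-- On any server: false on the master, true on a read-only hot standby.
SELECT pg_is_in_recovery();

-- On the master: one row per standby currently streaming from it.
SELECT client_addr, state, sync_state FROM pg_stat_replication;
```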
PostgreSQL Scalability
● PostgreSQL scales linearly up to 64 cores
● May scale further but hardware is not available to the community
http://rhaas.blogspot.com/2012/04/did-i-say-32-cores-how-about-64.html
Getting Help
● Community Mailing Lists
– http://www.postgresql.org/list/
● IRC
– irc://irc.freenode.net/postgresql
● NYC PostgreSQL User Group
– http://www.nycpug.org
Questions?