PostgreSQL 9.4 JSON Types and Operators

PostgreSQL 9.4 JSON Types and Operators
Pittsburgh PostgreSQL Users Group
2015 December 16

Nicholas Kiraly
github.com/nkiraly
Twitter @NicholasKiraly
kiraly.nicholas@gmail.com

Introductions

Systems Integration Engineer

Interface Systems to Produce New Value

Open Source Tools / PostgreSQL / FreeBSD Advocate

To play along with today’s examples, vagrant up with https://github.com/nkiraly/koadstation/tree/master/dbdevf2

Why should I JSON with PostgreSQL?

Misconception:
I need one system of record.
But I need to SQL, JSON, and XML so hard!
I have to do transforms, parsing, and derivative storage in my implementation to get the job done with a single system of record.

Not so!
JSON operators allow for expressions in queries and indexes
Fewer round-trips for information querying and selection
Streamline and structure your data in one place in your system of record
Base your JSON documents on SQL normalized tables and you get the best of both worlds!

JSON Types

JSON

JSONB

JSON vs JSONB

JSON
Stored as Text
Retains Whitespace
Retains Key Order
Retains Duplicate Keys
Index expressions only

JSONB
Stored as Binary
Hierarchy of Key/Value pairs
Whitespace Discarded
Last Duplicate Key Wins
GIN Index support can be leveraged by the containment operator (@>) and the key existence operators (? ?| ?&)
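The storage differences listed above can be seen by casting one literal to both types. This is a minimal sketch; the literal is made up for illustration and is not from the talk.

```sql
-- The same input text cast to each type.
SELECT '{"b": 1,   "a": 2, "a": 3}'::json  AS as_json,
       '{"b": 1,   "a": 2, "a": 3}'::jsonb AS as_jsonb;

-- as_json  returns the input text as typed: extra spaces, key order,
--          and both "a" keys are all retained.
-- as_jsonb returns {"a": 3, "b": 1}: whitespace is discarded, key order
--          is not preserved, and the last duplicate "a" wins.
```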

What about HSTORE?

PostgreSQL extension

For single-depth Key/Value pairs

JSON type objects can be nested to any depth

HSTORE only stores strings

JSONB stores JSON numbers as numeric element values, not as strings
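A small comparison of value types, as a sketch only. It assumes the hstore contrib module is installed on the server; the key name price is hypothetical.

```sql
-- hstore ships as an extension and must be enabled per database.
CREATE EXTENSION IF NOT EXISTS hstore;

SELECT pg_typeof('price=>19.99'::hstore -> 'price')       AS hstore_value_type, -- text
       jsonb_typeof('{"price": 19.99}'::jsonb -> 'price') AS jsonb_value_type;  -- number
```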

JSON Operators

-> ->>

#> #>>

@> <@

? ?| ?&

-> Get Array Element (integer index) or Object Field (text key)

'[{"a":"foo"},{"b":"bar"},{"c":"baz"}]'::json->2

{"c":"baz"}

->> Get Object Field As Text

'{"a":1,"b":2}'::json->>'b'

2

#> Get Object At Specified Path

'{"a": {"b":{"c": "foo"}}}'::json#>'{a,b}'

{"c": "foo"}

#>> Get Object At Specified Path As Text

'{"a":[1,2,3],"b":[4,5,6]}'::json#>>'{a,2}'

3
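The difference between the plain and the "as text" operators is the result type. A short sketch with a made-up literal:

```sql
SELECT pg_typeof('{"a":{"b":"foo"}}'::json ->  'a') AS arrow_type,      -- json
       pg_typeof('{"a":{"b":"foo"}}'::json ->> 'a') AS arrow_text_type, -- text
       '{"a":{"b":"foo"}}'::json #>  '{a,b}'        AS path_value,      -- "foo" (json, quotes kept)
       '{"a":{"b":"foo"}}'::json #>> '{a,b}'        AS path_text;       -- foo   (text, no quotes)
```

Because -> returns json, it can be chained to descend further; ->> returns text, so it normally goes last in the chain.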

JSONB Operators

@> Does the left value contain the value on the right?

'{"a":1, "b":2}'::jsonb @> '{"b":2}'::jsonb

<@ Is the left value contained in the value on the right?

'{"b":2}'::jsonb <@ '{"a":1, "b":2}'::jsonb

? Does the field key string exist within the JSON value?

'{"a":1, "b":2}'::jsonb ? 'b'

?|

Do any of these key strings exist?

'{"a":1, "b":2, "c":3}'::jsonb ?| array['b', 'c']

JSONB Contains Operators

?&

Do all these key strings exist?

'["a", "b"]'::jsonb ?& array['a', 'b']
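Two details of these operators that are easy to miss: containment follows nesting, while the existence operators only look at the top level of the value they are given. A sketch with made-up literals:

```sql
SELECT '{"tags": ["a", "b"], "n": 1}'::jsonb @> '{"tags": ["a"]}'::jsonb AS contains_nested,   -- t
       '{"a": {"b": 1}}'::jsonb ? 'b'                                     AS top_level_only,   -- f
       ('{"a": {"b": 1}}'::jsonb -> 'a') ? 'b'                            AS after_descending; -- t
```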

JSON Queries

SELECT

doc -> 'field' ->> 'subfield'

FROM sometable

Table JSON Type Usage

suchjson=# CREATE TABLE somejson ( id INTEGER, doc JSON );
CREATE TABLE

suchjson=# \d+ somejson
                     Table "public.somejson"
 Column |  Type   | Modifiers | Storage  | Stats target | Description
--------+---------+-----------+----------+--------------+-------------
 id     | integer |           | plain    |              |
 doc    | json    |           | extended |              |

Insert JSON Data

suchjson=# INSERT INTO somejson VALUES ( 1, '{
  "name": "Nicholas Kiraly",
  "address": {
    "line1": "5400 City Blvd",
    "line2": "Apt B",
    "city": "Pittsburgh",
    "state": "PA",
    "zipcode": "15212"
  }
}');
INSERT 0 1

Select JSON Data

suchjson=# SELECT * FROM somejson ;
 id |              doc
----+--------------------------------
  1 | {                             +
    |   "name": "Nicholas Kiraly",  +
    |   "address": {                +
    |     "line1": "5400 City Blvd",+
    |     "line2": "Apt B",         +
    |     "city": "Pittsburgh",     +
    |     "state": "PA",            +
    |     "zipcode": "15212"        +
    |   }                           +
    | }
(1 row)

Extract JSON Data

suchjson=# SELECT doc->>'address' FROM somejson ;
            ?column?
--------------------------------
 {                             +
     "line1": "5400 City Blvd",+
     "line2": "Apt B",         +
     "city": "Pittsburgh",     +
     "state": "PA",            +
     "zipcode": "15212"        +
   }
(1 row)

Navigate JSON Data

suchjson=# SELECT doc->'address'->>'zipcode' from somejson ;
 ?column?
----------
 15212
(1 row)

suchjson=# SELECT doc->'address'->>'zipcode' = '15212' from somejson ;
 ?column?
----------
 t
(1 row)
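The same navigation works in a WHERE clause. The sketch below runs against the somejson table from the slides; the second query assumes every stored zipcode is numeric, since the cast raises an error otherwise.

```sql
-- Filter rows on a nested text value.
SELECT id, doc->>'name' AS name
FROM somejson
WHERE doc->'address'->>'state' = 'PA';

-- ->> yields text, so cast before a numeric comparison.
-- Assumption: all zipcode values are digits only.
SELECT id
FROM somejson
WHERE (doc->'address'->>'zipcode')::integer BETWEEN 15200 AND 15299;
```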

JSONB Queries

SELECT

doc -> 'field' ->> 'subfield'

FROM sometable

Table JSONB Type Usage

suchjson=# CREATE TABLE somejsonb ( id BIGSERIAL, doc JSONB );
CREATE TABLE

suchjson=# \d+ somejsonb
                                              Table "public.somejsonb"
 Column |  Type  |                       Modifiers                        | Storage  | Stats target | Description
--------+--------+--------------------------------------------------------+----------+--------------+-------------
 id     | bigint | not null default nextval('somejsonb_id_seq'::regclass) | plain    |              |
 doc    | jsonb  |                                                        | extended |              |

Insert JSONB Data

suchjson=# INSERT INTO somejsonb ( doc ) VALUES ( '{
  "name": "Nicholas Kiraly",
  "address": {
    "line1": "5400 City Blvd",
    "line2": "Apt B",
    "city": "Pittsburgh",
    "state": "PA",
    "zipcode": "15212"
  }
}');
INSERT 0 1

Select JSONB Data

suchjson=# SELECT * FROM somejsonb ;
 id | doc
----+--------------------------------------------------------------------------------
  1 | {"name": "Nicholas Kiraly", "address": {"city": "Pittsburgh", "line1": "5400 City Blvd", "line2": "Apt B", "state": "PA", "zipcode": "15212"}}
(1 row)

Extract JSONB Data

suchjson=# SELECT doc->>'address' FROM somejsonb ;
 ?column?
-------------------------------------------------------------------------------------
 {"city": "Pittsburgh", "line1": "5400 City Blvd", "line2": "Apt B", "state": "PA", "zipcode": "15212"}
(1 row)

How many rows have that JSON element?

suchjson=# SELECT COUNT(*) FROM somejsonb ;
 count
--------
 101104

suchjson=# SELECT COUNT(*) FROM somejsonb WHERE doc->'address'?'line2';
 count
-------
 56619
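The ? operator asks whether the key is present, which is not the same as asking whether its value is non-null. A sketch with made-up literals showing where the two tests disagree:

```sql
SELECT '{"line2": null}'::jsonb ? 'line2'             AS key_exists,      -- t
       ('{"line2": null}'::jsonb ->> 'line2') IS NULL AS text_is_null,    -- t
       '{}'::jsonb ? 'line2'                          AS key_missing,     -- f
       ('{}'::jsonb ->> 'line2') IS NULL              AS missing_is_null; -- t
```

So a count based on ->> ... IS NOT NULL would skip rows that carry the key with a JSON null, while the ? count above includes them.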

JSON Element Array Values

suchjson=# CREATE TABLE shopping_lists (
    shopping_list_id BIGSERIAL,
    shopping_list_doc JSONB
);
CREATE TABLE

suchjson=# INSERT INTO shopping_lists ( shopping_list_doc ) VALUES ('{
    "listName": "Needed supplies",
    "items": [ "diet shasta", "cheese curls", "mousse" ]
}' );
INSERT 0 1

Element arrays as result rows

suchjson=# SELECT * FROM shopping_lists;
 shopping_list_id |                                  shopping_list_doc
------------------+-------------------------------------------------------------------------------------
                2 | {"items": ["diet shasta", "cheese curls", "mousse"], "listName": "Needed supplies"}
(1 row)

suchjson=# SELECT jsonb_array_elements_text(shopping_list_doc->'items') AS item FROM shopping_lists;
     item
--------------
 diet shasta
 cheese curls
 mousse
(3 rows)

Multiple rows, Element arrays as result rows

suchjson=# INSERT INTO shopping_lists ( shopping_list_doc ) VALUES ('{
    "listName": "Running low",
    "items": [ "grid paper", "cheese curls", "guy fawkes masks" ]
}' );
INSERT 0 1

suchjson=# SELECT jsonb_array_elements_text(shopping_list_doc->'items') AS item FROM shopping_lists;
       item
------------------
 diet shasta
 cheese curls
 mousse
 grid paper
 cheese curls
 guy fawkes masks
(6 rows)
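Once the array elements are rows, ordinary SQL applies to them. As a sketch against the shopping_lists table above, the set-returning function can be moved into FROM (an implicit lateral join) and the items grouped; the column alias lists_containing is made up here.

```sql
-- One row per array element, then count how many lists mention each item.
SELECT item, COUNT(*) AS lists_containing
FROM shopping_lists,
     jsonb_array_elements_text(shopping_list_doc->'items') AS item
GROUP BY item
ORDER BY lists_containing DESC, item;
```

With the two lists inserted on these slides, cheese curls is the only item that appears twice.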

JSON Indexes

SELECT

jsondoc->>'element'

USING INDEX

->> text expression

@> contains

Cost of WHERE JSON ->> Expression

suchjson=# SELECT COUNT(*) FROM somejson;
 100101

suchjson=# SELECT COUNT(*) FROM somejson WHERE (doc->'address'->>'zipcode'::text) = '11100';
 1001

suchjson=# EXPLAIN ANALYZE SELECT * FROM somejson WHERE (doc->'address'->>'zipcode'::text) = '11100';
                                     QUERY PLAN
------------------------------------------------------------------------------------
 Seq Scan on somejson (cost=0.00..4671.77 rows=501 width=36) (actual time=0.039..316.502 rows=1001 loops=1)
   Filter: (((doc -> 'address'::text) ->> 'zipcode'::text) = '11100'::text)
   Rows Removed by Filter: 99100
 Planning time: 0.093 ms
 Execution time: 366.303 ms

Indexing for WHERE JSON ->> Expression

suchjson=# CREATE INDEX somejson_zipcode ON somejson ((doc->'address'->>'zipcode'::text));

suchjson=# EXPLAIN ANALYZE SELECT * FROM somejson WHERE (doc->'address'->>'zipcode'::text) = '11100';
                                     QUERY PLAN
------------------------------------------------------------------------------------
 Bitmap Heap Scan on somejson (cost=12.18..1317.64 rows=501 width=36) (actual time=0.222..3.256 rows=1001 loops=1)
   Recheck Cond: (((doc -> 'address'::text) ->> 'zipcode'::text) = '11100'::text)
   Heap Blocks: exact=1001
   -> Bitmap Index Scan on somejson_zipcode (cost=0.00..12.06 rows=501 width=0) (actual time=0.210..0.210 rows=1001 loops=1)
        Index Cond: (((doc -> 'address'::text) ->> 'zipcode'::text) = '11100'::text)
 Planning time: 0.186 ms
 Execution time: 5.214 ms

Cost of WHERE ->Element->> Expression

suchjson=# SELECT COUNT(*) FROM somejsonb;
 101104

suchjson=# EXPLAIN ANALYZE SELECT * FROM somejsonb WHERE (doc->'address'->>'zipcode'::text) = '11100';
                                     QUERY PLAN
------------------------------------------------------------------------------------
 Seq Scan on somejsonb (cost=0.00..4171.32 rows=506 width=159) (actual time=0.033..58.670 rows=1011 loops=1)
   Filter: (((doc -> 'address'::text) ->> 'zipcode'::text) = '11100'::text)
   Rows Removed by Filter: 100093
 Planning time: 0.134 ms
 Execution time: 64.235 ms

Indexing for WHERE ->Element->> Expression

suchjson=# CREATE INDEX somejsonb_zipcode ON somejsonb ((doc->'address'->>'zipcode'::text));

suchjson=# EXPLAIN ANALYZE SELECT * FROM somejsonb WHERE (doc->'address'->>'zipcode'::text) = '11100';
                                     QUERY PLAN
-----------------------------------------------------------------------------------
 Bitmap Heap Scan on somejsonb (cost=12.22..1253.10 rows=506 width=159) (actual time=0.440..4.003 rows=1011 loops=1)
   Recheck Cond: (((doc -> 'address'::text) ->> 'zipcode'::text) = '11100'::text)
   Heap Blocks: exact=1011
   -> Bitmap Index Scan on somejsonb_zipcode (cost=0.00..12.09 rows=506 width=0) (actual time=0.217..0.217 rows=1011 loops=1)
        Index Cond: (((doc -> 'address'::text) ->> 'zipcode'::text) = '11100'::text)
 Planning time: 0.113 ms
 Execution time: 6.161 ms
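An expression index is only considered when the query uses the same expression that was indexed. The sketch below assumes the somejsonb_zipcode index from this slide exists; the second query reaches the same value through a different operator and so cannot use that index.

```sql
-- Same expression as the index definition: somejsonb_zipcode is a candidate.
EXPLAIN SELECT * FROM somejsonb
WHERE (doc->'address'->>'zipcode'::text) = '11100';

-- Same value, different expression (#>> path operator):
-- the expression index above does not apply to this predicate.
EXPLAIN SELECT * FROM somejsonb
WHERE (doc #>> '{address,zipcode}') = '11100';
```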

Cost of WHERE @> Contains Operator

suchjson=# EXPLAIN ANALYZE SELECT * FROM somejsonb WHERE doc @> '{ "address": { "zipcode":"11100" } }';
                                     QUERY PLAN
-------------------------------------------------------------------------------------
 Seq Scan on somejsonb (cost=0.00..3665.80 rows=101 width=159) (actual time=0.019..59.580 rows=1011 loops=1)
   Filter: (doc @> '{"address": {"zipcode": "11100"}}'::jsonb)
   Rows Removed by Filter: 100093
 Planning time: 0.094 ms
 Execution time: 64.843 ms

Indexing for WHERE @> Contains Operator

suchjson=# CREATE INDEX somejsonb_gin ON somejsonb USING gin ( doc jsonb_path_ops );
CREATE INDEX

suchjson=# EXPLAIN ANALYZE SELECT * FROM somejsonb WHERE doc @> '{ "address": { "zipcode":"11100" } }';
                                     QUERY PLAN
-------------------------------------------------------------------------------------
 Bitmap Heap Scan on somejsonb (cost=12.78..349.75 rows=101 width=159) (actual time=0.376..4.644 rows=1011 loops=1)
   Recheck Cond: (doc @> '{"address": {"zipcode": "11100"}}'::jsonb)
   Heap Blocks: exact=1011
   -> Bitmap Index Scan on somejsonb_gin (cost=0.00..12.76 rows=101 width=0) (actual time=0.271..0.271 rows=1011 loops=1)
        Index Cond: (doc @> '{"address": {"zipcode": "11100"}}'::jsonb)
 Planning time: 0.136 ms
 Execution time: 6.800 ms
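The jsonb_path_ops operator class used above supports only @>. For the key existence operators, a GIN index with the default operator class is the usual alternative. A sketch, with the index name somejsonb_gin_keys made up for illustration; whether the planner picks the index depends on the data.

```sql
-- Default jsonb operator class: supports @>, ?, ?| and ?&.
CREATE INDEX somejsonb_gin_keys ON somejsonb USING gin (doc);

-- Both predicates are indexable by the default operator class.
EXPLAIN SELECT * FROM somejsonb WHERE doc ? 'name';
EXPLAIN SELECT * FROM somejsonb WHERE doc @> '{"address": {"state": "PA"}}';
```

A predicate such as doc->'address' ? 'line2' tests a sub-document, so an index on doc itself does not cover it; that would need a GIN index on the expression (doc->'address').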

Index Size Cost Considerations

suchjson=# SELECT nspname AS "namespace", relname AS "relation",
    pg_size_pretty(pg_relation_size(C.oid)) AS "size"
FROM pg_class C
LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace)
WHERE nspname NOT IN ('pg_catalog', 'information_schema', 'pg_toast')
ORDER BY pg_relation_size(C.oid) DESC;

 namespace |     relation      |    size
-----------+-------------------+------------
 public    | somejsonb         | 19 MB
 public    | somejsonb_zipcode | 2232 kB
 public    | somejsonb_gin     | 1680 kB
 public    | somejson          | 8192 bytes
 public    | somejsonb_id_seq  | 8192 bytes
 public    | posts             | 8192 bytes
(6 rows)
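For a single table, the built-in size functions give a quicker summary than the catalog query. A sketch against the somejsonb table from the slides; the column aliases are made up.

```sql
SELECT pg_size_pretty(pg_table_size('somejsonb'))          AS table_size,       -- heap plus TOAST
       pg_size_pretty(pg_indexes_size('somejsonb'))        AS all_indexes_size, -- every index on the table
       pg_size_pretty(pg_total_relation_size('somejsonb')) AS total_size;       -- table plus indexes
```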

JSON Views

SELECT doc
FROM jsonview
WHERE things
-----------------------------
{ "formatted": "data" }

Blog Site User Data Model

CREATE TABLE users (
    user_id varchar(100) PRIMARY KEY,
    user_name varchar(100) NOT NULL,
    user_rank integer NOT NULL DEFAULT 10,
    user_display_name varchar(200) NOT NULL,
    user_profile text
);

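Blog Site Post Data Model

CREATE TABLE posts (
    post_id varchar(100) PRIMARY KEY,
    post_date timestamp with time zone,
    -- user_id appears in the posts output on later slides
    user_id varchar(100) REFERENCES users(user_id),
    post_summary varchar(200) NOT NULL,
    post_content text NOT NULL,
    post_tags varchar(100)[]
);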

CREATE TABLE posts_read (
    post_id varchar(100) REFERENCES posts(post_id),
    user_id varchar(100) REFERENCES users(user_id),
    read_date timestamp with time zone NOT NULL,
    PRIMARY KEY (post_id, user_id)
);

User Detail JSON View

CREATE VIEW users_json AS
    SELECT users.*, row_to_json(users, TRUE)
    FROM users ;

SELECT * FROM users_json WHERE user_name = 'rsanchez';
-[ RECORD 1 ]-----+---------------------------------------------------
user_id           | rick.sanchez@potp.com
user_name         | rsanchez
user_rank         | 10
user_display_name | Richard Sanchez
user_profile      | Wubba lubba dub-dub!
row_to_json       | {"user_id":"rick.sanchez@potp.com",
                  |  "user_name":"rsanchez",
                  |  "user_rank":10,
                  |  "user_display_name":"Richard Sanchez",
                  |  "user_profile":"Wubba lubba dub-dub!"}
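row_to_json() emits every column under its column name. When only some columns are wanted, or under different keys, json_build_object() (new in 9.4) is one option. A sketch against the users table above; the keys id, display and rank are made up for illustration.

```sql
-- Build a JSON object from selected columns with chosen key names.
SELECT json_build_object(
           'id',      user_id,
           'display', user_display_name,
           'rank',    user_rank
       ) AS user_json
FROM users
WHERE user_name = 'rsanchez';
```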

Post JSON View

blogsite=# CREATE VIEW posts_json AS
    SELECT posts.*, row_to_json(posts, TRUE)
    FROM posts ;

SELECT FROM Post JSON View

blogsite=# SELECT * FROM posts_json WHERE post_id = 'beths-new-bike';
-[ RECORD 1 ]+--------------------------------------------------------------------
post_id      | beths-new-bike
post_date    | 2015-12-15 04:04:37.985041+00
user_id      | beth.smith@rr.com
post_summary | My New Bike
post_content | I got a new bike last night and it's better than a horse
post_tags    | {bike,new,suchpedal}
row_to_json  | {"post_id":"beths-new-bike",
             |  "post_date":"2015-12-15T04:04:37.985041+00:00",
             |  "user_id":"beth.smith@rr.com",
             |  "post_summary":"My New Bike",
             |  "post_content":"I got a new bike last night and it's better than a horse",
             |  "post_tags":["bike","new","suchpedal"]}

Unread Posts JSON View

CREATE VIEW unread_posts_json AS
    -- users that have not read posts
    SELECT ur.*, row_to_json(ur, TRUE) AS json
    FROM (
        -- select u.user_id so view queries can limit to a specific user's unread posts
        SELECT u.user_id, p.post_id, p.post_summary, p.post_date
        FROM users AS u CROSS JOIN posts AS p
        WHERE NOT EXISTS (
            SELECT 1 FROM posts_read
            WHERE post_id = p.post_id AND user_id = u.user_id
        )
    ) AS ur;

Posts Data

blogsite=# SELECT * FROM posts;
-[ RECORD 1 ]+---------------------------------------------------------
post_id      | beths-new-bike
post_date    | 2015-12-15 05:59:26.634021+00
user_id      | beth.smith@rr.com
post_summary | My New Bike
post_content | I got a new bike last night and it's better than a horse
post_tags    | {bike,new,suchpedal}
-[ RECORD 2 ]+---------------------------------------------------------
post_id      | ricks-new-space-van
post_date    | 2015-12-15 05:59:26.634021+00
user_id      | rick.sanchez@potp.com
post_summary | New Spaceship
post_content | I got a new spaceship last night and -burp- its awesome
post_tags    | {spaceship,new,interdimensional}

Select Morty Unread Posts JSON View

blogsite=# -- what posts hasn't Morty read yet?
blogsite=# SELECT json FROM unread_posts_json WHERE user_id = 'morty.smith@gmail.com';
                       json
--------------------------------------------------
 {"user_id":"morty.smith@gmail.com",             +
  "post_id":"beths-new-bike",                    +
  "post_user_id":"beth.smith@rr.com",            +
  "post_summary":"My New Bike",                  +
  "post_date":"2015-12-15T02:55:59.461635+00:00"}
 {"user_id":"morty.smith@gmail.com",             +
  "post_id":"ricks-new-space-van",               +
  "post_user_id":"rick.sanchez@potp.com",        +
  "post_summary":"I Got A New Space Van",        +
  "post_date":"2015-12-15T02:55:59.461635+00:00"}
(2 rows)

Select Rick Unread Posts JSON View

blogsite=# -- what posts hasn't Rick read yet?
blogsite=# SELECT json FROM unread_posts_json WHERE user_id = 'rick.sanchez@potp.com';
 json
------
(0 rows)

Select Beth Unread Posts JSON View

blogsite=# -- what posts hasn't Beth read yet?
blogsite=# SELECT json FROM unread_posts_json WHERE user_id = 'beth.smith@rr.com';
                       json
--------------------------------------------------
 {"user_id":"beth.smith@rr.com",                 +
  "post_id":"ricks-new-space-van",               +
  "post_user_id":"rick.sanchez@potp.com",        +
  "post_summary":"I Got A New Space Van",        +
  "post_date":"2015-12-15T02:55:59.461635+00:00"}
(1 row)
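The view returns one JSON document per unread post. If a client would rather receive a single JSON array per user, json_agg() can collapse the rows. A sketch against the unread_posts_json view; the aliases upj and unread_posts are made up.

```sql
-- One row per user, with that user's unread posts as a JSON array.
SELECT upj.user_id, json_agg(upj.json) AS unread_posts
FROM unread_posts_json AS upj
GROUP BY upj.user_id;
```

Users with nothing unread do not appear in the result, because the view has no rows for them.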

Alternate Unread Posts View

CREATE VIEW unread_posts_v2_json AS
    -- users that have not read posts
    -- alternate version
    SELECT ur.*, row_to_json(ur, TRUE) AS json
    FROM (
        SELECT posts.post_id, posts.post_date, posts.post_summary, users.user_id
        -- take the cross product of users and posts
        FROM users
        INNER JOIN posts ON (
            -- filter out the ones that exist in posts_read
            (users.user_id, posts.post_id) NOT IN (SELECT user_id, post_id FROM posts_read)
        )
    ) AS ur;
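One way to check that the two views agree is to compare their key columns in both directions. This is a sketch, assuming both views have been created as shown on the slides; an empty result means they list the same (user_id, post_id) pairs.

```sql
-- Pairs present in one view but not in the other, in either direction.
(SELECT user_id, post_id FROM unread_posts_json
 EXCEPT
 SELECT user_id, post_id FROM unread_posts_v2_json)
UNION ALL
(SELECT user_id, post_id FROM unread_posts_v2_json
 EXCEPT
 SELECT user_id, post_id FROM unread_posts_json);
```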

Common Q&A Topic for PostgreSQL 9.4

No in-place element editing in 9.4

JSON structures must be rebuilt to update element values

jsonb_set() in 9.5 to replace entire sections

Append children elements by the same name to overwrite existing values in 9.4

Regenerate / reselect entire JSON structure
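To make the contrast concrete, here is a sketch of changing one value in the somejsonb table each way. The new values and the choice of row (id = 1) are made up for illustration, and the 9.4 query is only one possible way to rebuild a document; it handles top-level keys only.

```sql
-- 9.5 and later: jsonb_set() replaces the value at a path.
UPDATE somejsonb
SET doc = jsonb_set(doc, '{address,zipcode}', '"15213"'::jsonb)
WHERE id = 1;

-- 9.4: rebuild the document, swapping in a new value for one top-level key.
UPDATE somejsonb
SET doc = (
    SELECT json_object_agg(
               key,
               CASE WHEN key = 'name'
                    THEN to_json('Nick Kiraly'::text)  -- hypothetical new value
                    ELSE value::json                   -- keep every other key as-is
               END
           )::jsonb
    FROM jsonb_each(doc)
)
WHERE id = 1;
```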

Questions? Answers!

Feedbacks $#!7$%

SELECT doc->knowledge
FROM jsonb
WHERE applied_knowledge IS NOT NULL

Questions? Answers! Feedbacks

PostgreSQL 9.4 JSON Types and Operators
http://www.meetup.com/Pittsburgh-PostgreSQL-Users-Group

Check Out @lfgpgh http://lfgpgh.com
