Last BaaS Standing
Dave Johnson - Apigee
Brief History of BaaS
Wait… WTF is a BaaS?
• A service that helps developers build apps
• Developers can focus on building apps, not servers
• Enables serverless apps and services
• BaaS services give REST API access to:
• Data storage & queries
• Location & Geo-queries
• User management & authentication
• File storage and uploads
• Push notifications
• Followers, activities & social features
Don’t build a server
[Diagram labels: PHP, Ruby, Java, Node; MySQL; PaaS; App Server; Cloud; Services in the cloud]
Focus on your app!
[Diagram labels: App & Cloud; Services in the cloud; Add logic via API Gateway]
BaaS Early History
• 2010 Kinvey
• 2010 Usergrid
• 2010 StackMob
• 2011 Firebase
• 2011 AnyPresence
• 2011 Parse
• 2012 Usergrid -> Apigee
• 2013 StackMob -> PayPal
• 2014 Firebase -> Google
• 2014 StackMob -> shutdown
• 2014 Usergrid -> Apache Software Foundation
• 2016 Parse -> shutdown
Where are we now?
• Apache Usergrid v2.1
• Open source since 2011
• Used in production since 2011
• 10,000 Requests Per Second
• Supported by Apigee and others
Make sure your BaaS is truly Open Source
Lock-in
• With any BaaS, some lock-in is inevitable
• No standard REST API
• You can probably get all your data out but...
• No standard schema for data
• If you’re going to get locked in…
Get locked in to Open Source
You can fix & influence
• Access to source code
• No single vendor is the only source of fixes & features
• Open development
• See everything the dev team is doing
• Join the dev team yourself
• Strength in community
• Get help from other users & developers
You can escape!
• If your vendor gets too expensive
• Starts to suck
• "Sunsets" the product
• Goes out of business
• Then…
Run it yourself!
Pay somebody else to run it
Usergrid Architecture
Overview of Usergrid
• Usergrid is a Key-Value Database
• Stores Collections of JSON Entities
• Entities can be retrieved by ID or by name
• Usergrid is a Search Engine
• Entity properties are indexed
• Entities can be queried by simple SQL-like syntax
• Usergrid is a Graph
• Entities can be connected to other Entities
• Connections can be iterated or queried
Overview (continued)
• Usergrid is a REST API
• All Usergrid features are available via REST API
• Usergrid is a micro-service
• That ate way too many jars of tasty Java code #humor
• OK, yeah, Usergrid is a monolith
• Written in Java, packaged as a WAR
• With a comprehensive Admin Portal
• Written in HTML5 and JavaScript using Angular.js
Cassandra inside
• Entity Storage & Graph built on Cassandra
• Why Cassandra?
• Linear scalability
• Well-known and well-understood database
• Strong community
• Paid support and hosting available
• Apache!
ElasticSearch inside
• Entity Indexing & Query built on ElasticSearch
• Why ElasticSearch?
• Better than writing our own query engine
• Well-known and well-understood product
• Strong community
• Paid support and hosting available
• Not Apache, but based on Apache Lucene
Simplest deployment
Scale-up
Multi-region
Better Multi-region
Scaling to > 10,000 RPS
• Based on testing with Gatling
• 10,000 RPS run configuration
• 35 c3.xlarge Tomcat nodes
• 9 c3.xlarge Cassandra nodes
• 6 m3.xlarge ElasticSearch nodes
• ~7,800 RPS run configuration
• 15 c3.xlarge Tomcat nodes
• 9 c3.xlarge Cassandra nodes
• 6 m3.xlarge ElasticSearch nodes
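As a rough reading of the two run configurations above, dividing each throughput figure by its Tomcat node count gives the average load per node. This is only arithmetic on the numbers listed; the Python variable and function names are illustrative and not part of Usergrid or Gatling.

```python
# Per-Tomcat-node throughput for the two Gatling runs listed above.
runs = {
    "10,000 RPS run": {"rps": 10_000, "tomcat_nodes": 35},
    "~7,800 RPS run": {"rps": 7_800, "tomcat_nodes": 15},
}


def rps_per_node(rps, nodes):
    """Average requests per second handled by each Tomcat node."""
    return rps / nodes


for name, run in runs.items():
    per_node = rps_per_node(run["rps"], run["tomcat_nodes"])
    print(f"{name}: {per_node:.0f} RPS per Tomcat node")
```

Both runs use the same Cassandra and ElasticSearch tiers, so the extra 20 Tomcat nodes buy roughly 2,200 more RPS. The figures alone do not say which tier is the limiting one.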
Gatling results
Seek vs. Search
• Seek
• Access Entities via ID, name or other unique value
• Implemented with Cassandra
• Search
• Search via SQL-like syntax returns Entity IDs
• Implemented via ElasticSearch
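The split between the two access paths can be pictured with a toy in-memory model. This is a sketch only: plain Python dictionaries stand in for Cassandra and ElasticSearch, and the IDs and function names are made up for illustration.

```python
# Toy model: "seek" is a direct key lookup, "search" returns matching IDs
# that the caller then resolves with seek.
entities = {
    "id-1": {"name": "enzo", "color": "orange"},
    "id-2": {"name": "bertha", "color": "tabby"},
}

# Unique-value index (name -> ID), standing in for the Cassandra lookup.
by_name = {entity["name"]: eid for eid, entity in entities.items()}


def seek(key):
    """Fetch one entity by ID or by unique name; None if not found."""
    entity_id = key if key in entities else by_name.get(key)
    return entities.get(entity_id)


def search(prop, value):
    """Return the IDs of entities whose property equals value.

    A real search engine would use an inverted index, not a scan.
    """
    return [eid for eid, e in entities.items() if e.get(prop) == value]


# Search yields IDs; the entities themselves come back via seek.
tabby_cats = [seek(eid) for eid in search("color", "tabby")]
print(tabby_cats)
```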
Scaling Graph
• Graph is implemented with Cassandra
• Edges can be iterated via "seek"
• Edges are stored as Columns
• Theoretical limit on Columns is 2 billion / Row
• Practical limit on Columns is 500,000 / Row
• So… we have to shard Edges
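One way to picture edge sharding is a simple count-based scheme: once a row reaches the practical column limit, the next edge goes into a new row. The slides do not describe Usergrid's actual sharding strategy, so the row-key format and function names below are hypothetical.

```python
# Hypothetical sketch: spread one node's edges over several rows so that
# no row exceeds the practical column limit quoted above.
MAX_EDGES_PER_ROW = 500_000


def shard_for(edge_position, max_per_row=MAX_EDGES_PER_ROW):
    """Shard number for the Nth edge (0-based) written for a source node."""
    return edge_position // max_per_row


def row_key(source_id, edge_type, shard):
    """Made-up row key combining source node, edge type and shard number."""
    return f"{source_id}:{edge_type}:{shard}"


def shards_needed(edge_count, max_per_row=MAX_EDGES_PER_ROW):
    """Number of rows required to hold edge_count edges (at least one)."""
    return max(1, -(-edge_count // max_per_row))  # ceiling division


print(row_key("dave", "pets", shard_for(1_200_000)))
print(shards_needed(1_200_000))
```

Iterating all edges then means walking the shards in order, which keeps each read within a single bounded row.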
Selective Indexing
• Query is nice but indexing is expensive
• Coming soon:
• Specify which properties are indexed
• Turn on indexing / query on per collection basis
• Completely disable ElasticSearch startup
For more information
https://blogs.apache.org/usergrid/entry/usergrid_10k_part1
Using Usergrid
Usergrid features
• JSON data storage and queries
• Geo-location and queries
• User management and authentication
• File storage and uploads
• Push notifications
• Social graph, activities and followers
Usergrid REST API
• JSON over HTTP
• POST, GET, PUT and DELETE JSON Entities
• Entities have UUID, Name, Type
• Entities have name/value properties
• Entity properties are indexed
• SQL-like syntax for Entity queries
Usergrid Data model
Starting points
• Each application has root URL like this
• http://host/${org}/${app}
• Entities exist in Collections
• http://host/${org}/${app}/${collection}
• Next: some examples using http://httpie.org
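The URL scheme above is regular enough to build with a few helpers. A minimal Python sketch, assuming the host, org and app values used in this deck's examples; the helper names are illustrative and not part of any Usergrid SDK.

```python
from urllib.parse import quote


def app_url(host, org, app):
    """Root URL of an application: http://host/{org}/{app}."""
    return f"http://{host}/{quote(org)}/{quote(app)}"


def collection_url(host, org, app, collection):
    """URL of a collection inside an application."""
    return f"{app_url(host, org, app)}/{quote(collection)}"


def entity_url(host, org, app, collection, name_or_uuid):
    """URL of a single entity, addressed by name or UUID."""
    base = collection_url(host, org, app, collection)
    return f"{base}/{quote(name_or_uuid)}"


print(collection_url("localhost:8080", "daveorg", "demo1", "cats"))
```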
Introducing HTTPie
Let’s use HTTPie for examples…
HTTPie is like curl but with a simplified syntax. This HTTPie command:
$ http post :8080/daveorg/demo1/users name="David M. Johnson" username=dave password=Dave1dave [email protected]
is equivalent to this curl command:
$ curl -i -X POST http://localhost:8080/daveorg/demo1/users -H "Content-type: application/json" -d '{"name":"David M. Johnson", "username":"dave", "password":"Dave1dave", "email":"[email protected]"}'
HTTPie
usage: http [OPTION] [OPTION]…
[METHOD] URL [REQUEST_ITEM [REQUEST_ITEM …]]
REQUEST ITEM
name:value HTTP Headers
name==value URL parameter
name=value Data field of JSON object
name:=value Non-string JSON data fields
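To make the four request-item forms concrete, here is a simplified Python sketch of how such items could be sorted into headers, URL parameters and a JSON body. It is not HTTPie's own parser: it ignores escaping and file-based items, and the function names are made up for illustration.

```python
import json

# Longer separators come first so that ":=" and "==" win over ":" and "="
# when both start at the same position.
SEPARATORS = [":=", "==", ":", "="]


def split_item(item):
    """Split one request item at its earliest separator."""
    best = None
    for sep in SEPARATORS:
        pos = item.find(sep)
        if pos > 0 and (best is None or pos < best[0]):
            best = (pos, sep)
    if best is None:
        raise ValueError(f"not a request item: {item!r}")
    pos, sep = best
    return item[:pos], sep, item[pos + len(sep):]


def parse_request_items(items):
    """Sort request items into (headers, url_params, json_data)."""
    headers, params, data = {}, {}, {}
    for item in items:
        name, sep, value = split_item(item)
        if sep == ":":
            headers[name] = value
        elif sep == "==":
            params[name] = value
        elif sep == "=":
            data[name] = value              # string JSON field
        else:
            data[name] = json.loads(value)  # raw (non-string) JSON field
    return headers, params, data


print(parse_request_items(["name=enzo", "age:=3", "ql==color='orange'"]))
```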
Example: create app user
$ http post localhost:8080/daveorg/demo1/users name="David M. Johnson" username=dave password=Dave1dave [email protected]
HTTP/1.1 200 OK
Access-Control-Allow-Origin: *
Content-Type: application/json
Date: Wed, 04 May 2016 19:14:08 GMT
Server: Apache-Coyote/1.1
Transfer-Encoding: chunked
{
"action": "post",
"application": "37352905-1216-11e6-b24a-080027e7a627",
"applicationName": "demo1",
"duration": 105,
"entities": [{
"activated": true,
"created": 1462389248805,
"email": "[email protected]",
Example: login as app user
$ http post localhost:8080/daveorg/demo1/token username=dave password=Dave1dave grant_type=password
HTTP/1.1 200 OK
Access-Control-Allow-Origin: *
Content-Type: application/json
Date: Wed, 04 May 2016 16:50:15 GMT
Server: Apache-Coyote/1.1
Transfer-Encoding: chunked
{
"access_token": "YWMtRuudPBIYEeaXoiGds0XA_wAAAVSgud0mLwlxDs2qac-L86tllu0LfPcYX_Y",
"expires_in": 604800,
"user": {
"activated": true,
"created": 1462380207203,
"email": "[email protected]",
Example: Creating Entities
Creating an Entity is as easy as posting a JSON object to a Collection URL. For example…
Create an entity of type “cat” (assuming no auth):
$ http post localhost:8080/daveorg/demo1/cats name=enzo color=orange
Create a cat (assuming auth required):
$ http post localhost:8080/daveorg/demo1/cats access_token==YWMtRuud… name=bertha color=tabby
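The same POST can be sketched with nothing but the Python standard library. This builds the request without sending it; the helper name is hypothetical, and passing the token as an access_token URL parameter simply mirrors the HTTPie command above.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request


def build_create_request(collection_url, properties, access_token=None):
    """Build (but do not send) a POST request that creates an entity."""
    url = collection_url
    if access_token:
        # Mirrors the access_token== URL parameter in the HTTPie example.
        url += "?" + urlencode({"access_token": access_token})
    body = json.dumps(properties).encode("utf-8")
    return Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


req = build_create_request(
    "http://localhost:8080/daveorg/demo1/cats",
    {"name": "enzo", "color": "orange"},
)
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

Passing the request to urllib.request.urlopen would send it to a running server.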
Example: Getting entities
Once you have created a named entity, you can get it
By name:
$ http get :8080/daveorg/demo1/cats/enzo
By UUID:
$ http get :8080/daveorg/demo1/cats/0966f75…
Example: Getting a Collection
$ http get localhost:8080/daveorg/demo1/cats
HTTP/1.1 200 OK
Access-Control-Allow-Origin: *
Content-Length: 1198
Content-Type: application/json
Date: Fri, 06 May 2016 13:00:11 GMT
Server: Apache-Coyote/1.1
{
"action": "get",
"application": "066c74ea-12f4-11e6-8ca0-ced9a4638778",
"applicationName": "demo1",
"count": 3,
"duration": 10,
"entities": [{
"color": "grey",
"created": 1462539597538,
"metadata": {
"path": "/cats/6f511c99-138a-11e6-9469-ced9a4638778",
"size": 327
},
Example: Query a Collection
Queries use simplified SQL-like syntax, with AND and OR operators. No joins or functions are supported.
Query for all orange cats in the collection:
$ http get :8080/daveorg/demo1/cats ql=="color='orange'"
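Outside HTTPie, the ql parameter has to be URL-encoded by hand, since queries contain quotes, spaces and equals signs. A small Python sketch, assuming the collection URL from the example above; the helper name is illustrative.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def query_url(collection_url, ql):
    """Append a URL-encoded ql parameter to a collection URL."""
    return collection_url + "?" + urlencode({"ql": ql})


url = query_url(
    "http://localhost:8080/daveorg/demo1/cats",
    "color='orange' or color='tabby'",
)
print(url)

# Decoding the query string recovers the original query text.
print(parse_qs(urlsplit(url).query)["ql"][0])
```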
Example: Connections
Create some cats:
$ http post localhost:8080/daveorg/demo1/cats \
name=enzo color=orange
$ http post localhost:8080/daveorg/demo1/cats \
name=bertha color=tabby
Connect Dave to his pets:
$ http post :8080/daveorg/demo1/users/dave/pets/cats/enzo
$ http post :8080/daveorg/demo1/users/dave/pets/cats/bertha
Get Dave's pets:
$ http get :8080/daveorg/demo1/users/dave/pets
Search Dave's pets:
$ http get :8080/daveorg/demo1/users/dave/pets ql=="color='tabby'"
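The connection calls above amount to adding labelled edges to a graph and then iterating or filtering them. A toy in-memory Python sketch of that idea follows; it is not how Usergrid stores connections, and every name in it is illustrative.

```python
from collections import defaultdict

# (source, verb) -> list of connected target entities
connections = defaultdict(list)


def connect(source, verb, target):
    """Record a connection, like POST /users/dave/pets/cats/enzo."""
    connections[(source, verb)].append(target)


def connected(source, verb, **filters):
    """Return connected entities, optionally filtered by property values."""
    return [
        target
        for target in connections[(source, verb)]
        if all(target.get(k) == v for k, v in filters.items())
    ]


enzo = {"type": "cat", "name": "enzo", "color": "orange"}
bertha = {"type": "cat", "name": "bertha", "color": "tabby"}
connect("dave", "pets", enzo)
connect("dave", "pets", bertha)

print([cat["name"] for cat in connected("dave", "pets")])
print([cat["name"] for cat in connected("dave", "pets", color="tabby")])
```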
Example: Geo-Queries
Create a Restaurant with location specified:
$ cat restaurant.json
{ "name" : "Rockadero",
"location": {
"latitude": 37.779632,
"longitude": -122.395131
}
}
$ http post :8080/daveorg/demo1/restaurants < restaurant.json
Query by location:
$ http get :8080/daveorg/demo1/restaurants \
ql=="location within 10000 of 37.78,-122.39"
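To see why the Rockadero entity matches this query, the distance test can be approximated in Python with the haversine formula. Two assumptions are made here that the slides do not state: that the distance in "within 10000" is in meters, and that a great-circle distance is a fair stand-in for whatever the search index actually computes.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in meters."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = radians(lat2 - lat1)
    dlmb = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * asin(sqrt(a))


def within(distance_m, center, point):
    """True if point lies no more than distance_m (assumed meters) from center."""
    return haversine_m(*center, *point) <= distance_m


rockadero = (37.779632, -122.395131)   # from restaurant.json above
query_center = (37.78, -122.39)        # from the ql string above
print(within(10_000, query_center, rockadero))
```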
Retail Use Cases
Customer favorites
• Leverages graph / connections
• Scales to millions of customers
Store Locator
• Retailer POSTs an entity for each store with latitude/longitude coordinates
• Retailer’s app uses GeoQueries to find stores closest to customers
Deal finder with notifications
• Retailer POSTs an entity for each store with latitude/longitude coordinates
• Retailer POSTs entities for each sale item in each store on a nightly or weekly basis
• Retailer’s app uses Push Notifications to notify customers within a given distance of a store
Contribute to Usergrid
We need help with…
• Documentation and generation of API docs
• Improvements and new features in Stack
• Improvements and new features, better UX in Portal
• Testing and automation of testing
• Getting new users started, community support
https://github.com/
https://www.apache.org/licenses/icla.txt
Contributor Workflow
• Discuss
• Discuss your changes on the dev mailing list (optional for small changes)
• Create a JIRA issue for your changes (optional for small changes)
• Do the work
• Fork the GitHub apache/usergrid repo
• Make your changes (if it’s code, then don’t forget to add tests)
• Push your changes
• Submit a clean PR against the appropriate branch of apache/usergrid
• Update JIRA issue, announce PR on dev mailing list
Contributor workflow
[Diagram labels: Apache Git (The Canonical Repo); Apache mirrors world-wide; apache / usergrid (READ-ONLY); Your Fork]
1. You fork the repo
2. You work in your fork
3. You submit a Pull Request
4. Usergrid committer reviews PR and pushes to Apache
5. Your code makes it into the next official release!
Questions?