Presentation given at the London Node Users Group, 2011
The Globals Database
Its significance for Node.js developers
Rob Tweed
M/Gateway Developments Ltd
http://www.mgateway.com
Twitter: @rtweed
Background
• Director, M/Gateway Developments Ltd
• Web/Ajax/Mobile web technologies
• Healthcare, Financial Services
• Business/enterprise applications
  – Browser-based
  – Interactive
  – Secure
  – Internet & Intranet
  – Database intensive
What databases?
• “Global storage-based”
  – Caché (“native storage”)
  – GT.M
• Very high performance & reliability
• Very low maintenance
• NoSQL database that pre-dates the NoSQL era
• Tried and tested for business-critical use
• Dominate healthcare and financial services
• Much wider applicability, but generally little known
  – Key limitation: accessed via an outdated language
What’s my interest in Node.js?
• Using Javascript since the late 1990s
• Douglas Crockford
  – “Loopage”
• Javascript on the server
  – Direct access to global storage via Javascript?
Globals Database
http://globalsdb.org
What is Globals?
• Core database engine from Caché
• No native language
• None of Caché’s object/relational functionality
• Just the core global-storage engine
• Free
  – But not Open Source
  – Otherwise extremely liberal licence
• Two APIs:
  – Java
  – Node.js
Global storage
• Global Persistent Variables
  – aka “Globals”
• Globals = unit of persistent storage
  – Schema-free
  – Hierarchically structured
  – Sparse
  – Dynamic
  – “persistent associative array”
  – Each array element = “a global node”
Anatomy of a Global Node
• A Global node has:
  – A name
  – 0, 1 or more subscripts
  – String value
globalName[subscript1,subscript2,..subscriptn] == value
Simple key/value storage
telephone
  “617-555-1414” → “Tweed, Rob”
  “211-555-9012” → “James, George”
telephone[“617-555-1414”] == “Tweed, Rob”
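The key/value mapping above can be sketched in plain Javascript. This is a toy in-memory stand-in, not the Globals Node.js API: the `ToyGlobals` class and its `set`/`get` methods are hypothetical names invented for illustration, and nothing here is persistent.

```javascript
// Toy in-memory model of global nodes. NOT the Globals API:
// ToyGlobals, set and get are hypothetical names for illustration only.
class ToyGlobals {
  constructor() {
    this.nodes = new Map();
  }

  // A node is addressed by its global name plus zero or more subscripts.
  key(name, subscripts) {
    return JSON.stringify([name, ...subscripts]);
  }

  // Values are stored as strings, as in the "Anatomy of a Global Node" slide.
  set(name, subscripts, value) {
    this.nodes.set(this.key(name, subscripts), String(value));
  }

  get(name, subscripts) {
    return this.nodes.get(this.key(name, subscripts));
  }
}

const db = new ToyGlobals();
db.set('telephone', ['617-555-1414'], 'Tweed, Rob');
db.set('telephone', ['211-555-9012'], 'James, George');

console.log(db.get('telephone', ['617-555-1414']));
```

Because the store is sparse and schema-free, asking for a subscript that was never set simply returns `undefined` in this sketch.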
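Multi-dimensional structures
list[“myList”,“firstNode”] == 5
list[“myList”,“lastNode”] == 2
list[“myList”,“nodeCounter”] == 5
list[“myList”,“node”,2,“previousNode”] == 4
list[“myList”,“node”,2,“value”] == “John”
list[“myList”,“node”,4,“nextNode”] == 2
list[“myList”,“node”,4,“previousNode”] == 5
list[“myList”,“node”,4,“value”] == “George”
list[“myList”,“node”,5,“nextNode”] == 4
list[“myList”,“node”,5,“value”] == “Rob”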
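As a rough illustration of how such a structure might be walked, the sketch below loads the linked list from this slide into a toy in-memory map and follows the “nextNode” pointers from “firstNode”. The `setNode`, `getNode` and `listValues` helpers are hypothetical and are not part of the Globals API.

```javascript
// Toy in-memory stand-in for global storage (hypothetical helpers, not the Globals API).
const listNodes = new Map();
const nodeKey = (name, subscripts) => JSON.stringify([name, ...subscripts]);
const setNode = (name, subscripts, value) => listNodes.set(nodeKey(name, subscripts), value);
const getNode = (name, subscripts) => listNodes.get(nodeKey(name, subscripts));

// The linked list from the slide: 5 ("Rob") -> 4 ("George") -> 2 ("John").
setNode('list', ['myList', 'firstNode'], 5);
setNode('list', ['myList', 'lastNode'], 2);
setNode('list', ['myList', 'nodeCounter'], 5);
setNode('list', ['myList', 'node', 5, 'value'], 'Rob');
setNode('list', ['myList', 'node', 5, 'nextNode'], 4);
setNode('list', ['myList', 'node', 4, 'value'], 'George');
setNode('list', ['myList', 'node', 4, 'previousNode'], 5);
setNode('list', ['myList', 'node', 4, 'nextNode'], 2);
setNode('list', ['myList', 'node', 2, 'value'], 'John');
setNode('list', ['myList', 'node', 2, 'previousNode'], 4);

// Follow nextNode pointers until a node has none (sparse: the pointer is simply absent).
function listValues(listName) {
  const values = [];
  let id = getNode('list', [listName, 'firstNode']);
  while (id !== undefined) {
    values.push(getNode('list', [listName, 'node', id, 'value']));
    id = getNode('list', [listName, 'node', id, 'nextNode']);
  }
  return values;
}

console.log(listValues('myList'));
```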
Globals = Universal NoSQL Engine
• Document storage
  – Automatic mapping to/from JSON
• Columnar
• Graph
• Native XML database
• Relational
• Object
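One way to picture the JSON mapping mentioned under document storage: each leaf of a document becomes one global node whose subscripts are the path to that leaf. The sketch below is an assumption about how such a mapping could work, not the mapping Globals itself performs; `toNodes` and `fromNodes` are hypothetical helpers, and arrays are deliberately left out.

```javascript
// Hypothetical helpers illustrating a JSON <-> global-node mapping (not the Globals API).

// Flatten a nested object into { subscripts, value } pairs, one per leaf.
function toNodes(value, subscripts = [], out = []) {
  if (value !== null && typeof value === 'object') {
    for (const [key, child] of Object.entries(value)) {
      toNodes(child, [...subscripts, key], out);
    }
  } else {
    out.push({ subscripts, value: String(value) }); // node values are strings
  }
  return out;
}

// Rebuild the nested object from the flattened nodes.
function fromNodes(nodes) {
  const root = {};
  for (const { subscripts, value } of nodes) {
    let target = root;
    for (const s of subscripts.slice(0, -1)) {
      if (target[s] === undefined) target[s] = {};
      target = target[s];
    }
    target[subscripts[subscripts.length - 1]] = value;
  }
  return root;
}

const patient = { name: 'Rob', address: { city: 'London', country: 'UK' } };
const flat = toNodes(patient);
console.log(flat);
console.log(fromNodes(flat));
```

Since node values are strings, numbers and booleans would come back as strings in this sketch; a fuller mapping would need a convention for types and arrays.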
Global storage as a db technology
• Low-level storage engine
• No built-in indexing
  – You create/define the indices you need
• Schema-free
  – Database behaviour defined in your application logic
• No built-in query language
  – APIs for node hierarchy traversal
• All the basic building blocks for creating your own tailored NoSQL database
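To make “you create the indices you need” concrete, here is a minimal sketch in which the application writes an index global alongside the data global. The global names (`person`, `personIndex`) and the helper functions are hypothetical, and the toy store scans all keys where a real engine would traverse only the index subtree.

```javascript
// Toy in-memory store (hypothetical helpers, not the Globals API).
const indexedNodes = new Map();
const indexedKey = (name, subscripts) => JSON.stringify([name, ...subscripts]);
const setIndexed = (name, subscripts, value) =>
  indexedNodes.set(indexedKey(name, subscripts), value);

// Application logic maintains the index: every write to the data global
// is paired with a write to an index global keyed by the searchable value.
function addPerson(id, lastName) {
  setIndexed('person', [id, 'lastName'], lastName);
  setIndexed('personIndex', ['lastName', lastName, id], '');
}

// Look up ids through the index. This toy version scans every key.
function findIdsByLastName(lastName) {
  const ids = [];
  for (const rawKey of indexedNodes.keys()) {
    const [name, ...subs] = JSON.parse(rawKey);
    if (name === 'personIndex' && subs[0] === 'lastName' && subs[1] === lastName) {
      ids.push(subs[2]);
    }
  }
  return ids;
}

addPerson(1, 'Tweed');
addPerson(2, 'James');
addPerson(3, 'Tweed');

console.log(findIdsByLastName('Tweed'));
```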
Why bother?
• Lots of Open Source NoSQL databases for Node.js:
  – Redis
  – CouchDB
  – MongoDB
  – Riak
  – etc
Globals Node.js APIs
• In-process
  – No network layer
• API style:
  – Asynchronous
  – Synchronous
Why Globals? My view:
• Ridiculously high performance
• One single multi-purpose database
• Synchronous coding
Objection, m’lud!
In-process database
• That sounds like too intimate a relationship
  – A problem in the database will bring down the Node.js process
• On the other hand…
  – No network bottleneck
  – Potential for very high performance
Synchronous APIs
Gospel according to St. Async of the Node
Introducing Q-Oper8
https://github.com/robtweed/Q-Oper8
Q-Oper8 architecture
• Components:
  – Main Node.js server process
  – Queue of request objects
  – Pre-spawned child processes
• Request life cycle:
  – Incoming request
  – If necessary, put it on the queue
  – Queue processing triggered: if a child process is free, send it the first request
  – Child process unavailable
  – Process request
  – Finished: send back result object
  – Process result and send response
  – Child process available
  – Queue processing triggered
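The life cycle above can be sketched as a small dispatcher. This is not Q-Oper8's code or API: `ToyDispatcher` and its methods are hypothetical, and the “child processes” are plain objects rather than real spawned processes, so that the queueing logic can be followed step by step.

```javascript
// Simplified model of the queue/dispatch life cycle (hypothetical, not Q-Oper8's API).
class ToyDispatcher {
  constructor(workerCount, onResponse) {
    this.queue = [];
    // Stand-ins for pre-spawned child processes.
    this.workers = Array.from({ length: workerCount }, (_, id) => ({
      id,
      busy: false,
      current: null,
    }));
    this.onResponse = onResponse;
  }

  // Incoming request: put it on the queue and trigger queue processing.
  handleRequest(request) {
    this.queue.push(request);
    this.processQueue();
  }

  // Hand queued requests to free workers; each worker handles one at a time.
  processQueue() {
    for (const worker of this.workers) {
      if (this.queue.length === 0) return;
      if (!worker.busy) {
        worker.busy = true; // child process unavailable
        worker.current = this.queue.shift();
      }
    }
  }

  // A worker has finished: send the response, free the worker, re-trigger the queue.
  finish(workerId, result) {
    const worker = this.workers[workerId];
    const request = worker.current;
    worker.busy = false; // child process available
    worker.current = null;
    this.onResponse(request, result);
    this.processQueue();
  }
}

const responses = [];
const dispatcher = new ToyDispatcher(2, (request, result) => responses.push([request, result]));

dispatcher.handleRequest('a'); // goes to worker 0
dispatcher.handleRequest('b'); // goes to worker 1
dispatcher.handleRequest('c'); // both busy, so it waits on the queue
dispatcher.finish(0, 'A done'); // worker 0 frees up and picks up 'c'

console.log(responses, dispatcher.workers);
```

In a real setup the `finish` step would be driven by a message coming back from the child process; here it is called by hand.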
Q-Oper8 benefits
• Child processes can use blocking I/O
  – Only handle one request at a time
  – Nobody else to block
• Main server process uses kosher async logic
• Main process is isolated from activity in child processes:
  – Blocking I/O
  – Synchronous coding
  – Heavy computation
  – In-process database
• Allows Node.js to use multiple CPU cores
Q-Oper8 + Globals
• Main Node.js server process: pure async activity here
• Queue of request objects
• Pre-spawned child processes: each uses the Globalsdb APIs
Q-Oper8 + Globals Performance
• My test machine:
  – HP Proliant ML115G Server
    • 4-core AMD Opteron 2.1GHz CPU
    • 8 GB memory
    • 640 GB 7200 RPM SATA hard drive (Western Digital Caviar Blue)
  – Ubuntu Linux 10.10 Server (64-bit)
  – Node.js 0.4.0
  – Globals DB
  – Under £200-worth of machine
Q-Oper8 No-op test
• Passing simple message each way
  – No other processing
• Determine “steady state”:
  – Add requests to queue as fast as child processes can process them
• Optimum performance with 3 child processes
• 18,350 per second
Q-Oper8 + Globals
• Increasing numbers of Set commands per request
  – Each Set creates one new global node
• Different APIs:
  – Async parallel
  – Async nested
  – Sync
• Measured steady-state maximum rate
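The three coding styles compared in this test might look roughly like the sketch below. It uses a toy in-memory store with a hypothetical synchronous `setSync` and callback-based `setAsync`; these are not the Globals API calls, and the sketch says nothing about relative speed.

```javascript
// Toy store with a sync and an async setter (hypothetical, not the Globals API).
const store = new Map();

function setSync(key, value) {
  store.set(key, value);
}

function setAsync(key, value, callback) {
  // Completes on a later turn of the event loop, like a real async API would.
  setImmediate(() => {
    store.set(key, value);
    callback(null);
  });
}

// Sync: one Set after another, straight-line code.
function syncSets(n) {
  for (let i = 0; i < n; i++) setSync('sync:' + i, i);
}

// Async parallel: fire all Sets at once, count completions.
function parallelSets(n, done) {
  if (n === 0) return done();
  let pending = n;
  for (let i = 0; i < n; i++) {
    setAsync('parallel:' + i, i, () => {
      pending -= 1;
      if (pending === 0) done();
    });
  }
}

// Async nested: each Set is started from the previous Set's callback.
function nestedSets(n, done) {
  function next(i) {
    if (i === n) return done();
    setAsync('nested:' + i, i, () => next(i + 1));
  }
  next(0);
}

syncSets(3);
parallelSets(3, () => console.log('parallel sets finished'));
nestedSets(3, () => console.log('nested sets finished'));
```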
[Chart: requests/sec against number of Global Sets per request (1, 2, 5) for the Sync, Async parallel and Async nested APIs]
[Chart: requests/sec (Sync) against number of Global Sets per request (100, 200, 500, 1000)]
[Chart: global sets/sec (Sync) against number of Global Sets per request (100, 200, 500, 1000)]
Conclusions
• Globals’ synchronous APIs are more than twice as fast as its asynchronous ones
• The more Globals activity you do per request, the better the db performance
• On my small server, performance maxed at 190,000 global node sets/sec
  – Redis-benchmark on same server: 88,000 Sets/sec
Globals in Use
• ewdDOM module
  – Persistent XML DOM implementation
  – Uses Globals for storage
  – DOMs created and stored in globals
  – Access to DOMs is via synchronous APIs
  – APIs hit the data in situ, not an in-memory copy
  – https://github.com/robtweed/ewdDOM
ewdDOM: Example
var document = ewdDOM.getDocument('rob');
var node1 = document.createElement("testElement");
document.getElementById('myNode').appendChild(node1);
document.getElementById('myNewNode').text = 'Some new text';
document.output();
• Synchronous logic allows full use of OO syntax
• Each level in the “dot syntax” represents a number of global node manipulations, with result passed to its child method
• Not feasible using async logic with call-backs
Globals in Use
• M/DB-g
  – Emulation of Amazon SimpleDB
  – 100% API-compatible
  – Development history:
    • Pure global database native-language implementation (Caché and GT.M)
    • Node.js + async access to GT.M
      – Painful! Incomplete
    • Node.js + sync access to Globals
      – Easy! Line by line copy of original implementation
Sync v Async
• Async perfect for event-handling and other I/O
• Globals makes sync coding possible for database access/manipulation
  – Higher performance
  – Easier and quicker to write
  – Much simpler to maintain
  – Allows creation of higher-level OO APIs
  – Node.js for business/enterprise applications becomes feasible to consider
    • Developer and maintenance time = the primary IT costs
Globals: Conclusion
• Significance:
  – Extremely high performance database for use with Node.js
  – Very adaptable database engine (universal NoSQL engine)
    • One database for all needs
    • “virtual storage” for Node.js?
  – Possibility of synchronous database coding
Appendix
Macbook Air benchmark
• OS X Lion
• Ubuntu 10.10 VM + Fusion 3
• 1 CPU + 512 MB memory:
  – 1 Q-Oper8 child process
  – Max global sets: ~ 38,000 / sec
• 2 CPU + 512 MB memory:
  – 2 Q-Oper8 child processes
  – Max global sets: ~ 77,000 / sec
ewdDOM Benchmark
• Create small DOM programmatically
  – 177 X Write:
    • 164 X set
    • 13 X kill
  – 322 X Read:
    • 110 X get
    • 212 X data
• 1000 requests put on Q-Oper8 queue
• 3.5 seconds to process
  – 92,000 reads/sec + 50,570 writes/sec
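The per-second figures follow directly from the counts on this slide; the short calculation below reproduces them. The variable names are purely illustrative.

```javascript
// Reproduce the ewdDOM benchmark rates from the per-request operation counts.
const requests = 1000;
const seconds = 3.5;

const readsPerRequest = 110 + 212; // get + data = 322
const writesPerRequest = 164 + 13; // set + kill = 177

const readsPerSec = (readsPerRequest * requests) / seconds;
const writesPerSec = (writesPerRequest * requests) / seconds;

console.log(Math.round(readsPerSec), Math.round(writesPerSec));
```

The reads work out to exactly 92,000 per second, and the writes to roughly 50,571 per second, which the slide gives as 50,570.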