1
Herald: Achieving a Global Event Notification Service
Luis Felipe Cabrera, Michael B. Jones, Marvin Theimer
Microsoft Research
2
Global Event Notification Services
• Communication via event notification (also called publish/subscribe) is well-suited for loosely-coupled eCommerce applications, as well as Internet-scale distributed applications (e.g. instant messaging and multi-player games).
• General event notification systems currently:
– scale to tens of thousands of clients,
– do not have global reach.
3
Internet-scale Issues
• Scaling requirements are millions and billions, perhaps more.
• There will (probably) not be a single organization that owns the entire event notification infrastructure. Hence a federated design is required.
• Global reach implies that failures and network partitions will be common-place.
4
Focus on the Basic Distributed Systems Primitives
• Focus on the scalability of basic message delivery and distributed state management capabilities.
• Employ a very simple message-oriented design and assume – until proven otherwise – that richer event notification semantics can be layered on top.
5
Herald Event Notification Model
[Diagram: a Creator, a Publisher, and a Subscriber interacting with a Rendezvous Point inside the Herald Service]
1: Create Rendezvous Point
2: Subscribe
3: Publish
4: Notify
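The four numbered steps above can be sketched as a tiny in-process model. This is only an illustration under stated assumptions: the class name `HeraldService` and the methods `create_rendezvous_point`, `subscribe`, and `publish` are hypothetical names, not Herald's actual interface, and everything runs in one process with no distribution, replication, or security.

```python
from typing import Callable, Dict, List

Callback = Callable[[str, str], None]  # (rendezvous point name, event) -> None


class HeraldService:
    """Hypothetical single-process stand-in for the Herald service."""

    def __init__(self) -> None:
        # Maps each rendezvous point (RP) name to its subscriber callbacks.
        self._subscribers: Dict[str, List[Callback]] = {}

    def create_rendezvous_point(self, name: str) -> None:
        # Step 1: a creator makes a rendezvous point.
        self._subscribers.setdefault(name, [])

    def subscribe(self, name: str, callback: Callback) -> None:
        # Step 2: a subscriber registers interest in an existing RP.
        self._subscribers[name].append(callback)

    def publish(self, name: str, event: str) -> int:
        # Step 3: a publisher sends an event to the RP.
        # Step 4: the service notifies every current subscriber.
        for callback in self._subscribers[name]:
            callback(name, event)
        return len(self._subscribers[name])


received: List[str] = []
herald = HeraldService()
herald.create_rendezvous_point("rp1")
herald.subscribe("rp1", lambda rp, event: received.append(f"{rp}:{event}"))
delivered = herald.publish("rp1", "hello")
```

Note that the publisher never learns who the subscribers are; it only names the rendezvous point, which is what keeps the two sides loosely coupled.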
6
Design Criteria
• The “usual” criteria:
– Scalability
– Resilience
– Self-administration
– Timeliness
• Additional criteria:
– Heterogeneous federation
– Security
– Support for disconnection
– Partitioned operation
7
Scalability
• 10^11 Rendezvous Points (RPs)
• 10^11 publishers & subscribers in aggregate
• 10^10 publishers & subscribers per RP
• 10^10 federation members
• 10^2 events/sec/RP
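To get a feel for these targets, the short calculation below multiplies them out. It is a back-of-the-envelope sketch, not a figure from the talk: it assumes the numbers on this slide are powers of ten (10^11 RPs, 10^10 clients per RP, 10^2 events/sec/RP), that every RP runs at its target rate at the same time, and that all clients of one RP are subscribers. The constant and variable names are illustrative.

```python
# Targets from the slide, read as powers of ten (an assumption about the
# original superscripts).
RENDEZVOUS_POINTS = 10**11
CLIENTS_PER_RP = 10**10
EVENTS_PER_SEC_PER_RP = 10**2

# Upper bound on the system-wide publish rate if every RP ran at its
# target rate simultaneously.
peak_events_per_sec = RENDEZVOUS_POINTS * EVENTS_PER_SEC_PER_RP

# Upper bound on notification fan-out for one maximally subscribed RP,
# treating all of its clients as subscribers.
peak_notifications_per_sec_one_rp = CLIENTS_PER_RP * EVENTS_PER_SEC_PER_RP

print(f"{peak_events_per_sec:.0e} events/sec system-wide (upper bound)")
print(f"{peak_notifications_per_sec_one_rp:.0e} notifications/sec for one RP (upper bound)")
```

Both products are worst-case bounds under the assumptions above; they indicate why a single server per RP cannot be assumed.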
8
Resilience
• “Fail last, fail least” semantics.
• Correct operation in the presence of malicious/corrupt participants.
9
Self-administration
• System decides where to place state and how to propagate information about state changes.
• System dynamically adapts to changing loads and the presence of faults and network partitions.
• No manual tuning.
11
Heterogeneous Federation
• Federation of machines within cooperating but mutually suspicious domains of trust.
• Federated parties may include both small and large domains.
12
Security
• Support restricted access to Herald facilities.
• Support concepts such as groups and roles.
13
Support for Disconnection
• Eventual delivery to disconnected subscribers.
• Event histories to allow a posteriori examination of the past.
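One way to picture both bullets is an append-only history per rendezvous point that a reconnecting subscriber replays from the last position it saw. The sketch below is an assumption-laden illustration: `EventHistory`, `append`, and `since` are hypothetical names, and retention limits (how long history is kept) are not modeled.

```python
from typing import List, Tuple


class EventHistory:
    """Hypothetical append-only history kept for one rendezvous point."""

    def __init__(self) -> None:
        self._events: List[str] = []

    def append(self, event: str) -> int:
        # Returns the sequence number (list index) assigned to the event.
        self._events.append(event)
        return len(self._events) - 1

    def since(self, next_seq: int) -> List[Tuple[int, str]]:
        # Everything at or after next_seq, paired with sequence numbers.
        return list(enumerate(self._events))[next_seq:]


history = EventHistory()
history.append("e0")
next_seq = 1                      # subscriber saw e0, then disconnected
history.append("e1")              # published while the subscriber was away
history.append("e2")
missed = history.since(next_seq)  # replayed when the subscriber reconnects
```

The same `since(0)` call serves the second bullet: any party allowed to read the history can examine past events after the fact.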
14
Partitioned Operation
• Continued operation on both sides of a network partition.
• Eventual (out-of-order) delivery after partition healing.
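The two bullets above can be illustrated with two replicas of one rendezvous point that keep accepting events while partitioned and exchange what they missed once the partition heals. This is a minimal sketch under stated assumptions, not Herald's mechanism: the `Replica` class, its methods, and the `(replica name, local sequence number)` event identifiers are all hypothetical.

```python
from typing import Dict, List, Tuple

EventId = Tuple[str, int]  # (replica name, local sequence number)


class Replica:
    """Hypothetical replica of one rendezvous point."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.log: Dict[EventId, str] = {}
        self.delivered: List[str] = []  # order seen by local subscribers
        self._next_seq = 0

    def publish(self, payload: str) -> None:
        # Accept a new event locally, even while partitioned.
        event_id = (self.name, self._next_seq)
        self._next_seq += 1
        self._accept(event_id, payload)

    def _accept(self, event_id: EventId, payload: str) -> None:
        if event_id not in self.log:  # ignore events already seen
            self.log[event_id] = payload
            self.delivered.append(payload)

    def sync_from(self, other: "Replica") -> None:
        # After healing: pull every event the other side has logged.
        for event_id, payload in sorted(other.log.items()):
            self._accept(event_id, payload)


left, right = Replica("L1"), Replica("L2")
left.publish("a")      # partition: each side keeps operating
right.publish("b")
left.publish("c")
left.sync_from(right)  # partition heals: exchange missed events
right.sync_from(left)
```

After the exchange both replicas have delivered the same set of events, but `left.delivered` and `right.delivered` list them in different orders, which is the out-of-order delivery the slide accepts.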
15
Non-Goals
• What’s the “best” way to do:
– Naming
– Filtering
– Complex subscription queries
• In-order delivery (except as layered on top)
16
Applying Lessons of the Internet and Web
• Assume things are broken:
– Mutual suspicion and no dependence on correct behavior by others.
• Don’t try to fix everything:
– All distributed state is maintained in a weakly-consistent soft-state manner and is aged.
– All distributed state is incomplete and may be inaccurate.
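Soft state that "is aged" can be pictured as a table whose entries expire unless they are refreshed. The sketch below is one common way to do this, assuming a fixed time-to-live per entry; the `SoftStateTable` class, its methods, and the keys and values shown are hypothetical, and time is passed in explicitly so the example is deterministic.

```python
from typing import Dict, Optional


class SoftStateTable:
    """Hypothetical soft-state table: entries vanish unless refreshed."""

    def __init__(self, ttl: float) -> None:
        self._ttl = ttl
        self._value: Dict[str, str] = {}
        self._expiry: Dict[str, float] = {}

    def refresh(self, key: str, value: str, now: float) -> None:
        # Installing or refreshing an entry restarts its aging clock.
        self._value[key] = value
        self._expiry[key] = now + self._ttl

    def lookup(self, key: str, now: float) -> Optional[str]:
        expiry = self._expiry.get(key)
        if expiry is None or now >= expiry:
            # Unknown or aged out: forget it and report nothing.
            self._value.pop(key, None)
            self._expiry.pop(key, None)
            return None
        return self._value[key]


table = SoftStateTable(ttl=30.0)
table.refresh("sub1", "RP1@L2", now=0.0)
fresh = table.lookup("sub1", now=10.0)        # still within the first TTL
table.refresh("sub1", "RP1@L2", now=25.0)     # refresh extends the lifetime
still_fresh = table.lookup("sub1", now=50.0)  # alive thanks to the refresh
aged_out = table.lookup("sub1", now=60.0)     # no refresh since 25.0: gone
```

Because a missing entry is a normal outcome, callers must already cope with incomplete or stale answers, which matches the last bullet above.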
17
Design Overview
• We think we only need these mechanisms:
– Replication.
– Overlay distribution networks.
– Time contracts.
– Event histories.
– Administrative rendezvous points.
18
Replication
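[Diagram: Herald servers at locations L1, L2, and L3 (Herald@L1, Herald@L2, Herald@L3) hosting rendezvous point replicas RP1@L1, RP2@L1, RP1@L2, and RP1@L3, with publishers Pub1, Pub2, Pub3 and subscribers Sub1, Sub2, Sub3, Sub4, Sub5]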
19
Overlay Distribution Networks
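[Diagram: Herald servers at locations L1, L2, and L3 (Herald@L1, Herald@L2, Herald@L3) hosting rendezvous point replicas RP1@L1, RP1@L2, and RP1@L3, with publishers Pub1, Pub2 and subscribers Sub1, Sub2, Sub4]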
22
Administrative Rendezvous Points
[Diagram: RP1, the Herald Service, and a Name Service]
1. Subscribe RP1@
2. Notify(change)
23
Engineering & Research Issues
• Baseline scalability numbers
• Dynamic system reconfiguration
• Federation and security
24
Baseline Scalability Numbers
• How scalable are single-node servers and server clusters?
• What are multicast-style delivery systems actually capable of, especially in aggregate?
25
Dynamic System Reconfiguration
• Reconfiguring distributed RP state in response to aggregate workloads and global state changes.
• Dealing with “flash crowd” loads.
• Placement of RP state to minimize the effects of network partitions and disconnection.
• Placement of RP state to enable efficient implementations of higher-level pub/sub semantics.
26
Federation and Security
• Can we define simple, open protocols?
• Will we need heavy-weight mechanisms to deal with malicious/corrupt servers?
• How should anonymity and privacy be dealt with/supported?
27
Related Work
• Non-global event notification systems (Gryphon, Ready, Siena, …)
• Netnews
• P2P systems such as Gnutella and Farsite
• Overlay & multicast networks
• CDNs
• OceanStore
28
Conclusion
• Global event notification is emerging as a key Internet technology.
• Herald is exploring scalability of the basic message and distributed state management aspects of an event notification system:
– Gain engineering experience with scalable pub/sub systems.
– Explore dynamic system reconfiguration.
– Understand the implications of federation and security.