DESCRIPTION
In this talk we give an overview of some integration and extensibility problems that have come up in our projects so far and how they were solved with Magnolia. Examples include customization of the admin interface and workflows, as well as authentication via LDAP / SSO and document management. On the more technical side, we present a system architecture for handling high-volume User Generated Content traffic, e.g. large numbers of page comments submitted in a short time. The purpose of this architecture is to ensure CMS availability during such UGC peak loads. This is followed by an application demo showing an integration of the Google Web Toolkit (GWT) within the Magnolia 4.3 admin interface, here used for moderating page comments. We'd like to share our experiences with this approach and provide some background on its motivation and applicability.
Magnolia Conference, Technical Track | Basel, 16. September 2010
Empowering Magnolia for Enterprise Use Cases
About us
Sebastian Frick
Technical Project Manager
Jörg von Frantzius
System Architect
Some facts about Aperto
Internet agency in Berlin
Offering Conception, Design,
Development, Online Marketing
Building projects with Magnolia since
2006 for clients like Siemens,
Bertelsmann, EADS, INSM, Frankfurt
School and others
Contributed frontend and concept for
Standard Templating Kit
What we are talking about today
Part 1: workflow-specific enhancements
Part 2: approach to dealing with user generated content
Part 1: workflow-specific enhancements
Workflows - out of box features
Typical business requirements
Customization examples
Best practices / recommendations
Workflows - out of box features
Standard 4-eyes-workflow for
publishing process
Sending of E-mail notifications
Management of multiple workflows
Time-based de-/activation
Commenting
Inbox for editors for managing
workflow items
Standard 4-eyes workflow
Group „Editors“: activates content
Group „Publishers“: approves content (content is published) or rejects content
Mapping Configuration in AdminCentral
Workflows depending on paths in CMS
Different repositories can share one
workflow or run their own one
OpenWFE – XML definition
XML contains
Process-definitions
Participants (e.g. group, role, user)
Fields (variables)
Conditional expressions (if, while,
loop)
Many more expressions or patterns
(OpenWFE manual)
Some tools exist, but...
OpenWFE IDE
DroFlo – visual editor
Neither is mature or Magnolia-specific enough:
modelling a workflow usually takes
a good text editor with syntax
highlighting and a developer.
Typical business requirements > workflow process
Enhanced number of steps or states
(e.g. 8-eyes-workflow)
Automatic or manual selection of next
receiver
Non-linear pattern (e.g. one item
assigned to different groups at the
same time)
Workflow engine for different scenarios
than publishing (e.g. internal
processes)
Manual selection of receiver
The activation dialog is extensible like any other Magnolia dialog
Variables set in the dialog can be retrieved via OpenWFE elements
Automatic selection of receiver
Possible scenarios
by language of content
by section in site tree
by role
Dynamic selection of receiver
Make use of commands or custom functions for „outsourcing“ business logic
to custom Java classes or external services
Method added in OpenWFE's function-map.xml
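The automatic selection scenarios above (by language, by section in the site tree) can be sketched as a small Java helper that a command or custom function might delegate to. This is a minimal sketch, not the implementation shown in the talk: the class name, the group names and the path prefixes are all hypothetical, and no Magnolia or OpenWFE API is used.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: picks the workflow group that should receive an item.
// Group names and path prefixes are illustrative assumptions.
class ReceiverResolver {
    // Insertion order matters: register more specific path prefixes first.
    private final Map<String, String> groupBySection = new LinkedHashMap<>();
    private final Map<String, String> groupByLanguage = new LinkedHashMap<>();
    private final String fallbackGroup;

    ReceiverResolver(String fallbackGroup) {
        this.fallbackGroup = fallbackGroup;
    }

    ReceiverResolver section(String pathPrefix, String group) {
        groupBySection.put(pathPrefix, group);
        return this;
    }

    ReceiverResolver language(String language, String group) {
        groupByLanguage.put(language, group);
        return this;
    }

    // Section rules win over language rules; otherwise use the fallback group.
    String resolve(String contentPath, String language) {
        for (Map.Entry<String, String> rule : groupBySection.entrySet()) {
            if (contentPath.startsWith(rule.getKey())) {
                return rule.getValue();
            }
        }
        String byLanguage = groupByLanguage.get(language);
        return byLanguage != null ? byLanguage : fallbackGroup;
    }
}
```

Keeping the rules in one Java class means the XML process definition only needs to ask for "the next receiver" and stays free of business logic.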
Typical business requirements – workflow usability
Display of current number of workflow
items
Display of current process status &
participant
Better traceability: workflow history
For editors: lack of workflow information in standard view
Custom column providing additional workflow information
Sitetree-Implementation can be
exchanged by configuration
Adding additional columns is quite
easy (via Java)
Meta-information can be retrieved from
WorkflowItem-Object
Current number of items in inbox
Example for dynamic display of current
workflow items via AJAX based polling
AdminCentral frontend is extensible,
but we're looking forward to the new
MagnoliaUI
Inbox view: enhancing workflow item dialog
Enhancing workflow-dialog by history tab
History info of a workflow item can be
built from attributes available in
WorkflowItem-Object
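As a rough illustration of building a history view from item attributes, the sketch below turns a flat attribute map into display lines. The attribute keys ("step.1.actor" and so on) are purely hypothetical and are not the real attribute names of Magnolia's workflow item; a plain Map stands in for the item.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: attribute keys such as "step.1.actor" are assumptions,
// not the real attribute names of Magnolia's workflow item.
class WorkflowHistory {
    // Reads numbered step attributes until the first missing step.
    static List<String> fromAttributes(Map<String, String> attributes) {
        List<String> lines = new ArrayList<>();
        for (int i = 1; attributes.containsKey("step." + i + ".actor"); i++) {
            String actor = attributes.get("step." + i + ".actor");
            String action = attributes.getOrDefault("step." + i + ".action", "unknown");
            String comment = attributes.getOrDefault("step." + i + ".comment", "");
            String line = i + ". " + actor + " " + action;
            lines.add(comment.isEmpty() ? line : line + ": " + comment);
        }
        return lines;
    }
}
```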
Enhancing workflow-dialog by references tab
Since content on one page can be
located in several repositories,
showing its references may be helpful
Relations (UUID) and activation state
can be retrieved via Magnolia standard
functions
Best practices / recommendations
Best practices / recommendations #1
Specification via state diagram
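A state diagram translates almost directly into a transition table, which can also serve as a test oracle later. The following is a minimal sketch for the standard 4-eyes publishing workflow; the state names are illustrative assumptions, and a real specification would list its own states and transitions.

```java
import java.util.EnumMap;
import java.util.EnumSet;
import java.util.Map;
import java.util.Set;

// Sketch of a 4-eyes publishing workflow as an explicit transition table.
// State names are illustrative; a real specification would list its own.
enum PublishState {
    DRAFT, AWAITING_APPROVAL, REJECTED, PUBLISHED;

    private static final Map<PublishState, Set<PublishState>> ALLOWED =
            new EnumMap<>(PublishState.class);

    static {
        ALLOWED.put(DRAFT, EnumSet.of(AWAITING_APPROVAL));               // editor activates
        ALLOWED.put(AWAITING_APPROVAL, EnumSet.of(PUBLISHED, REJECTED)); // publisher decides
        ALLOWED.put(REJECTED, EnumSet.of(AWAITING_APPROVAL));            // editor resubmits
        ALLOWED.put(PUBLISHED, EnumSet.noneOf(PublishState.class));      // end state
    }

    // True if the diagram contains an edge from this state to next.
    boolean canMoveTo(PublishState next) {
        return ALLOWED.get(this).contains(next);
    }
}
```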
Best practices / recommendations #2
Identify patterns and conditions
Check OpenWFE support vs. custom implementation
www.workflowpatterns.org for evaluation and comparison
Flash-based animations and descriptions of patterns
Best practices / recommendations #3
Avoid redundancies in workflow definitions whenever possible
Make use of Java-based commands (easier to maintain)
Best practices / recommendations #4
If groups are used for determining workflow participants:
don't add ACLs to workflow groups
use separate groups instead
define a proper naming convention
Best practices / recommendations #5
Don't underestimate testing efforts
Set up a testing plan
Have an idea of how to monitor single work items before
the development phase
Iterative development and testing
Regular acceptance testing; adjustments will usually be necessary
Links
Evaluation matrix of workflow engines
Magnolia Workflow introduction
Home of OpenWFE
Part 2: approach to dealing with user generated content
2.1. UGC: what's the problem?
2.2. General solution + 2 implementation approaches
2.3. GWT in the admin central
34 Basel | 10.09.2009 | Magnolia Conference | Technical Track
Client's UGC requirements
(UGC = User Generated Content, e.g. page comments)
Client's website has a page-commenting feature
At peak load times, thousands of users want to post their page comments
within a couple of minutes
Client's requirement:
Sustained content delivery during UGC peak loads!
But … Magnolia does scale just great!
So, what's the problem?
The problem: UGC POST requests differ from content requests
Content requests
can be satisfied from cache, i.e. fast response time per request
no bottleneck, i.e. performance scales linearly with number of servers
network bandwidth can be maxed out, given a
good caching hierarchy and
sufficient hardware sizing
Not the case with UGC POST requests…
The problem: UGC POSTs have a much larger performance impact!
UGC POST requests
cannot be satisfied from cache!
because they require a DB insert
requests take orders of magnitude longer,
meanwhile blocking your HTTP worker threads
system can take fewer of these requests simultaneously
before becoming unavailable
DB will become the bottleneck at some point
UGC load can exceed any hardware sizing
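The effect of blocked worker threads can be made concrete with a back-of-the-envelope estimate: if each request holds one worker thread for its whole duration, throughput is bounded by threads divided by time per request. The thread count and latencies used below are illustrative assumptions, not measurements from the project.

```java
// Back-of-the-envelope capacity estimate; all figures are illustrative
// assumptions, not measurements from the project.
class WorkerCapacity {
    // Upper bound on requests per second when every request holds one
    // worker thread for its whole duration.
    static long maxRequestsPerSecond(int workerThreads, long millisPerRequest) {
        return workerThreads * 1000L / millisPerRequest;
    }
}
```

With an assumed 200 worker threads, a cached response taking 5 ms allows up to `maxRequestsPerSecond(200, 5)` = 40000 requests per second, while a POST that waits 500 ms for a DB insert allows only `maxRequestsPerSecond(200, 500)` = 400.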
Solution (system architecture)
Website availability can only be ensured by
separating content delivery from UGC processing,
through separate operating system processes
UGC processing can run on dedicated hardware if necessary
So it could also be shifted into the cloud
Solution (system architecture): consequences
Consequences of separation:
Even if UGC processes fail (all threads busy):
Magnolia processes happily continue to serve content requests
Worst case only means: the end user will still see the web page contents,
with an additional error message „page commenting currently unavailable“
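The "all threads busy" case can be sketched with a bounded worker pool that rejects new comment submissions instead of queuing them indefinitely. This is a minimal sketch using only java.util.concurrent; the class name and the pool and queue sizes are hypothetical, and in the architecture described here this logic would live in the separate UGC process, not in Magnolia.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Sketch of a comment endpoint with a bounded worker pool. Pool and queue
// sizes are hypothetical; the message text is taken from the slide.
class CommentGateway {
    static final String UNAVAILABLE = "page commenting currently unavailable";

    private final ThreadPoolExecutor pool;

    CommentGateway(int threads, int queueSize) {
        // The default AbortPolicy rejects new work once threads and queue are full.
        pool = new ThreadPoolExecutor(threads, threads, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<Runnable>(queueSize));
    }

    // Returns "accepted" or the error message; it never blocks the caller.
    String submit(Runnable storeComment) {
        try {
            pool.execute(storeComment);
            return "accepted";
        } catch (RejectedExecutionException busy) {
            return UNAVAILABLE;
        }
    }

    void shutdown() {
        pool.shutdownNow();
    }
}
```

Because the rejection happens inside the UGC process, the page itself is still delivered and only the commenting widget shows the error message.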
Magnolia approach for page comments
Commenting module
(http://documentation.magnolia-cms.com/modules/commenting.html)
In order to separate processes:
have dedicated Magnolia instances that serve only commenting repository
Must have multiple of these instances for scalability
UGC is not published from author to publish servers,
but still all publish servers must see same content:
must set up shared JCR repository using Jackrabbit clustering
(http://wiki.apache.org/jackrabbit/Clustering)
Decision Magnolia approach vs. custom solution
Problems we saw:
Increased system complexity (set up Jackrabbit clustering, set up dedicated
Magnolia instances with commenting repository)
Our lack of practical experience with clustered Jackrabbit:
How hard is it to set up?
How does it scale, in terms of lock contention?
How does it behave under high load?
What long-term consequences does journaling have (performance, maintenance)?
For us: the incurred complexity and risks outweigh the advantages
Custom architecture chosen for UGC processing
Page comments are rendered in the browser by JavaScript,
using Google Web Toolkit (GWT)
UGC requests are served by separate Tomcat instances,
containing only a single REST webservice
Comments data is stored in clustered RDBMS
Comment moderation implemented with GWT
Proven software stack on server-side,
we know which screws to turn for optimization
Comment moderation: UI requirements
Comment moderation requires lots of tedious manual work,
UI shouldn‘t make a hard job even worse
So UI should have:
High usability
In particular: immediate responsiveness where possible
(i.e. no noticeable delay between click and visual response)
Comment moderation in admin interface with GWT
Solution: Google Web Toolkit (GWT)
(Java translated to JavaScript + great tooling)
suitable for functional UIs (i.e. without pixel-grained styling)
server roundtrips can be minimized:
As much logic as wanted can be executed in browser
(e.g. status message update upon selection change)
As much state as wanted can be held in browser
(e.g. caching of previously shown rows in a paging table)
Economical implementation through development and debugging in Java
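The idea of holding state in the browser, such as caching previously shown rows of a paging table, can be sketched as a small least-recently-used page cache. The sketch is written as plain Java for brevity and is not the demo's code: the class name is hypothetical, and in GWT the loader would be an asynchronous RPC call, whereas here it is a synchronous function.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.IntFunction;

// Sketch of a client-side page cache, written as plain Java for brevity.
// In GWT the loader would be an asynchronous RPC call; here it is synchronous.
class PageCache<T> {
    private final IntFunction<List<T>> loader;
    private final Map<Integer, List<T>> pages;
    private int loads = 0;

    PageCache(final int maxPages, IntFunction<List<T>> loader) {
        this.loader = loader;
        // An access-ordered LinkedHashMap evicts the least recently used page.
        this.pages = new LinkedHashMap<Integer, List<T>>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<Integer, List<T>> eldest) {
                return size() > maxPages;
            }
        };
    }

    // Returns the rows of a page, contacting the server only on a cache miss.
    List<T> page(int index) {
        List<T> rows = pages.get(index);
        if (rows == null) {
            rows = loader.apply(index);
            loads++;
            pages.put(index, rows);
        }
        return rows;
    }

    // Number of times the loader (i.e. the server) had to be called.
    int serverLoads() {
        return loads;
    }
}
```

Paging back to a page that is still cached then needs no server roundtrip, which is what makes the table feel immediately responsive.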
Comment moderation demo
Demo…
For the technically interested:
GWT RPC turned out to be fast and reliable and,
most of all, much easier to program than a custom REST webservice
JPA2 entities are serialized transparently through GWT RPC,
by using net.sf.gilead
Paging table based on org.gwtlib
Following: the big picture…
Thank you for your interest!
Our contacts...
On the web...
http://www.aperto.de
http://blog.aperto.de
http://www.twitter.com/aperto