PoolParty Solutions

http://www.poolparty.biz/


Andreas Blumauer [email protected] 25.02.2013

Contents

• Smart Customer Support Systems

• Enterprise Linked Data Integration

• Vocabulary Management

• Semantic Content Management

• Text Mining of Business News

• Vertical Search Solutions

• Knowledge Bases

• Linked Open Data

• Recommender Systems

• Semantic Search

Smart Customer Support Systems

How can semantic technologies help to make customer support systems more intelligent so that they understand customers' needs better?

Addressed problem

Customer support systems frequently cause disorientation due to the technical terms used and a lack of transparent and easily comprehensible navigation structures. Providers of products and services from various sectors (telecommunications, public administration, law, etc.) use different language and different categories than consumers (or citizens) do. As a result, clients often have to deal with frustrating translation work, which leads to misunderstandings and rising costs in the call center.

Our solution approach

Users benefit from a guidance system which helps them to stay oriented at any point in the support system. The guidance system consists of semantic search facilities such as search filters (faceted search), search refinements, similarity search (see also: recommender systems) and integrated fact boxes which display further details about the search term, which might refer to a product, for example. As a prerequisite for these improvements, a knowledge model has to be created which consists of concepts (e.g. products, technologies, services), their relations and their differing names (including synonyms). This model is based on an open W3C standard called SKOS, which makes the effort future-proof. In some cases it is advisable to split up the thesaurus into (at least) two modules. A semantic layer of this kind helps to translate between the two worlds (supplier/vendor vs. client/customer). While the supplier’s thesaurus still links its concepts to the corresponding parts of the client’s thesaurus, the thesauri can be managed separately from each other.
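The idea of two separately managed thesauri connected by mapping links can be sketched in a few lines of Python. This is a minimal illustration only, not PoolParty code: the Concept class, the cust:/vend: identifiers, all labels and the translate function are invented for the example, and the mapping dictionary merely stands in for SKOS mapping relations such as skos:exactMatch.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class Concept:
    """Minimal stand-in for a SKOS concept (labels only, no hierarchy)."""
    uri: str
    pref_label: str
    alt_labels: Set[str] = field(default_factory=set)

    def labels(self) -> Set[str]:
        # Preferred label and synonyms, lower-cased for matching.
        return {self.pref_label.lower()} | {a.lower() for a in self.alt_labels}


# Hypothetical customer-side thesaurus: everyday wording.
customer_thesaurus = [
    Concept("cust:internet-stick", "internet stick", {"surf stick", "web stick"}),
    Concept("cust:phone-bill", "phone bill", {"monthly bill"}),
]

# Hypothetical vendor-side thesaurus: internal product and service terms.
vendor_thesaurus = {
    "vend:usb-modem": Concept("vend:usb-modem", "USB modem", {"mobile broadband dongle"}),
    "vend:invoice": Concept("vend:invoice", "invoice", {"billing statement"}),
}

# Mapping links between the two modules. Each thesaurus can still be
# edited on its own; only these links connect them.
mappings = {
    "cust:internet-stick": "vend:usb-modem",
    "cust:phone-bill": "vend:invoice",
}


def translate(customer_term: str) -> Optional[str]:
    """Return the vendor's preferred label for a customer term, if mapped."""
    term = customer_term.strip().lower()
    for concept in customer_thesaurus:
        if term in concept.labels():
            target = mappings.get(concept.uri)
            return vendor_thesaurus[target].pref_label if target else None
    return None


print(translate("Surf Stick"))    # USB modem
print(translate("monthly bill"))  # invoice
```

A search front end could use such a lookup to accept the customer's wording while querying content that is indexed with the vendor's terms.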

Results

• Semantic index of content base of the support system

• User-friendly digital guidance system

• Facilities to refine search queries to find answers to specific questions more easily

• Help users to learn quickly: combine search results with facts from other knowledge bases automatically

Used methods, technologies and standards

• PoolParty knowledge modelling approach

• Simple Knowledge Organization System (SKOS)

• PoolParty Thesaurus Server

• PoolParty Extractor

• PoolParty Search

Enterprise Linked Data Integration

How can linked data be used as a more agile and flexible methodology for enterprise data integration?

Addressed problem

Putting all the information which describes a business object such as a product, a customer or a certain technology in one place can make life significantly easier for many people. Unfortunately, the automatic integration of data from various sources can require tremendous effort. Enterprise data is typically organised in such a way that it remains locked up in its own database. Knowledge workers are forced to collect information manually from a series of data silos and to put the pieces together like a puzzle in order to create the basis for a decision-making process. Data integration projects are most often built upon yet another inflexible data structure. Numerous amendments or additions made to the structure or to the semantics of an information component cannot be reflected properly by the integration layer. The result is a landscape of data silos which are scarcely connected to each other. Intelligent linkages happen only in the course of ad hoc processes which are not readily comprehensible.


Our solution approach

Web data, but also enterprise data, is characterized by great structural diversity as well as frequent changes. This poses a great challenge for applications based on that data. We address this problem by using a flexible data model that supports the integration of heterogeneous and volatile data. For data integration we use linked data technologies, which rely on graph-based models. This makes it possible to extend the schema incrementally with further properties and constraints. Linked data is based on open standards, which makes the effort future-proof.
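Why a graph-based model can be extended incrementally is easiest to see in a toy example. The following sketch keeps statements as plain (subject, predicate, object) tuples in a Python set. It is not an RDF store, and every identifier in it (the ex: names, add, topic_page) is an assumption made for illustration.

```python
from collections import defaultdict

# A graph is just a set of (subject, predicate, object) statements.
graph = set()


def add(subject, predicate, obj):
    """Add one statement; no schema has to be declared beforehand."""
    graph.add((subject, predicate, obj))


# Statements taken from a hypothetical product database.
add("ex:product-abc", "ex:name", "Product ABC")
add("ex:product-abc", "ex:madeBy", "ex:acme")

# Later, a hypothetical CRM export contributes a property that the first
# source never had. No table is altered; it is simply one more statement.
add("ex:product-abc", "ex:supportContact", "ex:jane-doe")
add("ex:acme", "ex:name", "ACME Corp.")


def topic_page(subject):
    """Collect everything known about one business object."""
    view = defaultdict(list)
    for s, p, o in graph:
        if s == subject:
            view[p].append(o)
    return dict(view)


page = topic_page("ex:product-abc")
print(sorted(page))  # ['ex:madeBy', 'ex:name', 'ex:supportContact']
```

In a relational integration layer the new support contact would typically require a schema change, whereas here the view over the business object picks it up without further work.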

Results

• 360° views on specific business objects ('topic pages') like products, companies, technologies etc.

• Reports based on queries, some of them complex, which can only be answered by combining data from various sources

• Mashups of unstructured data (e.g. business news, social media) and structured data (e.g. statistics, legacy data)

• Mashups of data from the web (e.g.: open government data) and internal data sources

Used methods, technologies and standards


• Linked data stack

• Semantic web standards (RDF, SKOS, SPARQL etc.)

• Linked data alignment

• Linked data manager

• PoolParty Semantic Integrator


• Large scale RDF triple stores (e.g.: Virtuoso)

Vocabulary Management

How can controlled vocabularies become an easily accessible source of knowledge to link information sources more efficiently?

Addressed problem

The benefits of creating and using vocabularies still seem to fall short of the effort invested. While controlled vocabularies can form the basis of a richer metadata management system, it remains unclear how thesauri or ontologies can also be used as a valuable information source in their own right. Vocabulary management can help to overcome the Babylonian confusion of languages. A thesaurus can be used by knowledge workers as an encyclopedia to better understand unclear, unintelligible or ambiguous terms and phrases which occur in a large proportion of the documents, mails or protocols they have to deal with on a daily basis.


Our solution approach

For (enterprise) vocabularies to become widely accepted, the costs of creating and developing such thesauri and vocabularies have to stay as low as possible. This can be achieved if thesaurus managers are supported by appropriate methods and software tools to produce high-quality semantic metadata built upon open standards. If the enterprise (or domain-specific) thesaurus is built upon W3C's Simple Knowledge Organization System (SKOS), it can also form the core of an organization's knowledge graph to be reused by many other applications. In addition, built-in text analytics, several importers and linked data enrichment tools help to extend the enterprise vocabulary further and further while keeping the effort as low as possible. A comprehensive library of quality and validity checks makes sure that the outcome will meet the highest demands for quality. Putting an enterprise vocabulary in the right place means that it should be reused by other applications as often as possible. Several standard APIs allow quick integration as well as complex queries over the resulting knowledge graph.
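To give an impression of what automated quality and validity checks on a thesaurus can look like, the sketch below implements two simple ones: concepts without a preferred label, and concepts caught in a cycle of 'broader' relations. It is a toy under stated assumptions, not the PoolParty or qSKOS implementation; the dictionary layout, the ex: identifiers and both function names are invented for the example.

```python
# Hypothetical thesaurus: concept URI -> preferred label and broader concepts.
concepts = {
    "ex:device": {"prefLabel": "Device", "broader": []},
    "ex:modem": {"prefLabel": "Modem", "broader": ["ex:device"]},
    "ex:router": {"prefLabel": "", "broader": ["ex:device"]},
    "ex:a": {"prefLabel": "A", "broader": ["ex:b"]},
    "ex:b": {"prefLabel": "B", "broader": ["ex:a"]},
}


def find_missing_labels(concepts):
    """Return concepts that have no non-empty preferred label."""
    return sorted(
        uri for uri, c in concepts.items()
        if not c.get("prefLabel", "").strip()
    )


def find_hierarchy_cycles(concepts):
    """Return concepts that reach themselves by following 'broader' links."""
    in_cycle = set()
    for start in concepts:
        seen = set()
        stack = list(concepts[start].get("broader", []))
        while stack:
            node = stack.pop()
            if node == start:
                in_cycle.add(start)
                break
            if node in seen or node not in concepts:
                continue
            seen.add(node)
            stack.extend(concepts[node].get("broader", []))
    return sorted(in_cycle)


print(find_missing_labels(concepts))    # ['ex:router']
print(find_hierarchy_cycles(concepts))  # ['ex:a', 'ex:b']
```

Checks of this kind are cheap to run after every edit, which is one way to keep the cost of maintaining a vocabulary low.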

Results

• Enterprise vocabularies fully compatible with W3C's semantic web standards (SPARQL, RDF, SKOS)

• Ready to be used within a linked data enterprise architecture

• Highly comfortable thesaurus editor, fully web-based with hundreds of features

• Importers for legacy data sources

• Integrations with frequently used enterprise systems like Sharepoint, Confluence or Drupal

• Facilities to enrich thesauri with terms from document collections and linked open data

Used methods, technologies and standards




• PoolParty Knowledge Modeling Approach

• Linked Data enrichment

• Data importers and text analytics

• Thesaurus Quality and Validity Checker (qSKOS)

Semantic Content Management

How can linked data help to establish a metadata layer across systems to link content from multiple sources?

Addressed problem

Managing content in a CMS is a cost-intensive task. Taking care of metadata, although an integral part of professional content management, is likely to be neglected. Using referenceable metadata on top of our content is key to increasing the value of such cost-intensive assets.


Our solution approach

Text analytics based upon controlled vocabularies can help to keep the cost of managing metadata in a CMS as low as possible. Annotating and categorizing content by using thesauri also makes sure that a highly expressive semantic index of our content repositories can be built later on. Automatic text analytics in combination with convenient user dialogues for semi-automatic content tagging can be used to link, categorize and annotate content. Our approach aims to establish a metadata layer outside the actual content management system in order to make integration with other content repositories as easy as possible.
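How a controlled vocabulary can drive tag suggestions is shown in the following minimal sketch. It does nothing more than match thesaurus labels in a text and rank concepts by the number of matches; real text analytics involves considerably more (linguistic normalisation, disambiguation, scoring). The thesaurus content, the ex: identifiers and the function name suggest_tags are assumptions made for this example.

```python
import re
from collections import Counter

# Hypothetical thesaurus: concept URI -> preferred label and synonyms.
thesaurus = {
    "ex:linked-data": ["linked data", "linked open data"],
    "ex:thesaurus": ["thesaurus", "thesauri", "controlled vocabulary"],
    "ex:cms": ["content management system", "CMS"],
}


def suggest_tags(text, thesaurus):
    """Return (concept, match count) pairs, most frequent concept first."""
    counts = Counter()
    for uri, labels in thesaurus.items():
        for label in labels:
            # Whole-word, case-insensitive match of each label.
            pattern = r"\b" + re.escape(label) + r"\b"
            counts[uri] += len(re.findall(pattern, text, flags=re.IGNORECASE))
    return [(uri, n) for uri, n in counts.most_common() if n > 0]


text = (
    "Our CMS stores articles about linked data. "
    "A thesaurus helps the content management system tag them."
)
tags = suggest_tags(text, thesaurus)
print(tags[0])  # ('ex:cms', 2)
```

Because the suggestions are concept identifiers rather than free-text keywords, they can be stored outside the CMS and shared with other content repositories. In a semi-automatic dialogue an editor would confirm or reject them.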



Results

• Automatic document annotation and categorization (XML documents, plain text)

• Semi-automatic tagging dialogues based on tag recommender

• Rule-based named entity recognition

• Sentiment analysis

• Connectors to enterprise linked data repository

Used methods, technologies and standards


• Concept-based annotation


• Natural language processing


Text Mining of Business News

How can semantic technologies help to filter out news items and to put them in a specific context automatically?

Addressed problem

Working as an analyst, researcher, product manager or journalist means that one has to skim through hundreds of news articles per day. On the one hand, the use of social networks and attached reputation systems can help to narrow down the number of relevant sources. On the other hand, an ever increasing amount of information has ended up on our desktops since we have become active members of Twitter, Linkedin or other social media channels. Unstructured information makes up the largest portion of the frequently cited 'big data'. Being able to deal with unstructured information in combination with structured data like statistics or relational databases has become a key ability to succeed in a variety of knowledge-intensive industries.


Our solution approach

Domain-specific text mining becomes more precise when built upon controlled vocabularies. The analysis of large amounts of mainly short documents like business news requires highly performant algorithms built on top of specific knowledge graphs. The outcome of text mining in the context of linked data is a 'web of linked entities' rather than simply a 'semantic document index'. By using linked data based knowledge models at its core, the PoolParty platform is able to combine text mining with graph databases.

Results

• Precise and highly performant text mining for specific domains

• Extraction of highly structured knowledge graphs from semi-structured and unstructured information

• Basis for integrated views over heterogeneous information sources

Used methods, technologies and standards



• Natural Language Processing

• SKOS

Vertical Search Solutions

How can semantic knowledge models contribute to a highly efficient topical search engine?

Addressed problem

Common paradigms of search engine development do not necessarily achieve optimal results when specialized information, embedded in a specific context or process, has to be retrieved. A vertical search engine, in contrast to a general web or enterprise search engine, focuses on a specific knowledge domain. To bring such a topical search engine to its full potential, the underlying index has to be built upon a specific knowledge model. A vertical search engine, as distinct from a general search tool, also makes use of an individual user interface and domain-specific navigational elements. Most importantly, when the search engine covers a clearly defined scope, the use of semantic knowledge models achieves a very good cost-benefit ratio.


Our solution approach

Structured information as well as unstructured text can form the basis for vertical search solutions. By reflecting the knowledge about the search domain by means of a thesaurus, a more precise semantic document index can be built. Using linked data based knowledge graphs for document indexing instead of pure term-based vocabularies makes it possible to further enrich the document base with facts from other knowledge models. Users benefit from richer search results which consist not only of documents but also of facts and figures related to the actual information need. Since a vertical search solution is built around a well-defined scope, it is also advisable to generate and provide specific search assistants like facets or search refinement tools.
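The interplay of a thesaurus-based index and search assistants can be illustrated with a small sketch: a query concept is expanded to all of its narrower concepts, documents annotated with any of them are returned, and a facet (here the document type) is counted over the hits. The thesaurus fragment, the documents and the function names expand and search are hypothetical and serve only as an illustration.

```python
from collections import Counter

# Hypothetical thesaurus fragment: concept -> narrower concepts.
narrower = {
    "renewable energy": ["solar power", "wind power"],
    "solar power": ["photovoltaics"],
}

# Hypothetical semantic index: each document is annotated with concepts
# and carries one facet value (its document type).
documents = [
    {"id": 1, "concepts": {"photovoltaics"}, "type": "report"},
    {"id": 2, "concepts": {"wind power"}, "type": "news"},
    {"id": 3, "concepts": {"nuclear power"}, "type": "news"},
    {"id": 4, "concepts": {"renewable energy"}, "type": "report"},
]


def expand(concept):
    """Return the concept plus all concepts below it in the hierarchy."""
    result, stack = set(), [concept]
    while stack:
        current = stack.pop()
        if current not in result:
            result.add(current)
            stack.extend(narrower.get(current, []))
    return result


def search(concept):
    """Return matching documents and facet counts over their types."""
    wanted = expand(concept)
    hits = [d for d in documents if d["concepts"] & wanted]
    facets = Counter(d["type"] for d in hits)
    return hits, facets


hits, facets = search("renewable energy")
print([d["id"] for d in hits])  # [1, 2, 4]
print(dict(facets))             # {'report': 2, 'news': 1}
```

A purely term-based index would miss documents 1 and 2 for this query, because neither contains the query term itself. The hierarchy in the knowledge model is what connects them to it.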

Results

• Smart search assistants (faceted search etc.)

• Precise search results

• Search application and interfaces customized to the needs of the subject matter experts

• Integrated views on structured and unstructured information alike

Used methods, technologies and standards



• PoolParty Search Server

Knowledge Bases

How can semantic technologies help to make collaborative knowledge bases more accessible to employees?

Addressed problem

Transforming a simple document server into a collaborative knowledge base which serves as a valuable source for knowledge workers in their daily work is not as simple as it seems. On the one hand, collaboration platforms like enterprise wikis are most often the right choice to encourage people to collect ideas for new content or to make knowledge about products and services more accessible. On the other hand, knowledge bases tend to become tattered over time.


Our solution approach

In order to make specific knowledge about business processes, methods or technologies available to as many employees as possible, we combine the best of three worlds: enterprise collaboration software, text mining and controlled vocabularies. This results in solutions which meet the demand for highly dynamic and flexible knowledge bases, yet are stable enough (technically and content-wise) to be used in professional environments. Since the knowledge base is generated around a controlled vocabulary acting as a meta-layer, traditional navigation structures like trees no longer act as a rigid corset which makes traversing graph-like structures impossible.

Semi-automatic tools for linking, categorizing and indexing content are key to preventing a knowledge base from becoming tattered over time. Putting a controlled vocabulary in place which grows in parallel with the content base demands new and more agile patterns of taxonomy or thesaurus management than 'traditional' approaches would provide.

Results

• Linked knowledge objects on top of enterprise collaboration platforms like Confluence or Sharepoint

• Semantic search over knowledge bases

• Automatic content enrichment

Used methods, technologies and standards


• Atlassian Confluence

• Microsoft Sharepoint

• Drupal

• PoolParty PowerTagging

• Semantic Sharepoint

• Semantic Confluence

Linked Open Data

How can open semantic web standards stimulate new ways to distribute and reuse data and information across intraorganisational and extraorganisational boundaries?

Addressed problem

For many organizations the efficient distribution of their data has become a main task. For example, NPOs or NGOs which want to stimulate specific markets can free up their information and make it available and accessible in order to allow new entrants. Publishers have recognized that opening up (parts of) their databases can stimulate the demand for even more information, which ultimately leads to a purchase. Open semantic web standards play a key role in this distribution policy since they allow a high degree of reusability and linking.


Our solution approach

The strict use of semantic web standards, not only as an export format but also as the way to represent data internally, allows us to bring linked data to its full potential. Initial phases of a knowledge graph project might start with the creation of a SKOS thesaurus which is further enriched by facts or ontological statements from other linked data sources. Technically speaking, publishing linked data inside corporate boundaries and publishing linked open data on the (semantic) web are the same task. In both cases data can be accessed programmatically through standard APIs like SPARQL. Data becomes a self-describing digital asset on which semantically enhanced applications or mashups can be built.

Results

• Linked Data Server as part of the PoolParty Thesaurus Server

• Linked Open Data Portals

• Linked Data Manager to retrieve, extract and transform open data automatically and periodically

• Linked Data alignment tools

Used methods, technologies and standards



• Linked Data Manager

• Semantic Web Standards (SKOS, RDF Schema, SPARQL)

• Drupal

• Large scale RDF triple stores (e.g.: Virtuoso)

Recommender Systems

How can semantic technologies help to enrich a DMS or CMS with intelligent functions like recommender systems?

Addressed problem

Given the plethora of information in large document collections or content repositories, the provision of digital assistants can become essential to survive. Who else has been working on a similar document, or on an issue related to the one I am working on right now? Is there a slide deck available which deals with the same questions as the paper I am writing? Typical document or content management systems are still more focussed on workflow management or archiving solutions than on functionalities which help to put content into the context of the actual work step.


Our solution approach

Recommender engines work on top of semantic fingerprints. Each business object (resource) is represented by its semantic metadata, which is a fragment of the overall enterprise knowledge graph. This meta information is used to detect hidden links between objects like persons or documents. Controlled vocabularies based on SKOS and linked data form the backbone for expressing the semantic fingerprint of each resource. Algorithms which calculate the 'similarity' between such graph fragments are used as core elements of the recommendation engine.
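A strongly simplified version of this idea is sketched below. Each fingerprint is reduced to a weighted set of concepts, and cosine similarity is used as one possible similarity measure. Both choices are assumptions made for illustration, since the text above does not prescribe a particular algorithm. The resource names, the ex: identifiers and the function names are invented as well.

```python
import math

# Hypothetical semantic fingerprints: resource -> {concept: weight}.
fingerprints = {
    "paper-draft": {"ex:skos": 3, "ex:text-mining": 2},
    "slide-deck": {"ex:skos": 2, "ex:text-mining": 1, "ex:search": 1},
    "travel-policy": {"ex:expenses": 4},
}


def cosine(a, b):
    """Cosine similarity of two concept-weight dictionaries."""
    shared = set(a) & set(b)
    dot = sum(a[c] * b[c] for c in shared)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recommend(resource, top_n=2):
    """Return the most similar other resources, best match first."""
    target = fingerprints[resource]
    scored = [
        (other, cosine(target, fp))
        for other, fp in fingerprints.items()
        if other != resource
    ]
    scored = [(other, score) for other, score in scored if score > 0]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_n]


print([name for name, _ in recommend("paper-draft")])  # ['slide-deck']
```

The slide deck is recommended for the paper draft because both share concepts, although neither document refers to the other. This is the kind of hidden link the fingerprints are meant to reveal.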



Results

• Content management workflows free of interruptions and media breaks

• Avoid unnecessary overlapping and duplications of work

• Support and stimulate cross-reading in knowledge bases or cross-selling in shop systems

• Enable serendipity effects

Used methods, technologies and standards


• Semantic fingerprints

• Similarity algorithms and machine learning

• SPARQL


• Large scale triple stores (e.g.: Virtuoso)

Semantic Search

How can semantic search (which goes beyond search over documents only) be realized in the context of enterprise information systems?

Addressed problem

Search has become an increasingly important function in most information management systems. Learning from web search engines, most intranet searches have already introduced some useful assistance functions like auto-complete. Semantic search can go far beyond those rather simple features and can help to reduce search times to a minimum while noticeably improving the user experience. Looking at digital assistants like Apple's Siri, it becomes obvious that the role of search systems will become more and more important for the next generations of knowledge bases. Semantic search, and search in general, is still very focused on the idea of retrieving a list of relevant documents, whilst in reality knowledge workers have to find and link information from a huge variety of sources including statistical databases, videos or personnel databases.


Our solution approach

Semantic search in the context of linked data means searching over a knowledge graph, with document search included. This approach makes complex queries possible, e.g.: show me all business news which mention at least one of our suppliers of components used in product ABC. The basis for such complex queries is an enterprise linked data store containing a 'semantic index' of various legacy data sources combined with the knowledge graph plus enrichments from other linked data sources, taxonomies and ontologies.
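The example query can be traced step by step on a toy graph. The sketch below answers it with plain Python over a handful of invented statements. In an actual linked data store the same question would be posed as a SPARQL query. All identifiers (the ex: names, the predicates and the helper functions) are assumptions made for illustration.

```python
# Hypothetical statements from product data, supplier data and a news index.
triples = {
    ("ex:product-abc", "ex:usesComponent", "ex:battery-x"),
    ("ex:product-abc", "ex:usesComponent", "ex:display-y"),
    ("ex:battery-x", "ex:suppliedBy", "ex:volt-corp"),
    ("ex:display-y", "ex:suppliedBy", "ex:pixel-inc"),
    ("ex:other-part", "ex:suppliedBy", "ex:unrelated-ltd"),
    ("ex:news-1", "ex:mentions", "ex:volt-corp"),
    ("ex:news-2", "ex:mentions", "ex:unrelated-ltd"),
    ("ex:news-3", "ex:mentions", "ex:pixel-inc"),
}


def objects(subject, predicate):
    """All objects o with a statement (subject, predicate, o)."""
    return {o for s, p, o in triples if s == subject and p == predicate}


def subjects(predicate, obj):
    """All subjects s with a statement (s, predicate, obj)."""
    return {s for s, p, o in triples if p == predicate and o == obj}


def news_about_suppliers_of(product):
    """News items mentioning a supplier of any component of the product."""
    components = objects(product, "ex:usesComponent")
    suppliers = {
        supplier
        for component in components
        for supplier in objects(component, "ex:suppliedBy")
    }
    return sorted(
        {news for supplier in suppliers
         for news in subjects("ex:mentions", supplier)}
    )


print(news_about_suppliers_of("ex:product-abc"))  # ['ex:news-1', 'ex:news-3']
```

The answer combines three sources that would normally live in separate systems (bill of materials, supplier records, annotated news), which is why a document-only search cannot produce it.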

Results

• Search engine which provides means for complex queries

• Queries over various kinds of information (documents, relational databases, taxonomies, etc.)

• Personalized search

Used methods, technologies and standards



• SPARQL

• Large scale triple stores (e.g.: Virtuoso)

This work is licensed under a Creative Commons Attribution-NoDerivs 3.0 Unported License.

http://creativecommons.org/licenses/by-nd/3.0/deed.en_US