DESCRIPTION
This presentation was given by Search Technologies' CEO Kamran Khan at the November 2013 Enterprise Search Summit / KMWorld in Washington DC. He discussed how modern search engines are being combined with powerful independent content processing pipelines and the distributed processing technologies of big data to form a new and exciting enterprise search architecture, delivering results that in the past were available only to the biggest companies with the deepest pockets. For more information visit http://www.searchtechnologies.com/.
A Big Data Architecture for Search
Kamran Khan, CEO
The expert in the search space
Search Technologies Overview
San Diego, CA
San Jose, CR
Herndon, VA
Ascot, UK
Cincinnati, OH
Karlsruhe, DE
• The leading IT Services company dedicated to Enterprise Search & Search-based Applications
• Implementation, Consulting, Managed Services
• 120 employees and growing
• Independent, working with all of the leading software vendors and open source alternatives
500+ Customers
What Is Big Data?
Where Did Modern Big Data Come From?
[Diagram: several Web Servers with Content, each with LOG FILES]
What is Big Data?
[Diagram: a large number of LOG FILES]
What is Big Data?
Too big for a single machine
• Physically impossible for a single machine
Data Aggregation & Analysis
• Simply transforming data records is not enough
• Must aggregate / “boil down” the data
Batch Processing
• Very long running jobs (not real-time)
Message: Lots of Data ≠ “Big Data”
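The distinction between transforming records and aggregating them can be sketched in a few lines. This is a minimal single-machine illustration in the map/reduce style, not Hadoop code; the log format (`<client> <url> <status>`), the sample lines, and all function names are assumptions made for the example.

```python
from collections import Counter
from functools import reduce

# Hypothetical, simplified access-log lines: "<client> <url> <status>".
LOG_LINES = [
    "10.0.0.1 /search?q=hadoop 200",
    "10.0.0.2 /docs/intro 200",
    "10.0.0.1 /docs/intro 404",
    "10.0.0.3 /search?q=solr 200",
]

def map_record(line):
    """Map step: transform one log line into a (path, 1) pair."""
    _client, url, _status = line.split()
    path = url.split("?", 1)[0]  # drop the query string
    return path, 1

def reduce_counts(totals, pair):
    """Reduce step: fold one (path, count) pair into the running totals."""
    path, count = pair
    totals[path] += count
    return totals

def hits_per_path(lines):
    """Boil many log records down to one hit count per path."""
    return reduce(reduce_counts, map(map_record, lines), Counter())

hits = hits_per_path(LOG_LINES)
```

The map step alone only transforms records one at a time; the reduce step is what condenses them into a summary. In a distributed setting the same two steps would run as a batch job across many machines.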
Enabling Technologies
Modern Statistical Analysis
Elastic / Cloud Computing
Big Data For Search
Hadoop
What is Big Data?
[Diagram: Unstructured Data (many Content items) and Hadoop]
A Traditional Integrated Architecture
[Diagram components: Content Sources (SharePoint, File System, RDBMS, Employee Directory, etc.); Connectors (Aspire Connector); Search Engine (Index Pipeline, Search Index)]
Does a lot of what we need for Enterprise Search
Limitations
• Limited support for modern analytics
• Limited support for content processing
• Re-indexing takes too long
• Limits ability to do continuous improvement cycle
Why Content Processing is Important
Powerful & Complete Content Processing Service
• Clean and consistent data and metadata
• Ability to supplement metadata
Support for Continuous Improvement Cycle
• Develop and maintain processing IP
• Ability to easily migrate to new search engines
[Diagram components: Content Sources (Employee Directory, File System, RDBMS, etc.); Connectors (Aspire Connector); Content Processing; Search Engine (Index Pipeline, Search Index)]
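As a rough illustration of content processing that lives outside the search engine, the sketch below chains a cleaning stage and a supplementing stage over a document's metadata. It is not the Aspire API; the stage names, the `DEPARTMENT_BY_AUTHOR` lookup table, and the sample document are all hypothetical.

```python
from typing import Callable, Dict, List

Document = Dict[str, str]
Stage = Callable[[Document], Document]

def normalize_metadata(doc: Document) -> Document:
    """Clean stage: lowercase keys, strip whitespace from keys and values."""
    return {key.strip().lower(): value.strip() for key, value in doc.items()}

# Hypothetical lookup table standing in for an employee directory.
DEPARTMENT_BY_AUTHOR = {"jsmith": "Engineering"}

def add_department(doc: Document) -> Document:
    """Supplement stage: add a department field derived from the author."""
    enriched = dict(doc)
    department = DEPARTMENT_BY_AUTHOR.get(doc.get("author", "").lower())
    if department is not None:
        enriched["department"] = department
    return enriched

def run_pipeline(doc: Document, stages: List[Stage]) -> Document:
    """Apply each stage in order; the output is engine-neutral metadata."""
    for stage in stages:
        doc = stage(doc)
    return doc

raw = {" Author ": " JSmith ", "Title": "Quarterly Report  "}
processed = run_pipeline(raw, [normalize_metadata, add_department])
```

Because the stages produce plain dictionaries rather than engine-specific structures, the same processing logic could in principle feed a different search engine, which is the migration benefit the slide points to.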
A New Enterprise Search Architecture
• Integrated Platform (Docs, Log Files and External data)
• Reduced Cost
• Better Agility and Scalability
• Fast Reindexing
• Expanded Functionality
[Diagram components: Content Sources (Employee Directory, File System, RDBMS, etc.); Connectors (Aspire Connector); Content Processing & Tokenization; Secure Cache; Analytics; Docs, Log files, Supplemental Data; Search Engine (Index Pipeline, Search Index)]
Advanced Features & Analytics Enabled
• Search and Match
• Forward and Reverse Citation
• Latent Semantic Analysis
• More Precise Term Weighting Beyond TF/IDF
• Near Duplicate Detection
• Document Topic Tagging
• Results ranking including popularity
• Recommendations based on user behavior
• Suggested queries based on user behavior
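To make one of these features concrete, here is a small sketch of near duplicate detection using word shingles and Jaccard similarity. This is one common textbook approach, not necessarily the method used in the architecture presented; the shingle size, the 0.5 threshold, and the sample sentences are arbitrary assumptions.

```python
def shingles(text, size=3):
    """Return the set of overlapping word n-grams (shingles) in the text."""
    words = text.lower().split()
    if len(words) < size:
        return {tuple(words)}
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}

def jaccard(a, b):
    """Jaccard similarity: size of the intersection over size of the union."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

def is_near_duplicate(text_a, text_b, threshold=0.5):
    """Flag two texts as near duplicates when their shingle sets overlap enough."""
    return jaccard(shingles(text_a), shingles(text_b)) >= threshold

doc_a = "the quick brown fox jumps over the lazy dog"
doc_b = "the quick brown fox jumps over the lazy cat"
doc_c = "enterprise search needs clean metadata and fast reindexing"

similarity_ab = jaccard(shingles(doc_a), shingles(doc_b))
```

Comparing every pair of documents this way scales quadratically, which is one reason this kind of analysis tends to be run as a distributed batch job over large collections rather than inside the indexing pipeline.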
In Summary
Structured Big Data Technology Will Revolutionize Enterprise Search
New architecture for search providing better:
• Analytics and other functionality
• Content processing
• Agility
• Economics and scalability
Big Data architectures will significantly move search forward
For further information: www.searchtechnologies.com