COSTS: Average search time spent
  Time spent per week (hours): 5
  Cost of search time per week ($): * 255
  Value per year ($): * 13 260
  Value for 100 users ($) / year: 1 326 000
* Average salary of scientist/chemist: $90 000/year (according to glassdoor.com), 220 workdays/year.
Search is expensive
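The cost figures can be reproduced with a short calculation. This is only a sketch: the 8-hour workday and the 52 weeks per year are assumptions that are not stated on the slide, chosen because they match the $255, $13 260 and $1 326 000 values.

```python
HOURS_PER_WORKDAY = 8   # assumption, not stated on the slide
WEEKS_PER_YEAR = 52     # assumption, not stated on the slide


def hourly_rate(yearly_salary, workdays_per_year):
    """Hourly cost of one scientist, rounded to whole dollars."""
    return round(yearly_salary / workdays_per_year / HOURS_PER_WORKDAY)


def search_cost(hours_per_week, rate, users=100):
    """Return (cost per week, cost per year, cost per year for all users)."""
    weekly = hours_per_week * rate
    yearly = weekly * WEEKS_PER_YEAR
    return weekly, yearly, yearly * users


rate = hourly_rate(90_000, 220)        # salary and workdays from the footnote
weekly, yearly, per_100_users = search_cost(5, rate)
print(rate, weekly, yearly, per_100_users)
```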
SAVINGS: Save time
  Time spent per week (hours): 1
  Cost of search time per week ($): 51
  Value per year ($): 2 652
  Value for 100 users ($) / year: 265 200
  * CL license cost ($ / 100 users / annual): 22 600
  Saving on 100 users / year ($): 242 600
  ROI (in months): 1
* CL = ChemLocator
But can be cheaper
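The savings and ROI figures follow from the same arithmetic. The sketch below assumes 52 weeks per year and reads "ROI (in months)" as the payback period, that is, the license cost divided by the monthly saving; both are assumptions rather than definitions given on the slide.

```python
WEEKS_PER_YEAR = 52          # assumption, not stated on the slide
HOURLY_RATE = 51             # $ per hour, from the savings table
USERS = 100
LICENSE_COST = 22_600        # $ per 100 users per year, from the savings table


def payback_months(license_cost, yearly_saving):
    """Months of saving needed to cover the yearly license cost."""
    return license_cost / (yearly_saving / 12)


hours_saved_per_week = 1
yearly_saving = hours_saved_per_week * HOURLY_RATE * WEEKS_PER_YEAR * USERS
net_saving = yearly_saving - LICENSE_COST
months = payback_months(LICENSE_COST, yearly_saving)
print(yearly_saving, net_saving, round(months, 2))
```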
A web-based search tool to find chemistry in unstructured data
Unstructured data (or unstructured information) refers to information that either does not have a pre-defined data model or is not organized in a pre-defined manner.
ChemLocator does not store documents
What is ChemLocator
File formats: pdf, html, doc, docx, xls, xlsx, pptx, ppt, one, cdx, mrv, mol, rdf, email, aspx, xml, sdf, etc.
Chemical notations: IUPAC, CDXML, InChI, CAS, SMILES, RXN
[Diagram: Enterprise Content Management (ECM) components: Content Management (CMS), Search (ChemLocator), Collaboration, Digital asset management, Records management, Workflow, Capture]
What is ChemLocator
ChemLocator
Search Show
SECURITY TRIMMING
Through integration
Free Text Search
Chemical Search
Metadata Search
Ontology Search
Tag Search
Corporate ID Search
Compliance Search
Search features
ChemLocator – Windows Service
CLIENT starts indexing
REST API (owin)
http request (JSON)
Local
Repository
Drive
http response (JSON)
Active Directory
Azure
Drive
Indexing component
ChemLocator DB
Identity Server (owin)
AUTHORIZATION
AUTHENTICATION
Crawler
Create indexing queue
Indexing node 1
Pick from indexing queue
Extractors
Store structures
Add to free text index
Free text index files
Indexing node 2
Get doc/content
Under the hood - Indexing
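The indexing flow (crawler creates an indexing queue, indexing nodes pick from it, extractors run, structures are stored and text is added to the free text index) can be illustrated with a toy sketch. Nothing here is ChemLocator code: the function names, the in-memory dictionaries and the CAS-number regular expression used as a stand-in "extractor" are all hypothetical.

```python
import queue
import re
import threading

# Hypothetical extractor: treats anything shaped like a CAS number as a "structure".
CAS_PATTERN = re.compile(r"\b\d{2,7}-\d{2}-\d\b")


def crawl(documents):
    """Crawler step: create the indexing queue from the known document ids."""
    q = queue.Queue()
    for doc_id in documents:
        q.put(doc_id)
    return q


def indexing_node(q, documents, structures, text_index, lock):
    """Indexing node: pick from the queue, extract, store, until the queue is empty."""
    while True:
        try:
            doc_id = q.get_nowait()
        except queue.Empty:
            return
        text = documents[doc_id]                       # get doc/content
        found = CAS_PATTERN.findall(text)              # extractor (stand-in)
        words = set(re.findall(r"[a-z]+", text.lower()))
        with lock:
            for s in found:                            # store structures
                structures.setdefault(s, set()).add(doc_id)
            for w in words:                            # add to free text index
                text_index.setdefault(w, set()).add(doc_id)


def build_index(documents, node_count=2):
    """Run the crawler, then let node_count indexing nodes drain the queue."""
    q = crawl(documents)
    structures, text_index, lock = {}, {}, threading.Lock()
    nodes = [
        threading.Thread(target=indexing_node,
                         args=(q, documents, structures, text_index, lock))
        for _ in range(node_count)
    ]
    for n in nodes:
        n.start()
    for n in nodes:
        n.join()
    return structures, text_index


docs = {"a.txt": "Histamine CAS 51-45-6 report", "b.txt": "histamine assay"}
structures, text_index = build_index(docs)
```

Because only document ids, extracted structures and index entries are kept, the sketch is consistent with the statement that ChemLocator does not store documents, although how the real product achieves this is not described on the slides.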
Owin – Web Server
ChemLocator - Windows Service
Content Repository
Identity Server
REST API
CLIENT starts a search
Active Directory
Authorization
Authentication
Query node
Query node selector
Structure query (JChem Base)
Free text query
Ontology Query
Free text query (Lucene / Elastic)
Tags query (Lucene / Elastic)
Collect documents for structures
Security trimming
Merge hits
= WCF call (TCP binding)
Under the hood - Search
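The last steps of the search flow (merge hits, security trimming) can be sketched as set operations. This is an illustrative assumption, not the product's logic: the slides do not say whether hits are combined by intersection or union, nor in which order merging and trimming run, and the `acl` mapping and `search` function are hypothetical.

```python
def search(structure_hits, text_hits, acl, user):
    """Merge hits from two sub-queries, then drop documents the user may not read.

    structure_hits, text_hits: sets of document ids returned by each sub-query.
    acl: hypothetical mapping of document id -> set of users allowed to read it.
    """
    merged = structure_hits & text_hits          # hybrid query: both must match
    return sorted(doc for doc in merged if user in acl.get(doc, set()))


structure_hits = {"a.pdf", "b.docx", "c.xlsx"}
text_hits = {"b.docx", "c.xlsx", "d.html"}
acl = {"b.docx": {"alice"}, "c.xlsx": {"bob"}}

print(search(structure_hits, text_hits, acl, "alice"))
```

Trimming after the merge keeps the permission check to documents that actually match, which is one reasonable design choice among several.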
Document source: Local Drive
Repository size: 60GB
Document count: 123 000
Structure count: 2 800 000
Performance – Single server env.
• Chemical search alone in 2017? Not enough anymore
• Combination of chemical and free text search? Better
• Then, what's important?
• Speed, accuracy
• Search is a "feature" of a complex software system
Ontology (Virtuoso)
Compound Registration
Compliance Checker
Lucene.NET
Elasticsearch
SharePoint
Synergy
INTEGRATION
Integrations
Histamine in registration systems
Company A: ACMP-1
Company B: BCMP-2
Company C: CCMP-3
Corporate IDs
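One way to picture a corporate ID search is as a lookup that expands a single ID to every ID registered for the same compound. The sketch below is hypothetical: the `REGISTRY` mapping and `expand_corporate_id` function are illustrative names, and only the three IDs and the compound come from the slide.

```python
# Hypothetical registry: corporate ID -> compound it was registered for.
REGISTRY = {
    "ACMP-1": "histamine",   # Company A
    "BCMP-2": "histamine",   # Company B
    "CCMP-3": "histamine",   # Company C
}


def expand_corporate_id(corporate_id, registry=REGISTRY):
    """Return all corporate IDs that refer to the same compound as corporate_id."""
    compound = registry.get(corporate_id)
    if compound is None:
        return []
    return sorted(cid for cid, name in registry.items() if name == compound)


print(expand_corporate_id("BCMP-2"))
```

A search for any one of the IDs could then be run against documents that mention any of the others.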
PERSONAL edition
No IT
Easy configuration
SERVER edition
INSTALLER - NICE
Complete
Seamless deployment
• Web-based, fast hybrid search tool with fine-grained security
• Extensible. Integrates with 3rd-party apps
• Works on your documents
• No IT deployment
Take home