Open up your user base to the data! Unlike programming or SQL, search is something almost everybody knows how to do. Through an interactive demo based on open source Hue, this talk shows how users can graphically search their data in Hadoop. It explains the technical details of the application and its interaction with Apache Solr. The session shows how to get started with data indexing in just a few clicks and explores several data analysis scenarios. Using only a web browser, attendees will see how to explore and visualize data for quick answers. The new search dashboard in Hue, with its draggable charts and dynamic interface, lets any non-technical user look for documents or patterns. Attendees will learn how to get started with interactive search visualization in their Hadoop cluster.
Interactively search and visualize your data
Romain Rigaux
Goals
Build a Web app
Quickly explore data
… with Solr
Hue: make Solr / Hadoop easier to use
+
Architecture
REST
“Just a view” on top of the standard Solr API
History: v1 User
History: v1 Admin
Architecture: Next!
Lots of learning, UX boost needed
Simple: users don't need to know it is Solr
History: v2 User
History: v2 Admin
Architecture
/select /admin/collections /get /luke...
/add_widget /zoom_in /select_facet /select_range...
REST AJAX Templates
+ JS Model
www….
Architecture: UI for Facets
Layout
Collection
Query
All the 2D positioning (cell ids), visual, drag&drop
Dashboard, fields, template, widgets (ids)
Search terms, selected facets (q, fqs)
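The three parts of the dashboard state listed above (Layout, Collection, Query) can be pictured as a small data model. The sketch below is only an illustration: the class names follow the slide, but every attribute name and shape is an assumption and is not taken from the Hue code base.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Layout:
    # 2D positioning: maps a cell id to the widget id placed in that cell
    # (hypothetical shape, chosen for illustration).
    cells: Dict[str, str] = field(default_factory=dict)


@dataclass
class Collection:
    # Dashboard definition: collection name, fields, result template, widget ids.
    name: str
    fields: List[str] = field(default_factory=list)
    template: str = ""
    widget_ids: List[str] = field(default_factory=list)


@dataclass
class Query:
    # Search terms (q) and selected facets (fqs).
    q: str = ""
    fqs: List[str] = field(default_factory=list)


@dataclass
class Dashboard:
    # The full state the browser would send back on each interaction.
    layout: Layout
    collection: Collection
    query: Query


dashboard = Dashboard(
    layout=Layout(cells={"row1-col1": "widget-1"}),
    collection=Collection(name="logs", fields=["bytes"], widget_ids=["widget-1"]),
    query=Query(q="Chrome", fqs=["bytes:[900000 TO 1800000]"]),
)
```

Keeping the visual layout separate from the query means a drag and drop only touches `Layout`, while selecting a facet only touches `Query`.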
Adding a widget life cycle
Load the initial page
Edit mode and Drag&Drop
/solr/zookeeper/clusterstate.json /solr/admin/luke…
/get_collection
Adding a widget life cycle
/solr/select?stats=true /new_facet
Select the field
Guess ranges (number or dates)
Rounding (number or dates)
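The slides do not say how the range is guessed or rounded, so the following is a minimal sketch under an assumed rule: take the minimum and maximum of the field (as a `stats=true` request would return them), split the span into about ten slots, and round the gap up to a "nice" 1, 2 or 5 times a power of ten. The function name `guess_range` and the `slots` parameter are hypothetical, and only numeric fields are covered.

```python
import math


def guess_range(min_value, max_value, slots=10):
    """Guess rounded start/end/gap values for a numeric range facet."""
    span = max_value - min_value
    if span <= 0:
        # Degenerate case: a single value, use one unit-wide bucket.
        return {"start": min_value, "end": min_value + 1, "gap": 1}

    raw_gap = span / slots
    # Round the gap up to the next 1, 2, 5 or 10 times a power of ten.
    magnitude = 10 ** math.floor(math.log10(raw_gap))
    gap = 10 * magnitude
    for multiplier in (1, 2, 5, 10):
        if multiplier * magnitude >= raw_gap:
            gap = multiplier * magnitude
            break

    # Align both ends of the range on multiples of the gap.
    start = math.floor(min_value / gap) * gap
    end = math.ceil(max_value / gap) * gap
    return {"start": start, "end": end, "gap": gap}


print(guess_range(12, 8734112))
```

With a minimum of 12 and a maximum of 8734112 this yields a range from 0 to 9000000, the same bounds as the `bytes` example in the slides, although the gap it picks (1000000) differs from the 900000 shown there.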
Adding a widget life cycle
Query part 1
Query part 2
Augment Solr response
facet.range={!ex=bytes}bytes&f.bytes.facet.range.start=0&f.bytes.facet.range.end=9000000&f.bytes.facet.range.gap=900000&f.bytes.facet.mincount=0&f.bytes.facet.limit=10
q=Chrome&fq={!tag=bytes}bytes:[900000+TO+1800000]
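The two query parts above rely on Solr's tag and exclude local parameters: the filter is tagged with `{!tag=bytes}`, and the facet excludes that tag with `{!ex=bytes}`, so the other buckets keep their counts while one bucket is selected. A minimal sketch of assembling these parameters follows. The helper name `build_range_facet_params` is hypothetical, and `facet=true` is added because Solr needs it to enable faceting even though the slide does not show it.

```python
from urllib.parse import urlencode


def build_range_facet_params(q, field, start, end, gap,
                             selected=None, mincount=0, limit=10):
    """Return Solr parameters for a range facet with an optional selection."""
    params = [
        ("q", q),
        ("facet", "true"),  # not shown on the slide, required by Solr
        # Exclude the filter tagged with the field name when counting buckets.
        ("facet.range", "{!ex=%s}%s" % (field, field)),
        ("f.%s.facet.range.start" % field, start),
        ("f.%s.facet.range.end" % field, end),
        ("f.%s.facet.range.gap" % field, gap),
        ("f.%s.facet.mincount" % field, mincount),
        ("f.%s.facet.limit" % field, limit),
    ]
    if selected is not None:
        low, high = selected
        # Tag the filter so the facet above can exclude it.
        params.append(("fq", "{!tag=%s}%s:[%s TO %s]" % (field, field, low, high)))
    return params


params = build_range_facet_params(
    "Chrome", "bytes", 0, 9000000, 900000, selected=(900000, 1800000))
query_string = urlencode(params)  # percent-encodes braces, brackets and spaces
```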
{
  'facet_counts': {
    'facet_ranges': {
      'bytes': {
        'start': 10000,
        'counts': [
          '900000',
          3423,
          '1800000',
          339,
          ...
        ]
      }
    }
  }
}
{
  ...,
  'normalized_facets': [
    {
      'extraSeries': [],
      'label': 'bytes',
      'field': 'bytes',
      'counts': [
        {
          'from': '900000',
          'to': '1800000',
          'selected': True,
          'value': 3423,
          'field': 'bytes',
          'exclude': False
        }
      ],
      ...
    }
  ]
}
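The augmentation step turns Solr's flat `counts` list (bucket start, count, bucket start, count, ...) into one object per bucket that the widgets can bind to. The sketch below shows one way this could be done. The function name `normalize_range_facet` and its parameters are hypothetical, and the output keys simply mirror the ones shown in the slide.

```python
def normalize_range_facet(field, solr_response, gap, selected_from=None):
    """Convert a Solr range facet into a list of per-bucket dictionaries."""
    counts = solr_response["facet_counts"]["facet_ranges"][field]["counts"]
    buckets = []
    # Even positions hold the bucket start, odd positions hold its count.
    for start, count in zip(counts[0::2], counts[1::2]):
        buckets.append({
            "from": start,
            "to": str(int(start) + gap),
            "selected": start == selected_from,
            "value": count,
            "field": field,
            "exclude": False,
        })
    return {"extraSeries": [], "label": field, "field": field, "counts": buckets}


response = {
    "facet_counts": {
        "facet_ranges": {
            "bytes": {"counts": ["900000", 3423, "1800000", 339]}
        }
    }
}
facet = normalize_range_facet("bytes", response, gap=900000, selected_from="900000")
```

Applied to the two buckets from the slide, the first entry of `facet["counts"]` carries `from` 900000, `to` 1800000, `value` 3423 and `selected` set to `True`.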
JSON to Widget

{
  "field": "rate_code",
  "counts": [
    {
      "count": 97797,
      "exclude": true,
      "selected": false,
      "value": "1",
      "cat": "rate_code"
    }
    ...

{
  "field": "medallion",
  "counts": [
    {
      "count": 159,
      "exclude": true,
      "selected": false,
      "value": "6CA28FC49A4C49A9A96",
      "cat": "medallion"
    }
    ...

{
  "extraSeries": [],
  "label": "trip_time_in_secs",
  "field": "trip_time_in_secs",
  "counts": [
    {
      "from": "0",
      "to": "10",
      "selected": false,
      "value": 527,
      "field": "trip_time_in_secs",
      "exclude": true
    }
    ...

{
  "field": "passenger_count",
  "counts": [
    {
      "count": 74766,
      "exclude": true,
      "selected": false,
      "value": "1",
      "cat": "passenger_count"
    }
    ...
Repeat…
Enterprise features
- Access to Search App configurable, LDAP/SAML auths
- Share by link
- Solr Cloud (or non Cloud)
- Proxy user
  /solr/jobs_demo/select?user.name=hue&doAs=romain&q=
- Security
  - Kerberos
  - Sentry
    Collection level, Solr calls like /admin, /query, Solr UI, ZooKeeper
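The proxy user call in the list above sends two identities: `user.name` is the service account making the request (here `hue`) and `doAs` is the end user being impersonated (here `romain`). A minimal sketch of building such a URL follows. The helper name `proxied_select_url` is hypothetical, and the base URL with host and port is an assumption that does not come from the slides.

```python
from urllib.parse import urlencode


def proxied_select_url(base_url, collection, service_user, end_user, q=""):
    """Build a /select URL that impersonates end_user through service_user."""
    params = urlencode({
        "user.name": service_user,  # account the application authenticates as
        "doAs": end_user,           # logged-in user the query runs on behalf of
        "q": q,
    })
    return "%s/%s/select?%s" % (base_url.rstrip("/"), collection, params)


# Assumed local Solr address, used for illustration only.
url = proxied_select_url("http://localhost:8983/solr", "jobs_demo", "hue", "romain")
```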
Demo Index and Visualize Taxi data
http://chriswhong.com/open-data/foil_nyc_taxi/
https://archive.org/details/nycTaxiTripData2013 [torrent better]
Missed it?
http://demo.gethue.com/search
What’s next?
- Map Pivot Facets
- Autocomplete
- Analytics range facets
- Easier Indexing
- … ?
Thank you!
http://gethue.com/blog/search
https://github.com/cloudera/hue