Multidimensional Data Analysis with JRuby Raimonds Simanovskis github.com/rsim @rsim

DESCRIPTION

Lightning talk at RailsConf 2011 about mondrian-olap gem

Page 1: Multidimensional Data Analysis with JRuby

Multidimensional Data Analysis

with JRuby

Raimonds Simanovskis

github.com/rsim
@rsim

Page 2: Multidimensional Data Analysis with JRuby

Relational data model

Page 3: Multidimensional Data Analysis with JRuby

SQL is good for detailed data queries

Get all sales transactions in USA, California

SELECT customers.fullname, products.product_name,
       sales.sales_date, sales.unit_sales, sales.store_sales
FROM sales
  LEFT JOIN products ON sales.product_id = products.id
  LEFT JOIN customers ON sales.customer_id = customers.id
WHERE customers.country = 'USA'
  AND customers.state_province = 'CA'

Page 4: Multidimensional Data Analysis with JRuby

SQL becomes complex for analytical queries

SELECT product_class.product_family,
       SUM(sales.unit_sales) unit_sales_sum,
       SUM(sales.store_sales) store_sales_sum
FROM sales
  LEFT JOIN product ON sales.product_id = product.product_id
  LEFT JOIN product_class ON product.product_class_id = product_class.product_class_id
  LEFT JOIN time_by_day ON sales.time_id = time_by_day.time_id
  LEFT JOIN customer ON sales.customer_id = customer.customer_id
WHERE time_by_day.the_year = 2011
  AND time_by_day.quarter = 'Q1'
  AND customer.country = 'USA'
  AND customer.state_province = 'CA'
GROUP BY product_class.product_family

Get total sales in USA, California in Q1, 2011 by main product groups

Page 5: Multidimensional Data Analysis with JRuby

Maybe write a distributed map-reduce function?

Page 6: Multidimensional Data Analysis with JRuby

Multidimensional Data Model

Multidimensional cubes

Dimensions

Hierarchies and levels

Measures
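
To make these terms concrete, here is a minimal pure-Ruby sketch of a tiny in-memory cube. It is not how Mondrian stores or computes anything; the `Fact` struct, the sample figures, and the helper names `under?` and `sales_by_product_family` are all made up for illustration. Each fact carries one member path per dimension (hierarchy levels from top to bottom) and two measures that are summed.

```ruby
# Toy in-memory "cube". All names and figures are hypothetical.
# Each fact holds a member path per dimension plus its numeric measures.
Fact = Struct.new(:time, :customers, :product, :unit_sales, :store_sales)

FACTS = [
  Fact.new(%w[2011 Q1], %w[USA CA], %w[Food Bread],  10, 25.0),
  Fact.new(%w[2011 Q1], %w[USA CA], %w[Drink Juice],  4, 12.0),
  Fact.new(%w[2011 Q1], %w[USA WA], %w[Food Bread],   7, 17.5),
  Fact.new(%w[2011 Q2], %w[USA CA], %w[Food Cheese],  3,  9.0),
  Fact.new(%w[2011 Q1], %w[USA CA], %w[Food Cheese],  2,  6.0)
].freeze

# True when `path` lies at or below the hierarchy member `prefix`,
# e.g. ["2011", "Q1"] is under ["2011"].
def under?(path, prefix)
  path.first(prefix.size) == prefix
end

# Keep the facts matching the Time and Customers members, then sum the
# measures grouped by the top level of the Product hierarchy.
def sales_by_product_family(facts, time:, customers:)
  selected = facts.select do |f|
    under?(f.time, time) && under?(f.customers, customers)
  end
  selected.group_by { |f| f.product.first }.transform_values do |group|
    { unit_sales: group.sum(&:unit_sales),
      store_sales: group.sum(&:store_sales) }
  end
end

result = sales_by_product_family(FACTS, time: %w[2011 Q1], customers: %w[USA CA])
result.each { |family, measures| puts "#{family}: #{measures}" }
```

The point of the sketch is that the question "Q1 2011, USA / CA, by product family" is expressed as positions in hierarchies rather than as joins between tables.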

Page 7: Multidimensional Data Analysis with JRuby

OLAP technologies

On-Line Analytical Processing

Page 9: Multidimensional Data Analysis with JRuby

MDX query language

SELECT {[Measures].[Unit Sales], [Measures].[Store Sales]} ON COLUMNS,
       [Product].children ON ROWS
FROM [Sales]
WHERE ( [Time].[2011].[Q1], [Customers].[USA].[CA] )

Get total units sold and sales amount in USA, California in Q1, 2011 by main product groups

Page 10: Multidimensional Data Analysis with JRuby

Or in Ruby like this

olap.from('Sales').
  columns('[Measures].[Unit Sales]', '[Measures].[Store Sales]').
  rows('[Product].children').
  where('[Time].[2011].[Q1]', '[Customers].[USA].[CA]').
  execute

Get total units sold and sales amount in USA, California in Q1, 2011 by main product groups
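
The chained Ruby calls mirror the clauses of the MDX statement on the previous slide. The sketch below shows one way such a builder could collect those calls and render MDX text. `ToyQuery` and its `to_mdx` method are hypothetical names invented for this example, and this is not the mondrian-olap implementation; it runs on plain Ruby and never touches a database.

```ruby
# Hypothetical toy builder, NOT the mondrian-olap implementation.
# It only shows how chained calls could accumulate the parts of an
# MDX statement and render them as text.
class ToyQuery
  def initialize
    @cube = nil
    @axes = {}      # axis name => list of member/set expressions
    @slicer = []    # members for the WHERE tuple
  end

  def from(cube)
    @cube = cube
    self            # returning self is what makes the calls chainable
  end

  def columns(*members)
    @axes['COLUMNS'] = members
    self
  end

  def rows(*members)
    @axes['ROWS'] = members
    self
  end

  def where(*members)
    @slicer = members
    self
  end

  def to_mdx
    axes = @axes.map { |name, members| "#{set(members)} ON #{name}" }.join(', ')
    mdx = "SELECT #{axes} FROM [#{@cube}]"
    mdx += " WHERE (#{@slicer.join(', ')})" unless @slicer.empty?
    mdx
  end

  private

  # A single expression such as [Product].children is used as-is;
  # several members are wrapped in braces to form an MDX set.
  def set(members)
    members.size == 1 ? members.first : "{#{members.join(', ')}}"
  end
end

mdx = ToyQuery.new.from('Sales').
  columns('[Measures].[Unit Sales]', '[Measures].[Store Sales]').
  rows('[Product].children').
  where('[Time].[2011].[Q1]', '[Customers].[USA].[CA]').
  to_mdx
puts mdx
```

Each builder method returns `self`, so the query reads top to bottom in the same order as the MDX clauses while staying ordinary Ruby.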

Page 11: Multidimensional Data Analysis with JRuby

Also more complex queries

olap.from('Sales').
  with_member('[Measures].[ProfitPct]').
    as('(Measures.[Store Sales] - Measures.[Store Cost]) / Measures.[Store Sales]',
       :format_string => 'Percent').
  columns('[Measures].[Store Sales]', '[Measures].[ProfitPct]').
  rows('[Product].children').
    crossjoin('[Customers].[Canada]', '[Customers].[USA]').
    top_count(50, '[Measures].[Store Sales]').
  where('[Time].[2011].[Q1]').
  execute

Get sales amount and profit % of top 50 products sold in USA and Canada during Q1, 2011

Page 12: Multidimensional Data Analysis with JRuby

OLAP schema (mapping cube to tables)

schema = Mondrian::OLAP::Schema.define do
  cube 'Sales' do
    table 'sales'
    dimension 'Gender', :foreign_key => 'customer_id' do
      hierarchy :has_all => true, :primary_key => 'customer_id' do
        table 'customer'
        level 'Gender', :column => 'gender', :unique_members => true
      end
    end
    dimension 'Time', :foreign_key => 'time_id' do
      hierarchy :has_all => false, :primary_key => 'time_id' do
        table 'time_by_day'
        level 'Year', :column => 'the_year', :type => 'Numeric', :unique_members => true
        level 'Quarter', :column => 'quarter', :unique_members => false
        level 'Month', :column => 'month_of_year', :type => 'Numeric', :unique_members => false
      end
    end
    measure 'Unit Sales', :column => 'unit_sales', :aggregator => 'sum'
    measure 'Store Sales', :column => 'store_sales', :aggregator => 'sum'
  end
end

Page 13: Multidimensional Data Analysis with JRuby

mondrian-olap gem

eazybi.com