
Data Visions Big Data Visual Analytics Tool


DESCRIPTION

dataVISIONS is built with novel machine learning algorithms combined with deep data mining driven by fraud concepts, in response to a simple but profound question: "What should the pricing strategy be to stop eCommerce fraud, improve cyber security, reduce money laundering, and support call center behavior analysis?" It also asks which segmentation techniques can be applied toward those goals.


Page 1: Data Visions Big Data Visual Analytics Tool

dataVISIONS: Big Data Visual Analytics Tool: VERSION 2:

• Predictive analytics provides an entry into retail analytics, for example by finding the propensity to buy.

• Very few variables get selected out of 500 or so, hence the NEED for a holistic approach. Rank order does not mean the largest possible dollar amount made from historical buying patterns...

• Propensity to buy needs to be viewed through pricing strategy. Evaluate its effects through as many variables as possible because of its correlation with the hottest topic in retail, return rate analysis. dataVISIONS is cost-effective compared to a similar new SAS tool.

• Furthermore, Big Data retail analytics needs visual analytics that is prospective, not retrospective. It must also unearth good questions, hypotheses and interpretations that one must be able to see!

Page 2: Data Visions Big Data Visual Analytics Tool

Big Data Analytics Vision: Visualize or not IS the Question!

• Customers are not comfortable with just numbers, or with interpretations that differ from person to person.

• A visualization tool is needed among the company’s assets.

• It helps with the practical application of statistics and stops the peeling of data layers forever!

• Current visualization tools on the market lack the combination of statistical methods needed to meet the demands of data mining (segmentation appreciation). Several examples are provided below.

Page 3: Data Visions Big Data Visual Analytics Tool

Pricing Strategy: Gender Effects on eCommerce: New Pricing shows clear segmentation: Spatial statistics in combination with Regression shows very strong price sensitivity (very bottom right pane).

Page 4: Data Visions Big Data Visual Analytics Tool

Pricing Strategy: Gender Effects on eCommerce: Same method. The market needs crisp segmentation visualization: low, medium (center), high and very strongly price-sensitive customers (very bottom right pane).

Page 5: Data Visions Big Data Visual Analytics Tool

APPLICATION OF SPATIAL STATISTICS SIMULATION SHOWS MANY SELLER STORES ON ECOMMERCE WEBSITES THAT NEED TO BE TURNED OFF WITH THE NEW PRICE.

THE MAJORITY OF THEM ARE AT A VERY LOW PRICE; LIKELY STOLEN CONSUMER ELECTRONICS PRODUCTS SOLD TO MAKE A QUICK BUCK IN ECOMMERCE. WHAT WILL THE CONSUMER PROPENSITY TO BUY FROM THIS ECOMMERCE WEBSITE BE IN FUTURE?

THE COMPANY NEEDS TO INSTITUTE A NEW PRICING POLICY BECAUSE THE WORKING HYPOTHESIS IS: “IN BOTH GENDERS, NEW PRICING REMOVES SOME SELLER STORES THAT COST MONEY TO SERVE AND ADDS TO FRAUD REDUCTION/SECURITY”

What we learned so far…

Page 6: Data Visions Big Data Visual Analytics Tool

eCommerce Actionable Findings Found

• The old price was an easy entry point for any seller. The buyer cloud from the new-price simulation allows segmentation and identification of price-sensitive and price-insensitive customers at various levels.

• The eCommerce company needs a strategy that serves low-price sellers and buyers well. New pricing also works for segmenting high-value seekers, who want to pay more.

• More information from social media behavior is provided on why this strategy will serve well…

Page 7: Data Visions Big Data Visual Analytics Tool

Pricing Strategy VS Social Media (photo sharing= yes) behavior interaction simulation: old price, Gender=Male

Page 8: Data Visions Big Data Visual Analytics Tool

Pricing Strategy VS Social Media (photo sharing= yes) behavior interaction simulation: new price, Gender=Male

Page 9: Data Visions Big Data Visual Analytics Tool

Pricing Strategy VS Social Media (photo sharing= no) behavior interaction simulation: old price, Gender=Male

Page 10: Data Visions Big Data Visual Analytics Tool

Pricing Strategy VS Social Media (photo sharing= no) behavior interaction simulation: New price, Gender=Male

Page 11: Data Visions Big Data Visual Analytics Tool

Actionable Findings Learned

• Photo sharing is an important social media behavior; dataVISIONS methods remove clutter and find price-sensitive customers even among those who do not engage in photo sharing.

• Some seller stores drop out in the new pricing simulation, indicating some fraud reduction. This in turn improves buyer return to the eCommerce website.

• dataVISIONS removes clutter and shows that this data comes from very price-sensitive customers (Y-axis).

• A happy eCommerce customer will have low Recency (time taken to next purchase) and will not ask for a product return at the company’s cost.

Page 12: Data Visions Big Data Visual Analytics Tool

Actionable Findings Learned

• Product recommendation on social media is one way a retailer can count on return purchases. Higher recommendation could mean a higher Net Promoter Score (NPS).

• The next two slides show the value of Net Promoter Score in eCommerce. NPS is the difference between Promoters and Detractors, so the same number can be arrived at in multiple ways (for example, Detractors > Promoters gives NPS < 0). Hence predictive analytics will NEVER use it!

• dataVISIONS removes clutter and shows that this data comes from very price-sensitive customers.
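The point that one NPS value can come from very different buyer populations can be illustrated with a minimal sketch. It assumes the widely used 0-10 survey convention (9-10 counts as Promoter, 0-6 as Detractor), which the slides do not state; the two rating lists are made-up data, not dataVISIONS output.

```python
def net_promoter_score(ratings):
    """Return NPS in percentage points for a list of 0-10 ratings.

    Assumption: 9-10 = Promoter, 0-6 = Detractor (common survey convention).
    """
    if not ratings:
        raise ValueError("ratings must not be empty")
    promoters = sum(1 for r in ratings if r >= 9)
    detractors = sum(1 for r in ratings if r <= 6)
    return 100.0 * (promoters - detractors) / len(ratings)


# Two hypothetical buyer populations with a very different make-up.
polarized = [10] * 40 + [8] * 10 + [2] * 50   # 40% promoters, 50% detractors
lukewarm = [10] * 5 + [7] * 80 + [3] * 15     # 5% promoters, 15% detractors

print(net_promoter_score(polarized))  # -10.0
print(net_promoter_score(lukewarm))   # -10.0
```

Both lists give the same score although one population is sharply split and the other is mostly indifferent, which is why the single number hides the underlying segmentation.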

Page 13: Data Visions Big Data Visual Analytics Tool

Male: Net Promoter Score <=0 (more buyers gave the product a bad review), old price simulation

Page 14: Data Visions Big Data Visual Analytics Tool

Male: Net Promoter Score <=0, new price simulation: at least some bad reviews by buyers are gone among very price-sensitive customers, who could become return purchasers.

Page 15: Data Visions Big Data Visual Analytics Tool

CYBER SECURITY AND RETAIL VARIABLES INTERACTION

• Cyber security is an important concern that permeates every aspect of the US corporate system: http://www.businessweek.com/articles/2012-08-02/the-cost-of-cyber-crime

• Billions of dollars are being poured in unsuccessfully!

• As consumers switch to mobile apps, there will be phenomenal growth in fraudulent bills paid through app hacking, because it is very easy to push a button on a mobile device inadvertently. Busy lives worldwide make it that much easier…

• Retailers need to pay a lot more attention to preventing it prospectively and grab market share NOW!

Page 16: Data Visions Big Data Visual Analytics Tool

Cyber security: The algorithm encourages spikes to form, but 7 observations from the right refuse to do so; a just-in-time price or product will sell very quickly due to “right sizing”. This is the seller’s original price; the new pricing strategy is unable to stop it. Low R-square fraud patterns: male gender, Net Promoter Score <=0, 4 weeks of a month (same data source as above: male gender).

Page 17: Data Visions Big Data Visual Analytics Tool

How does pricing strategy interact with cyber security? One seller from the slide above occurred twice in the male gender data. There are only three spikes out of 20 observations below. The R-square pattern shows fraud hits in the data. This seller’s pattern is shown below for 8 months: male gender, Net Promoter Score <=0.

Page 18: Data Visions Big Data Visual Analytics Tool

Cyber security VS Online Pricing Strategy: The method is designed to smooth out large pricing spikes. The complicated Aberrant Online Selling Pattern (AOSP) on the same data as above shows exactly the same patterns. The hypothesis in slide 5 is rejected!

Page 19: Data Visions Big Data Visual Analytics Tool

The female gender is not the victim here. New pricing helps with the female gender only. Hypothesis: “the female gender has better awareness of consumer electronics products such as phones in the eCommerce space”. Net Promoter Score <=0 and only one-time sellers were found here. Low NPS is not from fraud; perhaps over-advertisement is the culprit here… The product return rate should be lower than for the male gender.

Page 20: Data Visions Big Data Visual Analytics Tool

How does pricing strategy interact with Net Promoter Score (3 & 4) and Recency (500, or 50% of the time likely to visit this eCommerce website) of the buyer (male) for cyber security? The new pricing strategy kicks in at observation #7; all sellers participated only once. No difference between old and new price (A/B testing)… This explains why buyer Recency is only 50%.

Page 21: Data Visions Big Data Visual Analytics Tool

How does seasonality interact with online fraud? Mock-up data of a holiday sale: only the visualization from the combined methods shows the behavior of fraud (right): one product at a really high price, one at a low price and two at the expected price. This way no one suspects anything is wrong; it explains the change in statistics from left (no fraud) to right at that time (male gender). Program this and find thousands like it in the database, saving millions of dollars prospectively!

Left and right panes: Mean normalized + moving average
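A minimal sketch of what "mean normalized + moving average" could look like on the holiday-sale mock-up described above. The slides do not define the normalization, so dividing by the series mean is an assumption here, as are the window size and the price values; this is not the dataVISIONS implementation.

```python
def mean_normalize(values):
    """Divide each value by the series mean so the series averages 1.0.

    Assumption: "mean normalized" means scaling by the mean.
    """
    mean = sum(values) / len(values)
    if mean == 0:
        raise ValueError("cannot normalize a series whose mean is zero")
    return [v / mean for v in values]


def moving_average(values, window):
    """Simple moving average; returns len(values) - window + 1 points."""
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    return [sum(values[i:i + window]) / window
            for i in range(len(values) - window + 1)]


# Hypothetical prices for four listings of the same product.
prices_no_fraud = [100.0, 102.0, 98.0, 100.0]
# One really high price, one low price, two at the expected price.
prices_suspect = [180.0, 20.0, 100.0, 100.0]

for label, prices in (("no fraud", prices_no_fraud), ("suspect", prices_suspect)):
    smoothed = moving_average(mean_normalize(prices), window=2)
    print(label, [round(v, 2) for v in smoothed])
```

Both series have the same average price, so a summary by mean alone would not separate them; the smoothed, normalized curve is where the dip in the suspect series becomes visible.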

Page 22: Data Visions Big Data Visual Analytics Tool

One known online fraudster’s sale was added: the highest variation explained differs by 20% in the right pane (R-sq: fraud). Finding more patterns like this in the database will make millions of dollars for the company. But this data mining pattern came from a question that became obvious only after visualizing the spike pattern in the center of the right pane.

Large price spike normalized: pattern change: No fraud

Large price spike normalized: how pattern changes!

Page 23: Data Visions Big Data Visual Analytics Tool

RETAIL BANKING RISK ANALYTICS: ANTI-MONEY LAUNDERING (AML)

• Retail banking is a superpower of the US economy; needless to say, billions were spent to bail out this sector and stabilize the US economy (2007-10).

• Banks provide loans to retail customers and make money based on loan origination rate, interest rate, etc.

• The risk/retail banking paradigm is shifting; pricing needs to be viewed through the prism of online and social media behavior.

• The need is to find profitable customers who have working life left and will go through more life-changing events, hence creating retail demand. These customers must NEVER churn from your business!

• The customer segmentation here is: 1. female gender, 2. online ads imp = 0, 3. TV ads imp = 0, 4. online photo sharing = 0, 5. leader in providing mortgages and home equity lines of credit to consumers = 0. The segmentation below shows the following from data mining:

Page 24: Data Visions Big Data Visual Analytics Tool

Pattern 1: Business question: have the AML customers churned, or are they still with the bank? No de-differencing, and R-square is the same for the linear and quadratic equations (Loan Amount is the Y-axis).
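The comparison of linear and quadratic R-square in Pattern 1 can be sketched in plain Python as below. This is an illustration under stated assumptions, not the dataVISIONS method: ordinary least squares is assumed, the helper names (`solve`, `polyfit`, `r_squared`) are hypothetical, and the monthly loan amounts are invented.

```python
def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def polyfit(xs, ys, degree):
    """Least-squares polynomial coefficients, lowest power first."""
    k = degree + 1
    a = [[sum(x ** (i + j) for x in xs) for j in range(k)] for i in range(k)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(k)]
    return solve(a, b)


def r_squared(xs, ys, coeffs):
    """Share of the variance in ys explained by the fitted polynomial."""
    mean_y = sum(ys) / len(ys)
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    if ss_tot == 0:
        raise ValueError("ys are constant; R-square is undefined")
    ss_res = sum((y - sum(c * x ** p for p, c in enumerate(coeffs))) ** 2
                 for x, y in zip(xs, ys))
    return 1.0 - ss_res / ss_tot


months = list(range(1, 9))
loan_amount = [10.0, 12.1, 13.9, 16.2, 18.0, 19.8, 22.1, 24.0]  # hypothetical

r2_linear = r_squared(months, loan_amount, polyfit(months, loan_amount, 1))
r2_quadratic = r_squared(months, loan_amount, polyfit(months, loan_amount, 2))
print(round(r2_linear, 4), round(r2_quadratic, 4))
```

When the two values are nearly identical, the quadratic term adds nothing, meaning the series shows no curvature or change of direction over the period.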

Page 25: Data Visions Big Data Visual Analytics Tool

Pattern 2: With differencing, the same pattern as above (same Y-axis)…

Page 26: Data Visions Big Data Visual Analytics Tool

Pattern 3: The genetic algorithm also has no effect on the coefficients!

Page 27: Data Visions Big Data Visual Analytics Tool

Pattern 4: The first three patterns above do not change. One expects that these customers have churned. It is nice to confirm the interpretation VISUALLY! When the anti-mutation rate brings shrinkage in the coefficients, it confirms continuation of the same pattern as above and keeps the over-segmentation rate low. Customer churn is not just inevitable; it has already happened! Catching them for AML will be difficult.

Page 28: Data Visions Big Data Visual Analytics Tool

A) THE VALUE PROPOSITION (VPE) OF A CALL CENTER IS “LOAD BALANCING”: ROUTING THE MAJORITY OF CALLS TO THE MOST PRODUCTIVE CALL CENTER SALES AGENTS.

B) THAT MEANS THE AGENTS WITH THE HIGHEST SALES CONVERSION RATE (SRC).

C) A SALES AGENT CAN EASILY TAKE A SALES CALL AND ENTER IT IN THE SYSTEM AS A NON-SALE CALL IF THE SALE DID NOT MATERIALIZE, TO KEEP SRC HIGH. THIS IS FRAUDULENT ACTIVITY THAT DEFEATS THE LOAD BALANCING CONCEPT.

D) THEN THERE ARE CALL CENTER AGENTS WHO TAKE A VERY LONG SALES CONVERSION TIME. THIS IS WASTE BECAUSE CUSTOMERS USUALLY DO NOT WAIT 30 MINUTES ON THE PHONE TO BUY A PRODUCT IN THE NEXT 30 MINUTES.

E) A SALES AGENT WITH SIMILAR WAIT AND CALL TIMES IS VERY SUSPICIOUS, BECAUSE A SALES CALL TIME GREATER THAN THE WAIT TIME RESULTS IN A REFERRAL FOR TRAINING. SUCH AN AGENT IS DOING THEIR BEST TO AVOID NEGATIVE EFFECTS ON PERFORMANCE. PLUS, MORE TIME MEANS ADDITIONAL PRAISE FOR THE PRODUCT THAT IT MAY NOT LIVE UP TO, TRIGGERING RETURNS AND LOSS OF WARRANTY $.
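Conditions B, D and E above can be expressed as simple screening rules. The sketch below is only an illustration: the `AgentStats` fields, the 5% similarity tolerance and the agent records are hypothetical, and the 30-minute threshold is borrowed from condition D. Condition C cannot be detected from these fields alone, since the misclassified calls never appear as sales calls.

```python
from dataclasses import dataclass


@dataclass
class AgentStats:
    agent_id: str
    sales_calls: int      # calls logged as sales calls
    sales_made: int       # sales calls that ended in a sale
    avg_wait_min: float   # average customer wait time, in minutes
    avg_call_min: float   # average sales call time, in minutes


def conversion_rate(agent):
    """Condition B: sales conversion rate (SRC) as sales per sales call."""
    return agent.sales_made / agent.sales_calls if agent.sales_calls else 0.0


def flag_agent(agent, long_call_min=30.0, similarity_tol=0.05):
    """Return the list of conditions (D, E) that an agent's averages trigger."""
    flags = []
    if agent.avg_call_min >= long_call_min:
        flags.append("D: very long sales conversion time (waste)")
    # Condition E: call time sits suspiciously close to the wait time.
    gap = abs(agent.avg_wait_min - agent.avg_call_min)
    if agent.avg_wait_min > 0 and gap <= similarity_tol * agent.avg_wait_min:
        flags.append("E: call time nearly equal to wait time (suspicious)")
    return flags


agents = [
    AgentStats("A1", sales_calls=50, sales_made=20, avg_wait_min=10.0, avg_call_min=9.8),
    AgentStats("A2", sales_calls=40, sales_made=10, avg_wait_min=5.0, avg_call_min=35.0),
    AgentStats("A3", sales_calls=60, sales_made=30, avg_wait_min=8.0, avg_call_min=4.0),
]

for a in sorted(agents, key=conversion_rate, reverse=True):
    print(a.agent_id, round(conversion_rate(a), 2), flag_agent(a))
```

Sorting by conversion rate mirrors the load-balancing idea in A and B, while the flags mark agents whose high ranking deserves a second look before more calls are routed to them.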

CALL CENTER ANALYTICS: FRAUD AND WASTE

Page 29: Data Visions Big Data Visual Analytics Tool

CALL CENTER ANALYTICS: looking for condition E, because it is both fraud and waste: the company may lose warranty $ and could end up paying for return shipping. No_seasonal R-sq coefficients are lower than seasonal ones (good) because sales occur at Christmas. The bad news is that the seasonal linear and quadratic coefficients are similar!

Page 30: Data Visions Big Data Visual Analytics Tool

Similar linear and quadratic R-sq means the call center agent is avoiding a training referral, and the company could end up incurring additional cost for these sales later. The mathematical equation developed catches the agent in action; a very low coefficient pattern means all sales made by this agent in the call center should be reviewed after the agent is sent for training.

Page 31: Data Visions Big Data Visual Analytics Tool

FRAUD, WASTE AND ABUSE HAVE CAUGHT UP WITH RETAIL. THEY ARE IN ECOMMERCE, BRICK-AND-MORTAR STORES AND EVERYWHERE.

PROPENSITY TO BUY IS THE CORNERSTONE OF RETAIL PREDICTIVE ANALYTICS. EVEN EXCRUCIATING ANALYSIS OF THE TOP 2% DECILE RESULTS IN VARIANCE BETWEEN PREDICTED AND OBSERVED PURCHASE $. MUST REVIEW PRICING STRATEGY!

RETAILERS HATE TO SEE PRODUCTS RETURNED DUE TO THE POOR SHAPE OF A STOLEN PRODUCT OR EXAGGERATED ADVERTISEMENT. A SMALL COMPANY IN THE BAY AREA IS WILLING TO FORK OUT MILLIONS OF $ FOR A PRICEY SAS TOOL.

THE SOCIAL MEDIA REVOLUTION IS SUCH THAT ONE NEGATIVE COMMENT EQUALS WASHING AWAY THOUSANDS OF $ IN ADVERTISEMENT SPEND AND GOODWILL.

AML HAS ITS ORIGIN IN INSURANCE AND REQUIRES COMPETENCY IN BANKING LOAN ORIGINATION DATA, HEALTHCARE AND CAR INSURANCE DATA. THAT’S WHY EVEN A TOP 5 CONSULTING COMPANY HAS A LOWER PRESENCE IN IT; IT IS HARD TO FIND AN SME IN ALL THREE AREAS.

BUSINESS VISION APPENDIX

Page 32: Data Visions Big Data Visual Analytics Tool

PRATIBHA SINHA: MS PHYSICS, BIHAR UNIVERSITY; MBA IN INTERNATIONAL MARKETING FROM IGNU IN PATNA.

CORPORATE HIGHLIGHT: RECOGNIZED ECOMMERCE EXPERT WITH EXPERIENCE AT DIGITAL RIVER, SYMANTEC (NORTON PRODUCT) AND PACIFIC GAS AND ELECTRIC COMPANY IN THE BAY AREA. AFTER CORPORATE WORK, SHE ENJOYS EXPERIMENTING WITH INDIAN AND CHINESE SPICES.

NAVIN SINHA HAS AN MS IN AGRICULTURAL STATISTICS, STATISTICAL GENETICS AND DECISION SCIENCES (MBA). HE IS THE AUTHOR OF 12 PEER PAPERS AND ONE US PATENT.

CORPORATE HIGHLIGHT: EXPERIENCE AT SEVERAL BILLION-$ COMPANIES SUCH AS DSM FOOD SPECIALTY (6TH LARGEST EUROPEAN COMPANY), BEST BUY, WIPRO, UNITEDHEALTH GROUP AND VERISK HEALTH. NAVIN IS A RESPECTED FRAUD AND DATA MINING EXPERT IN INSURANCE AND SIMILAR VERTICALS. WHEN NOT WORKING ON CORPORATE PROJECTS, NAVIN ENJOYS APPLYING MATHEMATICAL GENETICS CONCEPTS TO BREED NEW VARIETIES OF TOMATO.

DOUBLE CHECK CONSULTING: dataVISIONS

Page 33: Data Visions Big Data Visual Analytics Tool

CONCLUSIONS

• The dataVISIONS Big Data Visual Analytics Tool was built by mocking up retail and banking data based on Navin and Pratibha Sinha’s corporate experience.

• Invited speaker at the American Statistical Association for cancer data mining (YouTube: 2009). Pratibha Sinha is an eCommerce expert.

• The tool achieves its objectives: unearthing hypotheses and unexpected data mining patterns in various dynamic US corporate systems. Would you like to know a tool that does all this?

• We are flexible about sharing growing pains to help build a customized visualization tool for a company; learning and collaboration will only improve dataVISIONS!

• Navin Sinha is an award-winning poet from Utah State University (1998); he took that level of creativity and imagination to come up with the dataVISIONS big data visual analytics tool. The material presented here is a very small sample of methods.

• Disclaimer: According to CA laws, this is proprietary technical marketing material of Navin Sinha and Pratibha Sinha (952-905-6636). They are not liable for unauthorized use.

•VPE: “Something for the money, and-more for the satisfaction!”