Light Customer Development Project
Data Analyst: Millie
Statistical Analyst: Christian
Market Research Analyst: Sarah
Marketing Strategist: Natvida
Summer 2013
CONTENT TABLE
Methodology:
•Objectives
•What variables we’ve chosen and why
•3 steps to analyze the data
Findings
•General key findings
•Validation of the findings
•Our best customer profile
Recommendations
•How can we reach new customers
•How can we maintain and enhance the profit level of existing customers
WHAT IS THE OBJECTIVE?
Profit is the fundamental goal of enterprises
-- Karl Marx
General rule: as usual, we use RFM (recency, frequency, monetary) as the most important indicator in our analysis
WHAT VARIABLES DID WE CHOOSE AND WHY
RFM DATA
• recency_from_sub_end
• frequency
• ACCUM_AMOUNT
BEHAVIORAL DATA
• lt_duration_yr
• channel_0
• PRODUCTS_RECENTLY_PURCHASE
• IB_BUY_Beauty_Cosmetic_Aids
• IB_BUY_Crafts_Hobbies
• IB_BUY_Electronics_Gadgets
DEMOGRAPHIC DATA
• AGE
• GENDER
• IB_NUMBER_OF_LIFETRAITS
• IB_OCCUPATION_CUSTOMER
• IB_OWN_RENT_HOME
• IB_MARITAL_STATUS
• IB_NET_WORTH_ESTIMATOR
• IB_PP_HOUSEHOLD_SIZE
• IB_PP_NUMBER_OF_CHILDREN
• STATE
• ZIP_CODE
• MAG_CUS_ACTIVE_MAGAZINES
• MAG_CUS_ORIG_MAGAZINE
• IB_VOTER_PARTY
• personicx_group
Why did we choose these data:
• Since we don’t have the exact time of the last purchase, recency_from_sub_end is the closest available attribute
• Accum_amount is more meaningful than monetary, since looking only at the last purchase is not representative
• Demographic, psychographic, and behavioral data are the 3 sets of data that marketers usually use to identify target customers
VARIABLES THAT WE ABANDONED AND WHY
INVALID DATA
• IB_WORKING_WOMAN_IN_HH
• DIGITAL_INDICATOR_FLAG
• LT_digital_title
• lt_digital_sub_ind
• lt_digital_print_auth_ind
UNNECESSARY DATA
• magnitude
• order_year_0
• order_month2_0
• order_week2_0
• MAG_CUS_ACCUM_AMOUNT
COMBINED DATA
• IB_PRESENCE_OF_CHILDREN
• IB_LT_Cat_Owner
• IB_LT_Dog_Owner
• IB_LT_Other_Pets
• norm_renew_score
• auto_renew_cc_score
• auto_renew_bill_score
• channel_internet_score
• channel_traditional_score
• EMAIL_PRESENCE_FLAG
• IB_LENGTH_OF_CURRENT_RESIDENCE
Why didn’t we choose these data:
• Life duration is used instead of order_year / month / week
• Mag_cus_accum_amount contains the same data as accum_amount
• The invalid data set has too many missing values
• Some data can be combined. E.g. the cat / dog / other pets owner flags can be combined into a single pet owner flag
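Combining the pet flags could look roughly like the sketch below. It is an illustration only: the helper name is_pet_owner and the set of values treated as "yes" are assumptions, since the actual coding of these columns is not shown in the deck.

```python
# Hypothetical sketch: collapse the three pet flags into one "pet owner" flag.
PET_COLUMNS = ("IB_LT_Cat_Owner", "IB_LT_Dog_Owner", "IB_LT_Other_Pets")

# ASSUMPTION: these strings mean "yes"; the real coding in the data set is unknown.
TRUE_VALUES = {"1", "Y", "YES", "TRUE"}


def is_pet_owner(record):
    """Return 1 if any of the three pet flags is set on the record, else 0."""
    for column in PET_COLUMNS:
        value = str(record.get(column, "")).strip().upper()
        if value in TRUE_VALUES:
            return 1
    return 0


# Made-up example records, not rows from the project data.
customers = [
    {"IB_LT_Cat_Owner": "Y", "IB_LT_Dog_Owner": "N", "IB_LT_Other_Pets": "N"},
    {"IB_LT_Cat_Owner": "N", "IB_LT_Dog_Owner": "N", "IB_LT_Other_Pets": ""},
]
pet_flags = [is_pet_owner(customer) for customer in customers]
```

One combined flag keeps the later analysis to a single pet variable instead of three sparse ones.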
STEP 1: RUN CORRELATION (GENERAL)
General rule:
• Correlations are run between the RFM data and the demographic / psychographic / behavioral (DPB) data
• We did not use ANOVA or regression on their own, since the RFM values may not be independent of each other
• First, we used MANOVA to test, for a given DPB variable, whether the RFM values are independent of each other
• Then we used discriminant function analysis to identify whether that DPB variable has an influence on RFM
• To save labor, we also used canonical analysis to run several DPB variables together
• Last but not least, we ran a regression between that DPB variable and RFM
STEP 1: TAKE LT DURATION YEAR AS AN EXAMPLE
SAS CODE FOR LT_DURATION & RFM
MANOVA shows that the RFM values are significantly independent of each other, while CANDISC indicates that at least one RFM factor is affected by lt_duration
The plot clearly shows that accum_amount and lt_duration are positively correlated, and the p-value indicates that the correlation is significant
The correlation coefficient between lt_duration and frequency is lower, but the two are still significantly related (p-value 0.0076, smaller than 0.05)
Recency is negatively correlated with lt_duration, as you can see from the graph. This is reasonable: the smaller the recency value, the earlier the customer renewed the magazine, and loyal customers tend to renew more actively
STEP 1: OTHER EXAMPLES
However, DPB variables are not always correlated with RFM. For example, recency has nothing to do with age. This makes sense: if the magazine is issued monthly and a customer subscribes for one year, it doesn’t matter whether the customer is 80 or 18 years old
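The correlation checks in Step 1 were run in SAS. As a rough illustration of what is being measured, the sketch below computes a Pearson correlation coefficient and the matching t statistic in plain Python. The function names and the sample numbers are made up for illustration and are not project data; turning the t statistic into a p-value still needs a t-distribution table or a statistics package.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(xs)
    if n != len(ys) or n < 3:
        raise ValueError("need two lists of the same length with at least 3 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return cov / math.sqrt(var_x * var_y)


def t_statistic(r, n):
    """t statistic for the null hypothesis of no correlation (n - 2 degrees of freedom)."""
    if abs(r) >= 1:
        return math.copysign(math.inf, r)
    return r * math.sqrt((n - 2) / (1 - r * r))


# Made-up illustration data, NOT values from the project.
lt_duration_yr = [1, 2, 3, 4, 5, 6]
accum_amount = [20, 35, 30, 60, 70, 65]

r = pearson_r(lt_duration_yr, accum_amount)
t = t_statistic(r, len(lt_duration_yr))
```

A positive r with a large t corresponds to the "positively correlated and significant" reading on the lt_duration slides, and an r near zero corresponds to the recency versus age case.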
STEP 1: OTHER CODINGS
For categorical data, use PROC GLM with a CLASS statement instead of regression to test the relationship
STEP 2: RFM VALUE CALCULATION
RECENCY
• If smaller than -1, rate it 5
• If smaller than -0.5, rate it 4
• If smaller than 0, rate it 3
• If equal to 0, rate it 2
• If smaller than 1, rate it 1
• If bigger than 1, rate it 0
FREQUENCY
• If smaller than 1, rate it 1
• If smaller than 2, rate it 2
• If smaller than 3, rate it 3
• If smaller than 4, rate it 4
MONETARY
• Divide accumulated amount by 10000
• If bigger than 7, use 7
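The scoring rules on this slide can be sketched as three small functions. This is only one reading of the slide: it assumes the six rules ending in ratings 5 to 0 belong to recency, the four rules with ratings 1 to 4 belong to frequency, and the divide-and-cap rule belongs to monetary. The function names are hypothetical, and cases the slide does not cover are marked in the comments.

```python
def recency_score(recency_from_sub_end):
    """Score recency: smaller (earlier renewal) values get higher scores."""
    r = recency_from_sub_end
    if r < -1:
        return 5
    if r < -0.5:
        return 4
    if r < 0:
        return 3
    if r == 0:
        return 2
    if r < 1:
        return 1
    # The slide says "bigger than 1"; ASSUMPTION: exactly 1 is scored the same.
    return 0


def frequency_score(frequency):
    """Score frequency with the thresholds 1, 2, 3, 4 from the slide."""
    if frequency < 1:
        return 1
    if frequency < 2:
        return 2
    if frequency < 3:
        return 3
    # ASSUMPTION: the slide lists no rule for 4 or more, so this sketch
    # keeps the top listed score of 4 for those values as well.
    return 4


def monetary_score(accum_amount):
    """Divide the accumulated amount by 10000 and cap the result at 7."""
    return min(accum_amount / 10000, 7)
```

For example, a customer with recency -0.75, frequency 2.5, and an accumulated amount of 25000 would be scored 4, 3, and 2.5 under these assumptions.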
Through SAS analysis, we found that 50% of the DPB variables are related to the monetary factor, 30% to frequency, and 20% to recency. We used these ratios as weights to calculate a new RFM value, which we call customer value. We then ran a pivot table analysis between customer value and the DPB data
Weights: monetary 50%, frequency 30%, recency 20%
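A minimal sketch of the weighted customer value, assuming it is a plain weighted sum of the three scores with the 50% / 30% / 20% ratios stated on this slide. The names WEIGHTS and customer_value are made up for illustration.

```python
# Weights taken from the slide: monetary 50%, frequency 30%, recency 20%.
WEIGHTS = {"monetary": 0.5, "frequency": 0.3, "recency": 0.2}


def customer_value(recency, frequency, monetary):
    """Weighted sum of the three RFM scores (ASSUMPTION: simple linear blend)."""
    return (
        WEIGHTS["recency"] * recency
        + WEIGHTS["frequency"] * frequency
        + WEIGHTS["monetary"] * monetary
    )


# Example: recency score 5, frequency score 4, monetary score 7.
example_value = customer_value(recency=5, frequency=4, monetary=7)
```

With those example scores the value is 0.2 × 5 + 0.3 × 4 + 0.5 × 7 = 5.7.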
STEP 3: CUSTOMER VALUE & DPB VALUE PIVOT TABLE ANALYSIS
We then ran a pivot table analysis of customer value against each variable to visualize the findings
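The pivot step amounts to averaging customer value within each category of a DPB variable. A rough sketch in plain Python follows; the helper pivot_mean, the field name customer_value, and the sample records are hypothetical and are not project data.

```python
from collections import defaultdict


def pivot_mean(records, group_key, value_key="customer_value"):
    """Average value_key within each distinct value of group_key."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for record in records:
        group = record[group_key]
        totals[group] += record[value_key]
        counts[group] += 1
    return {group: totals[group] / counts[group] for group in totals}


# Made-up illustration records, NOT project data.
records = [
    {"GENDER": "F", "IB_OWN_RENT_HOME": "own", "customer_value": 5.0},
    {"GENDER": "F", "IB_OWN_RENT_HOME": "rent", "customer_value": 3.0},
    {"GENDER": "M", "IB_OWN_RENT_HOME": "own", "customer_value": 2.0},
]

by_gender = pivot_mean(records, "GENDER")
by_home = pivot_mean(records, "IB_OWN_RENT_HOME")
```

Comparing the group averages (for example female versus male) is the kind of ratio reported in the key findings.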
KEY FINDINGS: 28-45 IS THE BEST AGE RANGE FOR TARGET CUSTOMERS
KEY FINDINGS: FEMALE CUSTOMERS ARE 15.6 TIMES MORE VALUABLE THAN MALE CUSTOMERS
KEY FINDINGS: CUSTOMERS WHO OWN THEIR HOUSE ARE 19.8 TIMES MORE VALUABLE THAN THOSE WHO RENT
KEY FINDINGS: MARRIED COUPLES ARE MORE VALUABLE THAN SINGLE CUSTOMERS
KEY FINDINGS: DEMOCRATS ARE MOST VALUABLE
If Independents generate 1 unit of RFM value, Democrats contribute 7, Republicans 17, and customers with no recorded party (vacancy) 10
Chart categories: Vacancy, Independent, Democrats, Republican
KEY FINDINGS: GEO DISTRIBUTION
KEY MARKETS WITH THE MOST VALUABLE CUSTOMERS:
• Tier 1: California, Texas, New York
• Tier 2: Illinois, Pennsylvania, Florida
• Tier 3: Arizona, Michigan, Ohio, Tennessee, Georgia, North Carolina
KEY FINDINGS: HIGH-VALUE ZIP CODES
The top zip code areas are as follows:
• New York City has the top 2, which is consistent with the geo distribution
• San Jose has relatively high-income residents due to its concentration of high-tech companies
• Lancaster has a well-educated population, as does Chicago, near Northwestern University
SAS VALIDATION
SPSS VALIDATION
Findings via Skype:
• 53% of its circulation is in the top ten U.S. metropolitan areas
• Average income is $109,877 (2009)
BEST CUSTOMER PROFILE: CAREER WOMAN CINDY
Gender: Female
Age: 40 years old
Marital Status: Married (no kids)
Occupation: “Occupation 4”
Income level: High
Area of residence: Manhattan, New York
Zip Code: 10021 (Upper East)
Owner of home
Political Affiliation: Democrat
Beliefs: Liberal (pro-choice, etc.)
Favorite News Channel: MSNBC
Hobbies and Interests: Traveling, Reading, Zumba, Art Museums, Jazz Clubs
RUN CUSTOMER DEMO/PSYCHO ON ETELMAR
ACQUISITION: TV COMMERCIAL
How to enhance the influence among the target audience most efficiently:
• TV advertising in California, New York and Texas
• Buy commercials in the TV programs our target customers are watching
CUSTOMER ACQUISITION: MAGAZINE
How to enhance the influence among the target audience most efficiently:
• Advertise in the top magazines our target audience is reading
• Use up-selling and cross-selling to increase profit
Copyright © 2007 Accenture. All rights reserved.
ACQUIRE NEW CUSTOMER: RADIO
How to enhance the influence among the target audience most efficiently:
• Radio is still one of the strongest influencers
• In-car radio during the commute is the best choice for our target audience
ACQUIRE NEW CUSTOMER: DIRECT MAIL
How to enhance the influence among the target audience most efficiently:
• Direct mail campaign in the top 6 zip code areas
• NCOA also provides a great opportunity to reach out to potential customers
ACQUIRE CUSTOMERS: OTHERS
Other ways to enlarge the customer database:
• Invite friends, earn credits
• Enable subscription in store
RETAIN LOYAL CUSTOMER: 5 KEY METHODS
How to retain customers:
• Loyalty program (offers and discounts, benefits)
• Renewal inserts in the magazine
• Surveys and questionnaires
• Event / experiential marketing
• Social media engagement