All Powder Board and Ski
Microsoft Access Workbook
Chapter 9: Data Warehouses and Data Mining
Jerry Post
Copyright © 2007

Desired Sales Cube Dimensions

Sales Dimensions

State (ship)
Month
Category
Style
SkillLevel
Size
Color
Manufacturer
BindingStyle
WeightMax?
ItemMaterial?
WaistWidth?

Early Data: Spreadsheets

Action

Choose File/Get External Data/Import
Read the Sales spreadsheet into a new table
Read the Rental data into another new table
Create new entries for CustomerID and EmployeeID equal to 0

Create Customer and Employee

CustomerID and EmployeeID are missing from the old data.
Instead of relying on blank cell values, create a new customer called “Walk-in” and a new employee called “Staff”.
Write down the ID numbers generated for these anonymous entries.
If you use SQL, you can assign a value of zero to these entries.

INSERT INTO Customer (CustomerID, LastName)
VALUES (0, 'Walk-in');

INSERT INTO Employee (EmployeeID, LastName)
VALUES (0, 'Staff');
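
The placeholder rows can be tried outside Access with a minimal sketch like the one below. It assumes Python's built-in sqlite3 module as a stand-in for the Access database, and the two-column table definitions are simplified assumptions, not the workbook's full schema.

```python
import sqlite3

# In-memory SQLite database standing in for the Access file (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customer (CustomerID INTEGER PRIMARY KEY, LastName TEXT)")
conn.execute("CREATE TABLE Employee (EmployeeID INTEGER PRIMARY KEY, LastName TEXT)")

# Explicit ID 0 for the anonymous placeholder rows, as in the slide's SQL.
conn.execute("INSERT INTO Customer (CustomerID, LastName) VALUES (0, 'Walk-in')")
conn.execute("INSERT INTO Employee (EmployeeID, LastName) VALUES (0, 'Staff')")

# An ordinary customer added later still receives an automatically assigned ID.
conn.execute("INSERT INTO Customer (LastName) VALUES ('Smith')")  # hypothetical customer

customers = conn.execute(
    "SELECT CustomerID, LastName FROM Customer ORDER BY CustomerID"
).fetchall()
employees = conn.execute(
    "SELECT EmployeeID, LastName FROM Employee ORDER BY EmployeeID"
).fetchall()
print(customers)
print(employees)
```

Using a fixed ID of 0 means the imported rows can reference the placeholder without looking up a generated key first.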

Extract Model Data

SELECT DISTINCT OldSales.ModelID, OldSales.ManufacturerID, OldSales.Category, OldSales.Color, OldSales.ModelYear, OldSales.Graphics, OldSales.ItemMaterial, OldSales.ListPrice, OldSales.Style, OldSales.SkillLevel, OldSales.WeightMax, OldSales.WaistWidth, OldSales.BindingStyle
FROM OldSales;

Action

Create a new query in Design view
Retrieve DISTINCT data from the new sales table
Retrieve DISTINCT data from the new rental table
Connect the two queries with a UNION
Save the query

UNION Query for Models

SELECT DISTINCT ModelID, ManufacturerID, Category, …

FROM OldSales

UNION

SELECT DISTINCT ModelID, ManufacturerID, Category, …

FROM OldRentals
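
A small sketch of how the UNION merges and de-duplicates model rows from both sources. It assumes SQLite (through Python's sqlite3 module) in place of Access, keeps only the three columns named on the slide, and uses made-up sample rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE OldSales   (ModelID INTEGER, ManufacturerID INTEGER, Category TEXT);
CREATE TABLE OldRentals (ModelID INTEGER, ManufacturerID INTEGER, Category TEXT);
INSERT INTO OldSales   VALUES (1, 10, 'Board'), (1, 10, 'Board'), (2, 11, 'Ski');
INSERT INTO OldRentals VALUES (2, 11, 'Ski'), (3, 12, 'Boots');
""")

# UNION (without ALL) drops rows that appear in both sources,
# so model 2 is listed once even though it was both sold and rented.
models = conn.execute("""
SELECT DISTINCT ModelID, ManufacturerID, Category FROM OldSales
UNION
SELECT DISTINCT ModelID, ManufacturerID, Category FROM OldRentals
ORDER BY ModelID
""").fetchall()
print(models)
```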

Action

Create a new query that retrieves DISTINCT values from the saved UNION query
Verify that it works
Add an INSERT INTO statement above the SELECT statement to copy the data to the ItemModel table
Run the query
Use a similar process to add SKU, ModelID, and Size to the Inventory table
Follow a similar process to copy the Sales, Rental, SalesItem, and RentalItems tables

Insert Model Data into ItemModel

INSERT INTO ItemModel (ModelID, ManufacturerID, Category, Color, ModelYear, Graphics, ItemMaterial, ListPrice, Style, SkillLevel, WeightMax, WaistWidth, BindingStyle)
SELECT DISTINCT qryOldModels.ModelID, qryOldModels.ManufacturerID, qryOldModels.Category, qryOldModels.Color, qryOldModels.ModelYear, qryOldModels.Graphics, qryOldModels.ItemMaterial, qryOldModels.ListPrice, qryOldModels.Style, qryOldModels.SkillLevel, qryOldModels.WeightMax, qryOldModels.WaistWidth, qryOldModels.BindingStyle
FROM qryOldModels;

Insert SKU Data into Inventory

INSERT INTO Inventory (ModelID, SKU, Size, QuantityOnHand)
SELECT DISTINCT qryOldInventory.ModelID, qryOldInventory.SKU, qryOldInventory.Size, 0 AS QuantityOnHand
FROM qryOldInventory;

Note the constant 0 with the column alias QuantityOnHand, which forces a zero value for QuantityOnHand in every inserted row.
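
The INSERT INTO ... SELECT pattern with a constant column can be sketched as follows. This is an illustration under assumptions: SQLite replaces Access, qryOldInventory is modeled as a plain table instead of a saved query, and the SKU values are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE qryOldInventory (ModelID INTEGER, SKU TEXT, Size REAL);
CREATE TABLE Inventory (ModelID INTEGER, SKU TEXT PRIMARY KEY,
                        Size REAL, QuantityOnHand INTEGER);
INSERT INTO qryOldInventory VALUES
    (1, 'A-150', 150), (1, 'A-150', 150), (1, 'A-160', 160);
""")

# DISTINCT removes the repeated SKU; the literal 0 supplies QuantityOnHand,
# which does not exist in the old data.
conn.execute("""
INSERT INTO Inventory (ModelID, SKU, Size, QuantityOnHand)
SELECT DISTINCT ModelID, SKU, Size, 0 AS QuantityOnHand
FROM qryOldInventory
""")

inventory = conn.execute(
    "SELECT SKU, QuantityOnHand FROM Inventory ORDER BY SKU"
).fetchall()
print(inventory)
```

Without DISTINCT, the duplicated source row would violate the primary key on SKU in this sketch, which is why the de-duplication step comes before the insert.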

Copy Sales Data

INSERT INTO Sales (SaleID, SaleDate, ShipState, ShipZIP, PaymentMethod)
SELECT DISTINCT OldSales.SaleID, OldSales.SaleDate, OldSales.ShipState, OldSales.ShipZIP, OldSales.PaymentMethod
FROM OldSales;

Note that if you have already added data to your Sales table, your existing SaleID values might conflict with the imported ones.

You can solve the problem by adding a constant to the imported values so that they are all larger than your highest existing ID.

INSERT INTO Sales (SaleID, SaleDate, ShipState, ShipZIP, PaymentMethod)
SELECT DISTINCT OldSales.SaleID+5000, OldSales.SaleDate, OldSales.ShipState, OldSales.ShipZIP, OldSales.PaymentMethod
FROM OldSales;

Copy SaleItem Rows

INSERT INTO SaleItem (SaleID, SKU, QuantitySold, SalePrice)
SELECT DISTINCT OldSales.SaleID+5000, OldSales.SKU, OldSales.QuantitySold, OldSales.SalePrice
FROM OldSales;

If you transformed SaleID in the prior step when copying the sales data, you must apply exactly the same calculation to SaleID in the SaleItem table. Otherwise the item rows will not match their parent sales.
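
The following sketch shows why the same offset has to be applied to both tables. It assumes SQLite in place of Access, reduced column lists, and invented sample rows. The constant OFFSET mirrors the 5000 used on the slides.

```python
import sqlite3

OFFSET = 5000  # must exceed the highest SaleID already in Sales

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE OldSales (SaleID INTEGER, SaleDate TEXT, SKU TEXT,
                       QuantitySold INTEGER, SalePrice REAL);
CREATE TABLE Sales    (SaleID INTEGER PRIMARY KEY, SaleDate TEXT);
CREATE TABLE SaleItem (SaleID INTEGER, SKU TEXT,
                       QuantitySold INTEGER, SalePrice REAL);
INSERT INTO Sales VALUES (1, '2004-01-05');
INSERT INTO OldSales VALUES
    (1, '2001-01-15', 'A-150', 1, 300.0),
    (1, '2001-01-15', 'B-27',  2, 45.0),
    (2, '2001-02-03', 'A-160', 1, 320.0);
""")

# The existing SaleID 1 would collide with old SaleID 1, so shift the old IDs.
conn.execute(
    "INSERT INTO Sales (SaleID, SaleDate) "
    "SELECT DISTINCT SaleID + ?, SaleDate FROM OldSales", (OFFSET,))
# Apply the identical shift to the item rows so they still match their sale.
conn.execute(
    "INSERT INTO SaleItem (SaleID, SKU, QuantitySold, SalePrice) "
    "SELECT DISTINCT SaleID + ?, SKU, QuantitySold, SalePrice FROM OldSales",
    (OFFSET,))

sale_ids = [row[0] for row in conn.execute("SELECT SaleID FROM Sales ORDER BY SaleID")]
# Count item rows whose SaleID has no matching row in Sales.
orphans = conn.execute("""
SELECT COUNT(*) FROM SaleItem
LEFT JOIN Sales ON Sales.SaleID = SaleItem.SaleID
WHERE Sales.SaleID IS NULL
""").fetchone()[0]
print(sale_ids, orphans)
```

The orphan count is a quick check worth running after the real import as well: any nonzero result means the two calculations differ.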

Action

Create a new query in Design view
Tables: Sale, SaleItem, Inventory, ItemModel, Manufacturer
Columns: ShipState, PaymentMethod, SaleMonth with a format of “yyyy-mm”, and Value = QuantitySold*SalePrice
Sum the Value column
Test the query and save it

Query for PivotTable

Include all desired sale dimensions

Compute Value as quantity times price

Format SaleDate as year and month: yyyy-mm
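
A rough equivalent of the PivotTable source query, written for SQLite so it can be run without Access. Assumptions: only the sales and item tables are joined (the workbook query also brings in Inventory, ItemModel, and Manufacturer), the rows are invented, and SQLite's strftime('%Y-%m', ...) plays the role of the Access "yyyy-mm" format.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Sales (SaleID INTEGER PRIMARY KEY, SaleDate TEXT,
                    ShipState TEXT, PaymentMethod TEXT);
CREATE TABLE SaleItem (SaleID INTEGER, SKU TEXT,
                       QuantitySold INTEGER, SalePrice REAL);
INSERT INTO Sales VALUES
    (1, '2001-01-15', 'CO', 'Cash'),
    (2, '2001-01-20', 'CO', 'Cash'),
    (3, '2001-02-03', 'UT', 'Visa');
INSERT INTO SaleItem VALUES
    (1, 'A-150', 1, 300.0), (1, 'B-27', 2, 45.0),
    (2, 'A-160', 1, 320.0), (3, 'A-150', 1, 310.0);
""")

# One row per dimension combination, with Value summed as quantity times price.
pivot_source = conn.execute("""
SELECT ShipState, PaymentMethod,
       strftime('%Y-%m', SaleDate) AS SaleMonth,
       SUM(QuantitySold * SalePrice) AS Value
FROM Sales
INNER JOIN SaleItem ON Sales.SaleID = SaleItem.SaleID
GROUP BY ShipState, PaymentMethod, SaleMonth
ORDER BY ShipState, SaleMonth
""").fetchall()
for row in pivot_source:
    print(row)
```

Storing the month as text in yyyy-mm order keeps the columns sorted chronologically when the PivotTable sorts them alphabetically.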

PivotTable Form Wizard

Action

Choose Insert/Form
Select AutoForm: PivotTable
Select the saved query
Drag the SaleMonth field and drop it on the column fields position
Drag the Category and ShipState fields and drop them on the row fields position
Drag the Value field and drop it in the middle Detail section
Drag the other fields and drop them at the top as filter fields for future use
Scroll the main grid to the far right
Drop the Value field in the Totals column

PivotTable Screen

Place columns (month)

Place rows (State, Category)

Place Value last

Place other columns

Action

Create quarterly groups
Click on the first month and right click on the third month to highlight them
Right click and select the Group/Group Items option
Right click on the new cell SaleMonth1 and change its Caption property to Quarter
Right click on the Group1 cell and set its caption to 2001-Q1
Repeat the process for all quarters
Use a similar process to group the four years
Use the Collapse option to reduce the grid display
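
The manual grouping above amounts to mapping each yyyy-mm label to a quarter label and summing within each group. A minimal sketch of that mapping follows; the function name quarter_label and the monthly figures are hypothetical.

```python
from collections import defaultdict


def quarter_label(sale_month: str) -> str:
    """Map a 'yyyy-mm' label to the 'yyyy-Qn' caption used for the groups."""
    year, month = sale_month.split("-")
    quarter = (int(month) - 1) // 3 + 1  # months 1-3 -> Q1, 4-6 -> Q2, ...
    return f"{year}-Q{quarter}"


# Hypothetical monthly totals keyed by SaleMonth.
monthly_value = {
    "2001-01": 100.0,
    "2001-02": 150.0,
    "2001-03": 125.0,
    "2001-04": 80.0,
}

# Sum the monthly values into their quarters.
quarterly_value = defaultdict(float)
for sale_month, value in monthly_value.items():
    quarterly_value[quarter_label(sale_month)] += value

print(dict(quarterly_value))
```

Computing the label in the query itself is an alternative to grouping by hand in the PivotTable, at the cost of one more column.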

PivotTable

Right click to select all columns and choose Hide Details

PivotTable Groups

Action

Create a new Excel spreadsheet
Choose Data/PivotTable and PivotChart
Select External data source
Click the Get Data button
Browse to your database and select the query built in the previous section
Create a PivotTable with the SaleMonth field as rows, the Category field as columns, and Value as the Detail

Action

Right click on the PivotTable and select the PivotChart option
In the PivotChart worksheet, remove all categories except Boards, Skis, Boots, and Clothes
Right click to set the Chart Type as a Line Chart
Right click each series and Add Trendline
Choose a linear trend and forecast it for 3 periods ahead
Set properties to improve the chart’s appearance
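
A linear trendline is a least-squares line fitted against the period number, and the forecast simply extends that line. The sketch below shows the arithmetic under that assumption; the function names and the sample series are hypothetical, and it is not a reproduction of Excel's charting code.

```python
def linear_trend(values):
    """Least-squares slope and intercept of values against periods 1..n."""
    n = len(values)
    periods = range(1, n + 1)
    mean_x = sum(periods) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in periods)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(periods, values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def forecast(values, periods_ahead=3):
    """Extend the fitted line past the last observed period."""
    slope, intercept = linear_trend(values)
    n = len(values)
    return [intercept + slope * (n + k) for k in range(1, periods_ahead + 1)]


# Hypothetical monthly sales values for one category.
board_sales = [1200.0, 1350.0, 1280.0, 1500.0, 1620.0, 1580.0]
print(linear_trend(board_sales))
print(forecast(board_sales, periods_ahead=3))
```

A straight line ignores seasonality, which matters for ski and board sales, so the three-period forecast is best read as a rough direction and not as a prediction.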

Action

Make sure the Data Analysis Add-Ins are installed in Excel (Tools/Add-Ins)
Choose Tools/Data Analysis
Select T-Test Two-sample with unequal variances
Compare the Board and Ski sales
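
The two-sample test with unequal variances is commonly known as Welch's t-test. As a hedged illustration of what the tool computes, the sketch below derives the t statistic and the approximate degrees of freedom with the standard library only; it does not compute the p-value that Excel also reports, and the sample figures are invented.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    # Squared standard error contributed by each sample.
    va = variance(sample_a) / len(sample_a)
    vb = variance(sample_b) / len(sample_b)
    t_stat = (mean(sample_a) - mean(sample_b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (
        va ** 2 / (len(sample_a) - 1) + vb ** 2 / (len(sample_b) - 1)
    )
    return t_stat, df


# Hypothetical monthly sales values for the two categories.
board_sales = [1200.0, 1350.0, 1280.0, 1500.0, 1620.0, 1580.0]
ski_sales = [900.0, 1100.0, 1700.0, 2100.0, 800.0, 1900.0]
print(welch_t(board_sales, ski_sales))
```

The unequal-variance form is the safer choice here because nothing suggests that board and ski sales vary by the same amount from month to month.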

Time Series Analysis: Excel

Action

If you have Microsoft MapPoint:
Start a new Excel worksheet
Create a PivotTable with rows for ShipState and columns for Board, Boots, Clothes, and Skis
Create a second worksheet that copies the main data and titles from the PivotTable without the first title rows
Save the file
Insert a MapPoint chart
Follow the wizard instructions to add a North American map with pie charts

GIS: Microsoft MapPoint

The PivotTable places the data into rows and columns

A dynamic copy of this sheet is used to remove the top rows

MapPoint Data Wizard

GIS Analysis of Sales

Action

Use the database command File/Get External Data/Import to read the demographic spreadsheet data
Create a query to combine the sales data by state with the demographic data
Copy and paste the three columns of data into a new worksheet
In Excel, choose Tools/Data Analysis/Regression
Select the Value column as the Y-range and the population and income columns as the X-range
Check the option that marks the top row as labels
Run the regression
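
For readers who want to see what the regression tool estimates, here is a minimal ordinary least squares sketch with an intercept and two predictors, solved through the normal equations in plain Python. The helper names solve and ols, the units, and every data value are assumptions for illustration. The sketch reports coefficients and R-square only, not the significance tests that Excel adds.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols(rows, y):
    """Return ([intercept, b1, b2, ...], R-square) for predictors in rows."""
    x = [[1.0, *row] for row in rows]  # leading 1.0 is the intercept term
    k = len(x[0])
    xtx = [[sum(xi[i] * xi[j] for xi in x) for j in range(k)] for i in range(k)]
    xty = [sum(xi[i] * yi for xi, yi in zip(x, y)) for i in range(k)]
    coef = solve(xtx, xty)
    fitted = [sum(c * v for c, v in zip(coef, xi)) for xi in x]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return coef, 1 - ss_res / ss_tot


# Hypothetical per-state rows: (population in millions, income in thousands).
predictors = [(5.0, 48.0), (2.8, 41.0), (1.1, 39.0), (0.6, 52.0), (3.9, 45.0), (1.5, 43.0)]
sales_value = [210.0, 120.0, 50.0, 30.0, 160.0, 70.0]
coefficients, r_square = ols(predictors, sales_value)
print(coefficients, r_square)
```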

Sales by State for Regression

Note that some states are missing from the list.

Regression Setup

Include the label row in the selected ranges, and be sure to check the Labels box so that Excel treats that row as column names and not as data.

Regression Results

Relatively high R-square

Population is a significant predictor; income is not.