SQL Server 2005_Data Mining

  • 8/12/2019 SQL Server 2005_Data Mining

    SQL Server 2005: Data Mining
    Microsoft Virtual Labs

    Table of Contents

    SQL Server 2005: Data Mining
    Exercise 1 Lab Setup
    Exercise 2 Creating Decision Tree and Naïve Bayes Data Mining Models
    Exercise 3 Viewing Mining Accuracy Charts
    Exercise 4 Creating a Prediction Query

    Objectives
    After completing this lab, you will be better able to:
    • Create Decision Tree and Naïve Bayes Data Mining Models
    • View Mining Accuracy Charts
    • Create a Prediction Query
    • Model Time Series

    Estimated Time to Complete This Lab
    90 Minutes

    Computer used in this Lab
    SQL BI

    Exercise 1
    Lab Setup

    Scenario
    In this part of the lab you will set up the views you will work with in the rest of the lab.

    Tasks Detailed Steps

    Complete the following task on: SQL BI

    1. Create the Views

    Note: Log on to the server with the following credentials:
    User name: Administrator
    Password: Pass@word1

    a. From the Windows task bar, select Start | All Programs | Microsoft SQL Server 2005 | SQL Server Management Studio.
    b. In the Connect to Server dialog, make sure that Database Engine is selected in the Server type drop-down list box. Enter localhost in the Server name text box and select Windows Authentication in the Authentication drop-down list box, as in Figure 1. Click Connect.

    Figure 1: Connect to Server Dialog

    c. Select File | Open | File.
    d. Navigate to the C:\MSLabs\SQL Server 2005\Lab Projects\Data Mining Lab\DM Setup directory, and select the ViewCreation.sql file. Click Open.
    e. Click Connect in the Connect to Server dialog that appears.
    f. Execute the script by pressing F5, or by clicking the Execute icon in the toolbar, as shown in Figure 2.

    Figure 2: Execute Script

    g. When the script has executed successfully, select the File | Exit menu item to close SQL Server Management Studio.

    Exercise 2
    Creating Decision Tree and Naïve Bayes Data Mining Models

    Scenario
    The management at Adventure Works wants to analyze purchasing decisions based on customer demographics.

    Analysis Services has improved data mining functionality, providing the following data mining techniques:
    • Microsoft Association Rules
    • Microsoft Clustering
    • Microsoft Decision Trees
    • Microsoft Naïve Bayes
    • Microsoft Neural Network
    • Microsoft Sequence Clustering
    • Microsoft Time Series

    In this exercise, you will develop an Analysis Services solution using the Microsoft Business Intelligence Development Studio environment. Business Intelligence Development Studio is based on the Microsoft Visual Studio 2005 environment.

    Business Intelligence Development Studio provides you with an integrated development environment for designing, testing, editing, and deploying projects to the Analysis Server. You will create and view a data mining structure with Decision Trees and Naïve Bayes data mining models using AdventureWorksDW customer data.

    To create and view data mining models, you will:
    • Create an Analysis Services project in the Business Intelligence Development Studio environment.
    • Create a data source and data source view.
    • Create a data mining structure and decision trees data mining model using the Data Mining Wizard.
    • Create a related mining model (Naïve Bayes) in the Mining Models view.
    • Deploy the Analysis Services solution.
    • Explore the data mining models using the Mining Model Viewer.

    Tasks Detailed Steps

    Complete the following 16 tasks on: SQL BI

    1. Create an Analysis Services Project
    a. From the Windows task bar, select Start | All Programs | Microsoft SQL Server 2005 | SQL Server Business Intelligence Development Studio.
    b. Select File | New | Project.
    c. In the New Project dialog box, in the Project Types pane, click the Business Intelligence Projects folder.
    d. In the Templates pane, click the Analysis Services Project icon.
    e. In the Name text box, type DM Exercise 1.
    f. In the Location text box, enter C:\MSLabs\SQL Server 2005\User Projects\.
    g. Uncheck the Create directory for Solution checkbox. Figure 1 shows how the New Project dialog box should look once you're done.
    h. Click OK.

    Figure 1: New Project Dialog

    Note: The project is created in a new solution. The solution is the largest unit of management in the Business Intelligence Development Studio environment, and each solution contains one or more projects. An Analysis Services Project is a group of related files containing the XML code for all of the objects in an Analysis Services database.

    Note: You can view the solution and its projects in the Solution Explorer pane on the right-hand side of Business Intelligence Development Studio. If the Solution Explorer is not visible, you can display it by selecting the View | Solution Explorer menu item (or with the keyboard shortcut Ctrl + Alt + L).

    2. Set the Deployment Mode Property
    a. In the Solution Explorer window, right-click the DM Exercise 1 project, and select Properties from the context menu.
    b. In the DM Exercise 1 Property Pages dialog box, under the Configuration Properties folder, click Deployment.
    c. In the right pane, click the Deployment Mode property. In the Deployment Mode drop-down list, click DeployAll, and then click OK.

    Note: You can configure the build, debugging, and deployment properties of an Analysis Services project.

    3. Create a Data Source
    a. In the Solution Explorer pane, under the DM Exercise 1 project, right-click the Data Sources folder, and then select New Data Source from the context menu.
    b. In the Data Source Wizard dialog box, on the Welcome to the Data Source Wizard page, click Next.

    Note: If the Data connections pane already includes localhost.AdventureWorksDW, skip to step k.

    c. On the Select how to define the connection page, make sure the Create a data source based on an existing or new connection radio button is chosen. Click New.
    d. In the Connection Manager dialog box, select the SqlClient Data Provider from the .Net Providers folder in the Provider drop-down combo box at the top of the page.
    e. In the Server name drop-down list, type localhost.

    f. Under Log on to the server, click Use Windows Authentication.
    g. In the Select or enter a database name drop-down list, click AdventureWorksDW.
    h. Click Test Connection.
    i. Click OK to dismiss the message box.
    j. In the Connection Manager dialog box, click OK.
    k. In the Data Source Wizard dialog box, on the Select how to define the connection page, verify that localhost.AdventureWorksDW is selected, and click Next.
    l. On the Impersonation Information page, check the Default checkbox and click Next.
    m. On the Completing the Data Source Wizard page, leave the default Data source name Adventure Works DW unchanged, and then click Finish.

    Note: You have now set up the information needed to connect to the database you are working with. The next step is to define the schema information you want to use in the solution. You do this through the Data Source View.

    4. Create a Data Source View
    a. In the Solution Explorer pane, under the DM Exercise 1 project, right-click the Data Source Views folder, and then select New Data Source View from the context menu.
    b. In the Data Source View Wizard dialog box, on the Welcome to the Data Source View Wizard page, click Next.
    c. On the Select Data Source page, in the Relational data sources pane, verify that Adventure Works DW is selected, and then click Next.

    Note: At this point, Analysis Services may take a few moments to read the database schema.

    d. In this project, your Data Source View is not going to be based on a table; instead, it will be based on a view. On the Select Tables and Views page, double-click vDMLabCustomerTrain to add this view to the Included objects list.

    Note: You may need to expand the Name column, and/or the entire dialog box, in order to be able to select vDMLabCustomerTrain.

    e. Click Next.
    f. On the Completing the Wizard page, in the Name text box, type Customers and then click Finish. The Data Source View Designer will open. The Data Source View Designer is a graphical representation of the data schema you have defined.
    g. Right-click the vDMLabCustomerTrain table and then click Explore Data, as in Figure 2.

    Figure 2: Explore Data

    Note: Analysis Services may take a few moments to read the data.

    h. This opens a new tab in which you can view the data for the table. If you like, you can make the tab into a dockable floating window instead. To do this, right-click the tab header and choose Floating or Dockable.
    i. In the Explore vDMLabCustomerTrain Table window, scroll to view the data, and then click the X in the upper right-hand corner, as in Figure 3, to close the window.

    Figure 3: Explore Table Window

    Note: A Data Source View contains data source schema information. As shown here, you do not have to base the Data Source View on tables: you can use views as well.

    5. Create a Data Mining Structure
    a. In the Solution Explorer pane, under the DM Exercise 1 project, right-click the Mining Structures folder, and then select New Mining Structure from the context menu.
    b. In the Data Mining Wizard, on the Welcome to the Data Mining Wizard page, click Next.

    Note: The Data Mining Wizard is the starting point for all data mining operations.

    c. On the Select the Definition Method page, click From existing relational database or data warehouse and then click Next.
    d. On the Select the Data Mining Technique page, in the Which data mining technique do you want to use? drop-down list, verify that Microsoft Decision Trees is selected, and then click Next.
    e. On the Select Data Source View page, in the Available data source views pane, verify that the Customers data source view is selected, and then click Next.
    f. On the Specify Table Types page, in the Input tables pane, in the vDMLabCustomerTrain row, verify that the Case check box is selected, and then click Next.
    g. On the Specify the Training Data page, in the Mining model structure pane, select or deselect each cell by clicking on the check box as shown in Figure 4.

    Figure 4: Specifying Columns for Analysis

    Note: Because CustomerKey is the primary key of the source table, the Data Mining Wizard has automatically selected it as the key. The key identifies the cases in the mining model.

    Note: The CustomerKey, FirstName, and LastName columns should not be selected as Input or Predictable columns.

    h. Click Next.
    i. On the Specify Columns Content and Data Type page, click Next.
    j. On the Completing the Wizard page, in the Mining Structure Name text box, type Customers and check the Allow drill through check box, and then click Finish. The Mining Structure designer will open as in Figure 5.

    Figure 5: The Mining Structure

    Note: A data mining structure may contain multiple data mining models. Each data mining model uses a subset of the data referenced by the data mining structure. When the data mining structure is processed, the source data is queried once and then all of the data mining models are processed in parallel.

    6. Add and edit columns in the Mining Structure
    a. In the Mining Structure tree view on the left side of the designer window, right-click Columns, and then click Add a Column.
    b. In the Select a Column dialog box, in the Source column tree view, select the Age column, and then click OK.
    c. An alert will appear indicating that you already have an Age column selected. Click Yes to approve and dismiss the dialog box.
    d. In the Mining Structure tree view, right-click the Age 1 column, and then click Properties.
    e. In the Properties window, in the Content property drop-down list, select Discretized.

    Note: When you change the Content property to Discretized, the server automatically determines discrete ranges for the column.

    f. In the Properties window, in the Name property text box, type Age Discretized, and then press Enter.
    g. An alert will appear confirming that you want to change the name for all related columns. Click Yes to approve and dismiss the dialog box.
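To make the note in step e concrete, here is a minimal Python sketch of one way a continuous column such as Age could be split into discrete ranges. It assumes a simple equal-frequency split; the lab does not say which method the server actually applies, and the function names and sample ages below are invented for illustration.

```python
def equal_frequency_edges(values, buckets):
    """Return upper edges so each bucket holds roughly the same number of values."""
    ordered = sorted(values)
    n = len(ordered)
    edges = []
    for i in range(1, buckets):
        cut = (i * n) // buckets        # number of values in the first i buckets
        edges.append(ordered[cut - 1])  # largest value that still belongs to bucket i-1
    return edges


def bucket_of(value, edges):
    """Return the index of the first bucket whose upper edge is >= value."""
    for index, edge in enumerate(edges):
        if value <= edge:
            return index
    return len(edges)  # anything above the last edge falls in the final bucket


# Invented sample ages, not taken from vDMLabCustomerTrain.
ages = [23, 25, 31, 34, 38, 41, 45, 52, 58, 67]
edges = equal_frequency_edges(ages, 5)
labels = [bucket_of(age, edges) for age in ages]
print(edges)   # upper edge of each of the first four buckets
print(labels)  # bucket index assigned to each age
```

Each age is replaced by a bucket index, which is the kind of discrete value an algorithm that cannot handle continuous inputs can still use.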

    7. Rename the Mining Model

    a. Select the Mining Models tab to view information about the model as in Figure 6.

    Figure 6: The Mining Models View

    Note: The column next to the Structure column may be called something other than Customers.

    b. In the Mining Models grid, right-click the second column's heading, and then click Properties.
    c. In the Properties window, in the Name property text box, type Customers DT to rename the mining model, and then press Enter.

    Note: Step c renames the Decision Tree mining model, but does not rename the mining model structure.

    8. Create a Related Mining Model
    a. Click the Create a Related Mining Model icon on the Mining Models icon bar, as shown in Figure 7.

    Figure 7: The Create a Related Mining Model icon

    b. In the Model Name text box, type Customers NB.
    c. In the New Mining Model dialog box, in the Algorithm Name drop-down list, click Microsoft Naive Bayes and click OK.

    d. When the alert appears confirming that you want to use the Microsoft Naive Bayes algorithm and that some columns will be ignored, click Yes to approve and dismiss the dialog box.

    Note: The Naïve Bayes algorithm does not support continuous columns. Therefore, the Age column will be ignored in this mining model. Instead, you will use the Age Discretized column.

    e. Click the Age Discretized cell in the Customers NB column (its content is currently Ignore) and, in the cell's drop-down list, select Input, as in Figure 8.

    Figure 8: Changing Usage of a Mining Model Column

    f. You should now have an end result as shown in Figure 9.

    Figure 9: The Customers Mining Model
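The note in step d says Naïve Bayes needs discrete inputs. The sketch below shows why that is natural: a basic Naïve Bayes classifier only counts how often each attribute value occurs with each outcome. This is a generic textbook version with add-one smoothing, not the Analysis Services implementation, and the column names and rows are invented toy data.

```python
from collections import Counter, defaultdict


def train(rows, target):
    """Count class frequencies and per-class attribute value frequencies."""
    class_counts = Counter(row[target] for row in rows)
    value_counts = defaultdict(Counter)  # (class, attribute) -> value counts
    domains = defaultdict(set)           # attribute -> distinct values seen
    for row in rows:
        for attribute, value in row.items():
            if attribute != target:
                value_counts[(row[target], attribute)][value] += 1
                domains[attribute].add(value)
    return class_counts, value_counts, domains


def predict(model, case):
    """Return normalized class probabilities for one case."""
    class_counts, value_counts, domains = model
    total = sum(class_counts.values())
    scores = {}
    for label, count in class_counts.items():
        score = count / total  # prior probability of the class
        for attribute, value in case.items():
            seen = value_counts[(label, attribute)][value]
            # Add-one (Laplace) smoothing keeps unseen values from zeroing the score.
            score *= (seen + 1) / (count + len(domains[attribute]))
        scores[label] = score
    norm = sum(scores.values())
    return {label: score / norm for label, score in scores.items()}


# Invented toy rows; "AgeBucket" stands in for a discretized age column.
rows = [
    {"AgeBucket": "30-39", "Commute": "short", "BikeBuyer": "Yes"},
    {"AgeBucket": "30-39", "Commute": "short", "BikeBuyer": "Yes"},
    {"AgeBucket": "60+", "Commute": "long", "BikeBuyer": "No"},
    {"AgeBucket": "60+", "Commute": "short", "BikeBuyer": "No"},
]
model = train(rows, "BikeBuyer")
probs = predict(model, {"AgeBucket": "30-39", "Commute": "short"})
print(probs)
```

A raw numeric age would almost never repeat exactly, so the counts would be useless; a bucketed age gives each range enough cases to count.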

    9. Deploy the Analysis Services Solution
    a. Select the Build | Deploy DM Exercise 1 menu item.

    Note: The deployment progress is shown in the Deployment Progress window, normally on the right-hand side of Business Intelligence Development Studio, as in Figure 10. The Deployment Progress pane gives you detailed information about what happens during deployment. Figure 11 displays the results of a successful deployment.

    Figure 10: The Deployment Progress window showing a deployment starting.

    Figure 11: The Deployment Progress Pane showing successful deployment.

    Note: Analysis Services may take a while to process the data mining models.

    Note: The Analysis Services project is automatically saved when you click Deploy Solution.

    Note: In the above procedures, various wizards and editors have been creating XML code based on your input. Deployment sends the XML code to the Analysis Server and then processes the Analysis Services database.

    10. View the Customers DT Mining Model Decision Tree
    a. On the tabs above the designer window, click the Mining Model Viewer tab.

    Note: If an alert appears indicating that changes have been made, click No.

    b. In the Mining Model drop-down list, select Customers DT.
    c. Select View | Full Screen (or press Shift+Left-Alt+Enter) to view the designer window full screen. Repeat the process to return to normal view.

    Note: If accidentally closed, the Mining Model Viewer of the Mining Model Designer can be re-opened. Select the View | Solution Explorer menu item. In the Solution Explorer window, under the Mining Structures folder, right-click Customers.dmm and select Browse from the context menu.

    d. In the Tree drop-down list, make sure Bike Buyer is selected; Figure 12 shows the result.

    Figure 12: Browsing the Mining Model

    e. In the lower-right corner of the Mining Model Viewer, click and hold the small + icon. The mouse pointer will change to a cross-arrow icon and the Navigation window will appear. You may drag the mouse to navigate within the Mining Model Viewer. Figure 13 shows the location of the navigation button (it is highlighted in a circle). You might need to use the scroll bars (highlighted in a rectangle) to see the + icon.

    Figure 13: Finding the + icon for navigation.

    Note: The Mining Legend window on the right side of the display may be relocated and resized to improve the display of the decision tree. If you accidentally close the Mining Legend window, select the Mining Models tab and then reselect the Mining Model Viewer tab, and the Mining Legend window will re-appear when the viewer is redisplayed.

    f. On the Show Level slider control, drag the pointer to the left so that only one level of the decision tree is displayed.
    g. Click the All node.

    Note: The All node contains a histogram with blue representing bike buyers and red representing non-bike buyers.

    Note: Information about all customers is displayed in the Mining Legend window. Notice that 49.39% of the 18,484 customers are bike buyers. (You may need to widen the Mining Legend window in order to be able to see the percentages.)

    h. On the Show Level slider control, drag the pointer to the right so that two levels of the decision tree are displayed.

    Note: Age is most predictive of a customer's bike buying behavior.

    i. Click each node of level 2. The Mining Legend window will display detailed information for each node.
    j. In the Background drop-down list, click Yes.

    Note: The shade of each node indicates the concentration of the value selected in the Background drop-down list. Expand and contract nodes in the diagram in order to investigate the predicting factors for each group.
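The note under step h says that Age is the most predictive attribute, which is why it appears at the first split of the tree. The sketch below illustrates the general idea of ranking candidate splits, using entropy-based information gain. This is an assumption made for illustration: the lab does not state which scoring method Microsoft Decision Trees uses, and the attribute names and rows are invented toy data.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())


def information_gain(rows, attribute, target):
    """Reduction in target entropy obtained by splitting rows on attribute."""
    base = entropy([row[target] for row in rows])
    groups = {}
    for row in rows:
        groups.setdefault(row[attribute], []).append(row[target])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return base - remainder


# Invented toy rows: AgeGroup separates buyers perfectly, Region not at all.
rows = [
    {"AgeGroup": "young", "Region": "Europe", "BikeBuyer": "Yes"},
    {"AgeGroup": "young", "Region": "Pacific", "BikeBuyer": "Yes"},
    {"AgeGroup": "old", "Region": "Europe", "BikeBuyer": "No"},
    {"AgeGroup": "old", "Region": "Pacific", "BikeBuyer": "No"},
]
gains = {a: information_gain(rows, a, "BikeBuyer") for a in ("AgeGroup", "Region")}
best = max(gains, key=gains.get)
print(gains, best)
```

The attribute with the highest score becomes the first split; the same ranking is then repeated inside each child node.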

    11. View the Customers DT Mining Model Dependency Network
    a. Within the designer, click the Dependency Network tab.

    Note: The Dependency Network viewer displays the strength of the relationships between the attributes in a decision tree model.

    b. On the Links slider control, drag the pointer to the bottom.
    c. In the Dependency Network diagram, click the Bike Buyer node.

    Note: The color of each node indicates that attribute's relationship to the Bike Buyer attribute.

    d. On the Links slider control, slowly drag the pointer up to the top. As you drag the pointer upward, the relationships within the data are displayed, as shown in Figure 14.

    Figure 14: View Strength of Relationships

    12. View the NB Naïve Bayes Mining Model Attribute Profile display
    a. In the Mining Model drop-down list, click Customers NB to view the Naïve Bayes mining model.
    b. Select the Attribute Profiles tab.
    c. In the Predictable drop-down list, ensure that Bike Buyer is selected.

    Note: The Attribute Profiles tab displays the other attributes that impact the state of the predictable value selected.

    13. View the Attribute Characteristics display
    a. Click the Attribute Characteristics tab.
    b. In the Attribute drop-down list, ensure that Bike Buyer is selected. In the Value drop-down list, select Yes.

    Note: The characteristics of bike buyers, ordered by their frequency, are displayed.

    c. In the Value drop-down list, select No.

    Note: Notice that the characteristics of non-bike buyers are different from the characteristics of bike buyers.

    14. View the Attribute Discrimination display
    a. Click the Attribute Discrimination tab.
    b. In the Attribute drop-down list, ensure that Bike Buyer is selected.
    c. In the Value 1 drop-down list, select Yes.
    d. In the Value 2 drop-down list, select No.

    Note: The attribute values that impact a customer's bike buying decision are displayed. The attribute values are ordered by how strongly they favor bike buyers or non-bike buyers.
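The ordering described in the note for task 14 can be approximated with a small Python sketch. It scores each attribute value by the difference between how often it occurs among Value 1 cases and among Value 2 cases, then sorts by the size of that difference. The scoring formula is an assumption chosen for simplicity, not the score the viewer actually computes, and the rows are invented toy data.

```python
from collections import Counter


def discrimination_scores(rows, target, value1, value2):
    """Rank (attribute, value) pairs by P(pair | value1) - P(pair | value2)."""
    counts = {value1: Counter(), value2: Counter()}
    totals = Counter()
    for row in rows:
        state = row[target]
        if state not in counts:
            continue  # ignore cases in any other target state
        totals[state] += 1
        for attribute, value in row.items():
            if attribute != target:
                counts[state][(attribute, value)] += 1
    pairs = set(counts[value1]) | set(counts[value2])
    scores = {
        pair: counts[value1][pair] / totals[value1]
        - counts[value2][pair] / totals[value2]
        for pair in pairs
    }
    # Largest absolute difference first: the strongest discriminators lead.
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)


# Invented toy rows, not taken from the lab's views.
rows = [
    {"Cars": "0", "Commute": "short", "BikeBuyer": "Yes"},
    {"Cars": "0", "Commute": "long", "BikeBuyer": "Yes"},
    {"Cars": "2", "Commute": "long", "BikeBuyer": "No"},
    {"Cars": "2", "Commute": "short", "BikeBuyer": "No"},
]
ranked = discrimination_scores(rows, "BikeBuyer", "Yes", "No")
for (attribute, value), score in ranked:
    side = "favors Yes" if score > 0 else "favors No" if score < 0 else "neutral"
    print(attribute, value, round(score, 2), side)
```

Positive scores favor the first value (bike buyers here) and negative scores favor the second, which mirrors the two-sided layout of the Attribute Discrimination tab.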

    15. View the Dependency Network
    a. Click the Dependency Network tab.
    b. On the Links slider control, drag the pointer to the bottom.
    c. In the Dependency Network diagram, click the Bike Buyer node.

    Note: The color of each node indicates that attribute's relationship to the Bike Buyer attribute.

    d. On the Links slider control, slowly drag the pointer up to the top.

    Note: As you drag the pointer upward, the relationships within the data are displayed.

    16. Close the Analysis Services Project
    a. Select File | Close Project. If prompted to save changes, select Yes.
    b. If you're done working on this lab, select File | Exit; otherwise continue to the next exercise.

    Exercise 3
    Viewing Mining Accuracy Charts

    Scenario
    The management team at Adventure Works wants to determine the accuracy of their data mining models. Using a validation data set that was held out of the training set, they create Mining Accuracy Charts to visually identify which model is performing most accurately.

    In this exercise, you will validate the mining models created in Exercise 2 by using the Mining Accuracy Chart view of the Mining Structure Designer.

    To view the Mining Accuracy Chart, you will:
    • Create a prediction query by selecting an input table and mapping the columns of the data mining model to the columns in the validation data set.
    • View and interpret a Lift Chart.

    Tasks Detailed Steps

    Complete the following 4 tasks on: SQL BI

    1. Open an existing project
    a. If Business Intelligence Development Studio is not already running, from the Windows task bar, select Start | All Programs | Microsoft SQL Server 2005 | SQL Server Business Intelligence Development Studio.
    b. Select File | Open | Project/Solution.
    c. In the Open Project dialog box, navigate to the C:\MSLabs\SQL Server 2005\Lab Projects\Data Mining Lab\DM Exercise 2 folder, click DM Exercise 2.sln, and then click Open.

    Note: The solution used in Exercise 3 is different from the solution created in Exercise 2.

    2. Deploy the Analysis Services Solution
    a. Select Build | Deploy DM Exercise 2.

    Note: The deployment progress is shown in the Deployment Progress window, normally on the right-hand side of Business Intelligence Development Studio, as in Figure 1. The Deployment Progress pane gives you detailed information about what happens during deployment. Figure 2 displays the results of a successful deployment.

    Note: Analysis Services may take quite a while to process the data mining models.

    Figure 1: The Deployment Progress window showing a deployment starting.

    Figure 2: The Deployment Progress window showing a successful deployment.

    b. Once deployment is complete, you can close the Deployment Progress window, if you wish, by clicking the X located at the far right in the title bar of the Deployment window.

    3. Create a Prediction Query
    a. In the Solution Explorer window, in the Mining Structures folder, double-click Customers.dmm.

    Note: If the Solution Explorer window is not visible, select the View | Solution Explorer menu item.

    b. From the list of tabs above the designer window, select the Mining Accuracy Chart icon. The Mining Accuracy Chart view will open, displaying the Column Mapping page, as shown in Figure 3.

    Figure 3: Column Mapping Page

    Note: Next, you will use the Column Mapping page to design a Prediction Query that will be executed in order to compare the mining model's predicted values to the validation data set's actual values.

    c. On the Column Mapping tab, in the Select Input Table(s) window, click the Select Case Table button.
    d. In the Select Table window, make sure Customers is selected in the Data Source drop-down list box, click the vDMLabCustomerValidate table, and then click OK.

    Note: Relationships between the mining structure and the input table are automatically created between columns with the same name. Relationships can be added or deleted by the user.

    e. In the table at the bottom of the Column Mapping page, verify that the Show check box is selected for both the Customers DT and Customers NB mining models.
    f. In the Predictable Column Name column, verify that Bike Buyer is selected for both mining models.

    Note: In the Predictable Column Name drop-down lists, the mining model column names are restricted to columns that have the usage type set to Predict or Predict Only.

    g. In the Predict Value column, in the drop-down list, click Yes for both mining models, as shown in Figure 4.

    Figure 4: Predict Value

    h. Click the Lift Chart tab.

    Note: The prediction query designed on the Column Mapping page will be executed. The prediction query will return a prediction for each case in the validation data set.

    Note: Analysis Services may take a while to process the prediction query.

    Note: The mining model will have greater confidence in its prediction for some cases than for others. For each case, the prediction query will also return the probability that the prediction is correct.

    Note: The cases are sorted by the probability that the prediction is correct, and then the percentages of correct predictions are displayed on the lift chart, similar to Figure 5 (your colors might be different).

    Figure 5: Data Mining Lift Chart

    i. Point and click at any of the lines on the chart. The Mining Legend pane will open together with a tool tip and display information.

    Note: As you move along different points on a line or point to a different line, the values displayed in the Mining Legend pane and tool tip will change.

    j. Point at the locations where each line reaches a Target Population of 90%, and wait for the Mining Legend pane to open.

    Note: The example in Figure 5 shows that the Customers DT model (the blue line) only needs to select 71% of the customers in order to identify 90% of the bike buyers.
    The Ideal Model (the red line) only needs to select 45% of the customers in order to identify 90% of the bike buyers.
    The Customers NB model needs to select 85% of the customers in order to identify 90% of the bike buyers.
    The Random Guess model (the yellow line) needs to select 90% of the customers in order to identify 90% of the bike buyers.

    Note: Because the Customers DT mining model needs to select fewer cases in order to identify a given percentage of bike buyers, it would be deemed more accurate than the Customers NB model.
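The reading of the lift chart in the notes above can be reproduced with a short Python sketch. It sorts cases by predicted probability, walks down the list, and records what share of the actual bike buyers has been found after each case. The function names and the ten scored cases are invented for illustration; they are not the lab's validation data and will not reproduce the percentages in Figure 5.

```python
def gains_curve(cases):
    """cases: list of (predicted_probability, is_target) pairs.

    Returns (fraction_of_population_selected, fraction_of_targets_found)
    after each case, taking the most confident predictions first.
    """
    ranked = sorted(cases, key=lambda case: case[0], reverse=True)
    total_targets = sum(1 for _, hit in ranked if hit)
    curve = []
    found = 0
    for position, (_, hit) in enumerate(ranked, start=1):
        found += 1 if hit else 0
        curve.append((position / len(ranked), found / total_targets))
    return curve


def population_needed(curve, target_fraction):
    """Smallest population fraction that captures target_fraction of targets."""
    for population, captured in curve:
        if captured >= target_fraction:
            return population
    return 1.0


# Invented scored cases: (probability of "Yes", actually a bike buyer?).
cases = [
    (0.90, True), (0.80, True), (0.70, False), (0.60, True), (0.40, False),
    (0.30, False), (0.20, True), (0.10, False), (0.05, False), (0.01, False),
]
curve = gains_curve(cases)
print(population_needed(curve, 0.75))  # share of customers needed for 75% of buyers
```

A model whose curve reaches a given share of buyers with a smaller share of the population is the better one, which is the comparison the last note makes between Customers DT and Customers NB.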

    4. Close the Analysis Services Project
    a. Select File | Close Project. If prompted to save changes, select Yes.
    b. If you're done working on this lab, select File | Exit; otherwise continue to the next exercise.

    Exercise 4
    Creating a Prediction Query

    Scenario
    The sales and marketing department at Adventure Works received a potential customer list containing demographic data. The department will use the Customers DT data mining model to predict the likelihood that an individual in the list is a bike buyer.

    In this exercise, you will make predictions by using the Mining Model Prediction view of the Mining Structure Designer.

    To make predictions, you will:
    • Select an input table and map the columns of the data mining model to the columns in the prediction dataset.
    • Create a prediction query in the Prediction Query Builder Design pane.
    • View the prediction query results.

    Tasks Detailed Steps

    Complete the following 6 tasks on: SQL BI

    1. Open an existing project
    a. If Business Intelligence Development Studio is not already running, from the Windows task bar, select Start | All Programs | Microsoft SQL Server 2005 | SQL Server Business Intelligence Development Studio.
    b. Select File | Open | Project/Solution.
    c. In the Open Project dialog box, navigate to the C:\MSLabs\SQL Server 2005\Lab Projects\Data Mining Lab\DM Exercise 3 folder, click DM Exercise 3.sln, and then click Open.

    Note: The solution used in Exercise 4 is different from the solutions examined in the previous exercises.

    2. Deploy the Analysis Services Solution
    a. Select Build | Deploy DM Exercise 3.

    Note: The deployment progress is shown in the Deployment Progress window, normally on the right-hand side of Business Intelligence Development Studio, as in Figure 1. The Deployment Progress pane gives you detailed information about what happens during deployment. Figure 2 displays the results of a successful deployment.

    Note: Analysis Services may take quite a while to process the data mining models.


    Figure 1: The Deployment Progress window showing a deployment starting.

    Figure 2: The Deployment Progress window showing a successful deployment.

    b. Once deployment is complete, you can close the Deployment Progress window, if you wish, by clicking the X at the far right of its title bar.

    3. Create a Prediction Query using the Decision Tree Mining Model

    a. In the Solution Explorer window, in the Mining Structures folder, double-click Customers.dmm.

    Note: If the Solution Explorer pane is not visible, select View | Solution Explorer.

    b. From the list of tabs above the designer window, select the Mining Model Prediction tab.

    c. In the Mining Model window, click the Select Model button.

    d. In the Select Mining Model dialog box, expand Customers, click Customers DT, as shown in Figure 3, and then click OK.

    Figure 3: Select Mining Model in Mining Model Prediction View

    e. In the Select Input Table(s) window, click the Select Case Table button.

    f. In the Select Table window, make sure that Customers is selected in the Data Source combo-box, click the vDMLabCustomerPredict table, and then click OK.

    Note: Relationships between the mining structure and the input table are automatically created between columns with the same name. Relationships can be added or deleted by the user.
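
    The name-based matching described in the note above can be pictured with a short Python sketch. This is only an illustration of the idea, not the designer's actual logic: the column names are hypothetical, and the sketch assumes an exact, case-sensitive name match, which the lab does not specify.

```python
def auto_map_columns(model_columns, input_columns):
    """Pair each model column with an input column of exactly the same name."""
    available = set(input_columns)
    return {name: name for name in model_columns if name in available}


# Hypothetical column names, for illustration only.
model_columns = ["Age", "Gender", "Yearly Income", "Bike Buyer"]
input_columns = ["CustomerKey", "Age", "Gender", "YearlyIncome"]

mapping = auto_map_columns(model_columns, input_columns)
print(mapping)
```

    In this made-up example, "Yearly Income" and "YearlyIncome" do not match, which is the kind of case where a relationship would have to be added by hand.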

    4. Build the Decision Tree Prediction Query

    a. Enter values into the first row of the table at the bottom of the designer as shown below:

    Column Value

    Source vDMLabCustomerPredict

    Field CustomerKey

    Show (Checked)

    Note: The columns of the table may be re-sized by dragging the dividing line between

    the column headings.

    b. Input values into the second row of the table as shown below:


    Column Value

    Source Customers DT mining model

    Field Bike Buyer

    Show (Checked)

    Note: The value in the Source column will change from Customers DT mining model

    to Customers DT.

    c. Enter values into the third row of the table as shown below:

    Column Value

    Source Prediction Function

    Field PredictProbability

    Alias Confidence

    Show (Checked)

    Criteria/Argument [Customers DT].[Bike Buyer]

    Note: Figure 4 shows what you should see after having entered the values as

    described in the previous tables.

    Figure 4: Values for the Decision Tree Prediction Query

    d. Select the Mining Model | Query menu item to view the SQL of the query.

    Note: You can also do this by clicking the down arrow in the Switch to query result view icon in the upper left corner of the Mining Model Prediction View, as in Figure 5, and then clicking Query.

    Figure 5: Switch to SQL Query View

    Note: In the bottom pane of the Mining Model Prediction view, the text of the prediction query is displayed.
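
    To see how the three grid rows relate to the query text, the following Python sketch assembles a query string in the style of DMX, the SQL-like data mining language that Analysis Services uses for prediction queries. It is a rough approximation under stated assumptions: the mapped columns (Age, Gender) are hypothetical, the Show column is omitted because every row is checked, and the text the designer actually generates may differ in layout and detail.

```python
def build_prediction_query(model, data_source, input_table, rows, mappings):
    """Return DMX-style query text for one mining model and one input table."""
    select_list = []
    input_fields = []
    for row in rows:
        if row["source"] == "input":
            expr = f"t.[{row['field']}]"
            input_fields.append(row["field"])
        elif row["source"] == "model":
            expr = f"[{model}].[{row['field']}]"
        else:  # a prediction function such as PredictProbability
            expr = f"{row['field']}({row['argument']})"
        if row.get("alias"):
            expr += f" AS [{row['alias']}]"
        select_list.append(expr)

    # The source query must return the selected input fields plus the mapped ones.
    for column in mappings.values():
        if column not in input_fields:
            input_fields.append(column)
    source_sql = (
        "SELECT " + ", ".join(f"[{c}]" for c in input_fields)
        + f" FROM [{input_table}]"
    )
    on_terms = [f"[{model}].[{m}] = t.[{c}]" for m, c in mappings.items()]

    lines = [
        "SELECT",
        "  " + ",\n  ".join(select_list),
        "FROM",
        f"  [{model}]",
        "PREDICTION JOIN",
        f"  OPENQUERY([{data_source}], '{source_sql}') AS t",
        "ON",
        "  " + " AND\n  ".join(on_terms),
    ]
    return "\n".join(lines)


# The three rows entered in the grid during this task.
rows = [
    {"source": "input", "field": "CustomerKey"},
    {"source": "model", "field": "Bike Buyer"},
    {"source": "function", "field": "PredictProbability",
     "argument": "[Customers DT].[Bike Buyer]", "alias": "Confidence"},
]
# Hypothetical column relationships; the lab does not list the real ones.
mappings = {"Age": "Age", "Gender": "Gender"}

query = build_prediction_query(
    "Customers DT", "Customers", "vDMLabCustomerPredict", rows, mappings
)
print(query)
```

    Each grid row becomes one entry in the SELECT list, and the column relationships become the ON clause of the PREDICTION JOIN.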

    5. Display the Decision Tree query result

    a. View the result of the Decision Tree query by selecting the Mining Model | Result menu item.

    Note: As with the SQL view, you can also view the result by clicking the down arrow in the Switch to query result view icon in the upper left corner of the Mining Model Prediction View, as in Figure 5, and then clicking Result.

    Note: The results of the prediction query are displayed.


    The CustomerKey column identifies each record from the input table.

    The Bike Buyer column contains the mining model's prediction of the customer's bike buying behavior.

    Larger values in the Confidence column mean the mining model has more confidence in the prediction contained in the Bike Buyer column.

    By contacting potential customers predicted to be bike buyers with a high Confidence value, Adventure Works can use the results of the prediction query to promote its bikes to those individuals most likely to be bike buyers. Adventure Works' marketing expenses will be reduced because potential customers who are not likely to be bike buyers will not be contacted.
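
    As an illustration of how such results might be used, the Python sketch below keeps only the rows predicted to be bike buyers with a Confidence at or above a chosen threshold. The rows are made up rather than real lab output, the 0.7 threshold is an arbitrary example, and the sketch assumes Bike Buyer is reported as 1 (buyer) or 0 (not a buyer).

```python
def select_contacts(predictions, min_confidence=0.7):
    """Keep predicted bike buyers at or above the threshold, most confident first."""
    chosen = [
        p for p in predictions
        if p["Bike Buyer"] == 1 and p["Confidence"] >= min_confidence
    ]
    return sorted(chosen, key=lambda p: p["Confidence"], reverse=True)


# Made-up result rows, for illustration only.
predictions = [
    {"CustomerKey": 101, "Bike Buyer": 1, "Confidence": 0.91},
    {"CustomerKey": 102, "Bike Buyer": 0, "Confidence": 0.88},
    {"CustomerKey": 103, "Bike Buyer": 1, "Confidence": 0.55},
    {"CustomerKey": 104, "Bike Buyer": 1, "Confidence": 0.74},
]

contacts = select_contacts(predictions)
print([p["CustomerKey"] for p in contacts])  # prints [101, 104]
```

    Customer 102 is dropped despite a high Confidence value because that confidence applies to a prediction of "not a bike buyer".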

    6. Close the Analysis Services Project

    a. Select File | Close Project. If prompted to save changes, select Yes.

    b. Select File | Exit to close Business Intelligence Development Studio.