
INSS 651 - files.transtutors.com file · Web viewYou will submit two files. Word file with q1 and q2. EXCEL file for q1 (show all work) READ THE INSTRUCTIONS CAREFULLY!!!!! NO LATE


You will submit two files:

• Word file with q1 and q2
• EXCEL file for q1 (show all work)

READ THE INSTRUCTIONS CAREFULLY!

NO LATE SUBMISSION!

Note: Email me for any GENERAL clarifications. All work should be done on the sheet itself, except for the EXCEL file. ORACLE queries and output should be embedded in the exam. Make sure to include all Oracle statements that you use (including how many rows were selected, etc.). I will be checking your queries in ORACLE in your account.


Please show all work (ORACLE queries and their output) for full credit.

Q1. Given the following file for assignment worker.com, identify the data anomalies that must be removed before the data can be loaded into the data warehouse.

Worker_assignment (on course web site)

Assignment_worker(assignment_no, assignment_date, emp_number, chg_hour, assigned_hour, charges)

Where:

• assignment_no is the number assigned to an assignment
• assignment_date is the date the assignment started
• emp_number is the number of the employee assigned to that assignment
• chg_hour is the amount paid to that employee for that assignment
• assigned_hour is the hours assigned to that employee for that assignment
• charges are the total charges for that employee for that assignment (calculated as chg_hour*assigned_hour)

Rules:

• Assignment numbers always start with a letter followed by a 1 and are ALWAYS four characters long, e.g., A123, Z178
• Emp No is ALWAYS 3 characters long
• An employee cannot work more than 40 hours on a given project

Requirement: Count (using EXCEL formulas such as IF, COUNTIF, etc., as done in class) four types of errors:

• Missing data
• Incorrect format
    o To check the length of empno, you can use LEN(cell address) to get the length of the item in that cell
    o Check for assignment number format (BONUS +1 points)
• Zero values
• Incorrect calculations
    o Check for charges: charges = chg_hour*assigned_hour
    o Check for employee working more than 40 hours

Once counted, draw a pie chart or line chart of the data anomalies and discuss which errors can be corrected and how. (submit in WORD)

(20 points)

Must submit the EXCEL worksheet where errors are calculated and graph is drawn
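The four error counts for Q1 have to be produced with EXCEL formulas, but the logic behind them can be sketched in a few lines of Python to make the rules concrete. This is only an illustration under stated assumptions: the sample rows are hypothetical (the real data is in the Worker_assignment file), the function name `count_anomalies` is made up, the format rule is read as "a letter, then a 1, then any two characters", and both calculation checks are tallied in one counter.

```python
import re

# Hypothetical sample rows; the real data is in the Worker_assignment file.
# Fields: assignment_no, assignment_date, emp_number, chg_hour, assigned_hour, charges
rows = [
    ("A123", "2020-01-05", "101", 10.0, 20.0, 200.0),   # clean row
    ("B223", "2020-01-06", "102", 12.0, 10.0, 120.0),   # second character is not 1
    ("C145", "2020-01-07", "1034", 15.0, 8.0, 120.0),   # emp_number is 4 characters
    ("D167", None, "104", 0.0, 5.0, 0.0),               # missing date, two zero values
    ("E189", "2020-01-09", "105", 10.0, 45.0, 400.0),   # over 40 hours, wrong charges
]

# Assumption: a letter, then the digit 1, then any two characters (four in total).
ASSIGNMENT_RE = re.compile(r"[A-Za-z]1.{2}")


def count_anomalies(rows):
    """Count the four anomaly types described in Q1 (cell-level counts)."""
    counts = {
        "missing": 0,
        "incorrect_format": 0,
        "zero_values": 0,
        "incorrect_calculation": 0,
    }
    for assignment_no, assignment_date, emp_number, chg_hour, assigned_hour, charges in rows:
        fields = (assignment_no, assignment_date, emp_number, chg_hour, assigned_hour, charges)

        # Missing data: empty cells (compare with COUNTBLANK in EXCEL).
        counts["missing"] += sum(1 for f in fields if f is None or f == "")

        # Incorrect format: assignment number pattern and employee number length.
        if assignment_no and not ASSIGNMENT_RE.fullmatch(assignment_no):
            counts["incorrect_format"] += 1
        if emp_number and len(emp_number) != 3:
            counts["incorrect_format"] += 1

        # Zero values in the numeric columns.
        counts["zero_values"] += sum(1 for n in (chg_hour, assigned_hour, charges) if n == 0)

        # Incorrect calculations: charges must equal chg_hour * assigned_hour,
        # and assigned hours must not exceed 40.
        if None not in (chg_hour, assigned_hour, charges):
            if abs(charges - chg_hour * assigned_hour) > 1e-9:
                counts["incorrect_calculation"] += 1
        if assigned_hour is not None and assigned_hour > 40:
            counts["incorrect_calculation"] += 1
    return counts


print(count_anomalies(rows))
```

Each check corresponds to one helper column in the worksheet (an IF that returns 1 or 0), and each total corresponds to a SUM or COUNTIF over that column.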

Q2 (20 points) Data integrity is a required feature of data warehouses. P&G is building a data warehouse and has run into data integration problems. They need to get data from 2 different users and combine them to maintain data integrity in their data warehouse.

The sources are:

• Asia region
• North American region

Both regions have data stored in different formats in two different files (employee_asia and emp_NA).

Both tables are available in account Aggarwal as READ ONLY. You must create a copy in your account before using it, or you can create your own tables.

Asia region data is available as

EMPLOYEE_ASIA ( Emp_ID, Emp_Last, Emp_first, gender, country of origin, no of years working)

A fictitious sample of the data is presented below:

Emp_ID  Emp_Last  Emp_first  gender  country of origin  noofyears
S112    Bora      Lakshmi    female  India              30
S113    Teela     Sony       male    Singapore          5
S115    Patel     Danny      male    Ceylon             20
S118    Raj       Desai      male    Ceylon             10
S121    Singh     Linda      female  United States      15
S411    Sawal     Gary       male    India              40
S124    Ye        Linda      female  China              0.5
S456    Saul      Bee        male    United Kingdom     40
S101    Marriott  Uli        male    Ceylon             25

SQL> desc employee_asia
 Name         Null?     Type
 ------------ --------- ------------
 EMP_ID       NOT NULL  CHAR(4)
 EMP_LAST               CHAR(15)
 EMP_FIRST              CHAR(10)
 GENDER                 CHAR(7)
 COUNTRY                CHAR(50)
 NOOFYEARS              NUMBER(3,1)

North America data is available as

EMP_NA (Emp_num, Employee_first, emp_last, emp_gender, emp_country, job_title)

A sample is available as (note m for male and f for female)

Emp_num  Emp_first  emp_last  Emp_gender  Emp_country  job_title
PM112    Maria      Santa     f           USA          Manager
PM345    Mary       Bowie     f           USA          Salesman
PM455    Bora       Bora      m           Canada       Salesman
PM233    Lucky      Willy     f           Canada       Manager
PM101    Bobby      Reyas     f           Canada       CEO
PM202    Wheely     Sancez    f           Mexico       Manager
PM221    Li         Chi       m           USA          DBA
PM312    Perry      Well      m           USA          CIO

SQL> desc emp_NA
 Name         Null?     Type
 ------------ --------- ------------
 EMP_NUM      NOT NULL  CHAR(5)
 EMP_FIRST              CHAR(10)
 EMP_LAST               CHAR(12)
 EMP_GENDER             CHAR(6)
 EMP_COUNTRY            CHAR(30)
 JOB_TITLE              CHAR(12)

P&G wants to develop the following integrated table:

EMPLOYEE_DIM (Employee Id, Employee_name, seniority, gender, country, job_class)

Note: Job_class is classified as:

Job            Job_class
CEO, CIO       TOP
Manager, DBA   MIDDLE
Salesman       OPERATIONS

In addition seniority is defined as:

No of years           Seniority
<1                    temporary
Between 1 and 5       junior
Between 5.1 and 10    senior
10.1 and above        Super senior

emp_ID and emp_NUM are the same fields.
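The job_class and seniority rules above are simple lookups, and a small Python sketch may help check the boundary cases before writing the Oracle version. The function names `job_class` and `seniority` are hypothetical. Because NOOFYEARS is NUMBER(3,1), values carry one decimal place, so "up to 5" followed by "5.1 and above" leaves no gap between the bands.

```python
# Mapping taken from the Job / Job_class table in the exam.
JOB_CLASS = {
    "CEO": "TOP",
    "CIO": "TOP",
    "Manager": "MIDDLE",
    "DBA": "MIDDLE",
    "Salesman": "OPERATIONS",
}


def job_class(job_title):
    """Return the job class for a title, or None if the title is not listed."""
    # strip() because CHAR columns are blank-padded to their declared width.
    return JOB_CLASS.get(job_title.strip())


def seniority(years):
    """Classify number of years worked into the four seniority bands."""
    if years < 1:
        return "temporary"
    if years <= 5:
        return "junior"
    if years <= 10:
        return "senior"
    return "Super senior"


for years in (0.5, 5, 5.1, 10, 10.1, 40):
    print(years, seniority(years))
print(job_class("CIO         "))
```

In SQL the same rules would typically be written as CASE expressions; the chain of `if` tests above maps onto the WHEN branches one for one.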

SHOW ALL QUERIES AND OUTPUTS

1. CLEAN the data in required format (for gender, country of origin, job_class and seniority)
   a. Employee gender should be standardized, i.e., male should be changed to m and female to f
   b. Country should be spelled completely, i.e., USA should be spelled out as United States of America
   c. Ceylon no longer exists; change the name to Sri Lanka
   d. Name is one attribute in the dimension table; combine name as last and first, e.g., Bora (last) and Lakshmi (first) should be modified to Bora, Lakshmi
   e. Calculate both job_class and seniority

2. Create CLEAN_ASIA table

3. Create CLEAN_NA table

4. Combine the two using UNION to create following table

EMPLOYEE_DIM (Employee Id, Employee_name, seniority, gender, country, job_class)

5. Show the contents and structure of EMPLOYEE_DIM table.

6. Give a count of male and female employees
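Steps 1 through 6 can be rehearsed end to end before touching the Oracle account. The sketch below uses Python's built-in sqlite3 module as a stand-in for Oracle, with only a few of the sample rows. It is an illustration, not the required answer: SQLite is not Oracle, so in Oracle the CHAR columns would likely need TRIM before comparison or concatenation, and a bare NULL column in CREATE TABLE ... AS SELECT may need a CAST. Leaving job_class empty for Asia and seniority empty for North America is an assumption, since neither source carries the other attribute.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Source tables with a subset of the sample rows from the exam.
cur.execute("CREATE TABLE employee_asia (emp_id TEXT, emp_last TEXT, emp_first TEXT, "
            "gender TEXT, country TEXT, noofyears REAL)")
cur.execute("CREATE TABLE emp_na (emp_num TEXT, emp_first TEXT, emp_last TEXT, "
            "emp_gender TEXT, emp_country TEXT, job_title TEXT)")
cur.executemany("INSERT INTO employee_asia VALUES (?,?,?,?,?,?)", [
    ("S112", "Bora", "Lakshmi", "female", "India", 30),
    ("S115", "Patel", "Danny", "male", "Ceylon", 20),
    ("S124", "Ye", "Linda", "female", "China", 0.5),
])
cur.executemany("INSERT INTO emp_na VALUES (?,?,?,?,?,?)", [
    ("PM112", "Maria", "Santa", "f", "USA", "Manager"),
    ("PM455", "Bora", "Bora", "m", "Canada", "Salesman"),
    ("PM312", "Perry", "Well", "m", "USA", "CIO"),
])

# Step 2: CLEAN_ASIA (gender codes, Ceylon renamed, name combined, seniority).
cur.execute("""
CREATE TABLE clean_asia AS
SELECT emp_id AS employee_id,
       emp_last || ', ' || emp_first AS employee_name,
       CASE WHEN noofyears < 1   THEN 'temporary'
            WHEN noofyears <= 5  THEN 'junior'
            WHEN noofyears <= 10 THEN 'senior'
            ELSE 'Super senior' END AS seniority,
       CASE gender WHEN 'male' THEN 'm' WHEN 'female' THEN 'f' ELSE gender END AS gender,
       CASE country WHEN 'Ceylon' THEN 'Sri Lanka' ELSE country END AS country,
       NULL AS job_class          -- assumption: no job title in the Asia source
FROM employee_asia
""")

# Step 3: CLEAN_NA (country spelled out, name combined, job_class).
cur.execute("""
CREATE TABLE clean_na AS
SELECT emp_num AS employee_id,
       emp_last || ', ' || emp_first AS employee_name,
       NULL AS seniority,         -- assumption: no years worked in the NA source
       emp_gender AS gender,
       CASE emp_country WHEN 'USA' THEN 'United States of America'
            ELSE emp_country END AS country,
       CASE WHEN job_title IN ('CEO', 'CIO')     THEN 'TOP'
            WHEN job_title IN ('Manager', 'DBA') THEN 'MIDDLE'
            WHEN job_title = 'Salesman'          THEN 'OPERATIONS' END AS job_class
FROM emp_na
""")

# Step 4: combine the two cleaned tables with UNION.
cur.execute("""
CREATE TABLE employee_dim AS
SELECT * FROM clean_asia
UNION
SELECT * FROM clean_na
""")

# Steps 5 and 6: show the contents, then count employees by gender.
dim_rows = cur.execute("SELECT * FROM employee_dim ORDER BY employee_id").fetchall()
gender_counts = cur.execute(
    "SELECT gender, COUNT(*) FROM employee_dim GROUP BY gender ORDER BY gender"
).fetchall()
for row in dim_rows:
    print(row)
print(gender_counts)
```

UNION requires both SELECT lists to have the same number of columns in the same order, which is why each cleaned table is built with the six EMPLOYEE_DIM columns in identical positions.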


Q3 (10 points) Revise the data warehouse based on new requirements (same as what we did in class).
