32
MySQL MySQL Data Mining and Verification Data Mining and Verification February 20, 2009 NCDPI Accountability Conference David J. Millush, Testing & Accountability Coordinator Carteret County Public Schools

MySQL Data Mining and Verification February 20, 2009 NCDPI Accountability Conference David J. Millush, Testing & Accountability Coordinator Carteret County

Embed Size (px)

Citation preview

MySQLMySQLData Mining and VerificationData Mining and VerificationFebruary 20, 2009NCDPI Accountability ConferenceDavid J. Millush, Testing & Accountability CoordinatorCarteret County Public Schools

• Introduce basic SQL elements

• Demonstrate common SQL query structure

• Examine SQL queries used to mine data

• Answer questions

• Encourage use!!!

OBJECTIVES

MySQL 5.1 Reference Manual http://dev.mysql.com/doc/refman/5.0/en/

W3Schools' Online SQL Tutorial http://www.w3schools.com/sql/

ABCTools 2009 Training Manual https://www.rep.dpi.state.nc.us/2009/2009_docs_abctools.html

SQL in ABCTools09 User Guide https://www.rep.dpi.state.nc.us/docs/2009/abctools/using_abctools_sql_2009.doc

ABCTools09 SQL Assist https://www.rep.dpi.state.nc.us/docs/2009/abctools/sample_sql_statements.doc

Accountability Documentation Library https://www.rep.dpi.state.nc.us/2009/2009_docs.html

KEY REFERENCES

• Database management system

• Stores data in separate tables

• Uses Structured Query Language (SQL)

• MySQL is the database server software.

• SQL is the language used to interact with MySQL.

MYSQL: WHAT IS IT?

1.SQL can’t hurt you.

2.You can’t break anything.

3.No data will disappear.

THINGS TO REMEMBER

Sample of Tables in ABCTools/MySQL

TABLES, FIELDS & ELEMENTS

Fields are the “structure" of the tables ACCHIST table fields:

TABLES, FIELDS & ELEMENTS

Elements form contents in table fields

TABLES, FIELDS & ELEMENTS

acchist - TABLE

collection_code - FIELD

For collection_code, the element is described as:

1-4 character text, i.e. 20D, FDF, MAR, etc.

Like your English Composition teacher:

• SQL requires good grammar;

• SQL requires proper syntax and punctuation;

• MySQL expects well thought-out statements;

• MySQL is easy to understand… when you pay attention;

• MySQL tells you when corrections are required;

• MySQL rewards you for your effort;

• MySQL doesn’t require any more than you can handle.

GRAMMAR COUNTS

Simple, but essential rules:•Commas separate all fields / elements in a query list.•Apostrophes enclose all field contents.

•Parentheses enclose statement groups.

•An asterisk ( * ) means “everything.”

•A percent sign ( % ) means “anything.”

•Keyboard math symbols are routinely used.

PUNCTUATION BASICS

Apostrophes enclose all elements

For example:

… WHERE (grade = ‘09’ and ma08score < ‘321’ and

(compscore < ‘100’ or compscore is null) …

or

… and englrsncd not in (‘E’,‘G’,‘J’,‘P’,‘Q’,‘S’) …

PUNCTUATION CONT’D

Commas separate all fields and elements.

For example:

SELECT school_code, collection_code, student_id, or

… and englrsncd not in ('E','G','J','P','Q','S','V‘) …

PUNCTUATION CONT’D

Parentheses enclose statement groupsFor example:

… or (grade = '09' and ma08score < '321' ) and

(compscore < '100' or compscore is null) …

or

… WHERE ( (acc2008.student_id = acc2009. student_id) or ((acc2008.last_name = acc2009.last_name) and (acc2008.first_name = acc2009.first_name) and (acc2008.dob = acc2009.dob)) )

PUNCTUATION CONT’D

An asterisk ( * ) means “everything.”For example:

SELECT * from accdata where collection_code = ‘FDF’

A percent sign ( % ) means “anything.”For example:

SELECT * from accdemo where last_name LIKE ‘%Hallor%’

PUNCTUATION CONT’D

Keyboard math symbols are routinely used.

For example:

L.reporting_year = '2009' and L.collection_code = 'FDF'

or

D.reporting_year = '2009' and D.collection_code != 'FDF'

or

H.test_date > '20070825‘

or

H.test_date <= '20080601’

PUNCTUATION CONT’D

Start with simple questions:

• What do you want reported?

• What table(s) contains the information?

• What restrictions need to apply?

• How should the results be sorted?

WHO? WHAT? WHEN?

(SELECT)

(FROM)

(WHERE)

(ORDER BY)

• SELECT statement starts routine queries

• SELECT determines what will be displayed

• SELECT is followed by the fields to display

• Must use the exact field names from the table

• The SELECT statement sequence establishes the output (report) display order.

SELECT = REPORT DISPLAY

•Start with SELECT and add field names•Fields are separated by commasFor example:

SELECT collection_code, school_code, student_id

Or

SELECT * from accdemo

SELECT

• After SELECT statement, add FROM statement

• FROM specifies table(s) where data is recorded

• Multiple tables in FROM statement creates a “inner join” of the tables… this is slightly more complex coding requiring specification in the SELECT, WHERE, and ORDER BY statements

• • Must use exact table names

FROM = WHICH TABLE

FROM follows immediately after the last field in the SELECT statement (without a comma)

For example:

SELECT school_code, collection_code, student_id, grade, dob FROM accdemo

or

SELECT L.collection_code, L.school_code, D.last_name, D.student_id, D.test_date FROM accdemo L, acchist D

FROM

After FROM statement, add WHERE• WHERE sets the limits or conditions on the data• WHERE statements “subset” the data

• WHERE statements use field names, qualifying indicators, and data limits

• Add additional WHERE conditions with AND or OR

For example:

WHERE grade = ‘09’ AND school_code = ‘160312’ OR school_code = ‘160313’

WHERE = CONDITIONS

• Plan WHERE statements based on results desired

• Group WHERE elements in parentheses

• Element order in a group may alter the output

(2 X 10) + 6 = 26 or 2(10 + 6) = 32

• Test the output. If the output is questionable, rethink the WHERE statement

PLANNING FOR WHERE

Simple WHERE constructions can be stacks or choices from within a group

For example:

SELECT school_code, collection_code, student_id, grade, dob FROM accdemo WHERE grade = ‘09’ AND (school_code = ‘160312’ OR school_code = ‘160313’)

Or…

FROM accdemo WHERE grade = ‘09’ AND school_code IN (‘160311’,’160316’,’160317’)

WHERE

SELECT DISTINCT h.student_id, h.reporting_year, h.collection_code, h.school_code, h.last_name, h.first_name, h.grade, h.i_path, a.date_enter_hs, h.ctamrsncd, h.ctamscore, h.ma08rsncd, h.ma08score, h.ma08level, h.alg1rsncd, h.alg1score, h.alg1level,h.ctarrsncd, h.ctarscore, h.rd08rsncd, h.rd08score, h.rd08level, h.englrsncd, h.englscore, h.engllevel FROM histaudit h, accdemo a WHERE ((h.collection_code = a.collection_code) and h.collection_code = 'FDF') and (h.student_id = a.student_id) and (a.date_enter_hs is null or a.date_enter_hs < '20060616') and (h.i_path is null or (h.i_path != 'OCC'and h.i_path != 'GNRL')) and h.grade in ('9','10','11','12') and ((h.ctamscore is null or h.ctamscore != 'P') and (h.ma08score is null or h.ma08score < '321') and ((h.alg1score is null or (h.alg1score BETWEEN '21' and '50') or (h.alg1score between '115' and '144')))) or ((h.collection_code = a.collection_code) and h.collection_code = 'FDF') and (h.student_id = a.student_id) and (a.date_enter_hs is null or a.date_enter_hs < '20060616') and (h.i_path is null or (h.i_path != 'OCC' and h.i_path !='GNRL')) and h.grade in ('9','10','11','12') and ((h.ctarscore is null or h.ctarscore != 'P') and ((h.rd08score is null or h.rd08score < '254') and (h.ma08score is null or h.ma08score < '321')) and (h.englscore is null or (h.englscore BETWEEN '18' and '48') or (h.englscore BETWEEN '120' and '142‘)))

WHERE IN THE...

ORDER BY = SORT SEQUENCE After WHERE statement, add ORDER BY

• ORDER BY organizes data in the same manner as sorting data in an Excel spreadsheet

• ORDER BY sort elements are field names

• Use commas to separate ORDER BY elements

For example:

… FROM histaudit WHERE grade = 10 and i_path != 'GNRL' and ((engllevel is NULL and (englrsncd is NULL or englrsncd not in ('E','G','J','P','Q','S','V'))) or (alg1level is NULL and (alg1rsncd is NULL or alg1rsncd not in ('E','G','J','P','Q','S','V')))) and collection_code = 'FDF'

ORDER BY school_code, last_name, first_name

NICE-TO-KNOWS• IS NULL is not the same as zero, IS NULL means blank

• != means “not equal”

• ‘09’ is not the same as ‘9’

• Use LIKE and % when you aren’t sure:…WHERE last_name LIKE ‘%Hallor%’

• Write your queries in Notepad so you can see them. Copy and paste into TOOLS09 MySQL tab to run.

• Cut and paste from queries that work.

• Use a scratch pad to outline your search criteria, then construct the SQL query. Run it and modify until you get what you need.

DATA MINING• What information do you need?

• Who needs to test?

• Who has tested?

• What were the results?

• What trends are identifiable?

• What information would assist

teachers?

• What information will assist in

planning?

DATA MINING - CONTINUED1. Identify table(s) containing the required

dataGo to Test Coordinators webpage:

https://www.rep.dpi.state.nc.us/Click on “Quick Links”Click on “ADB / Accountability Database”Click on “Coding Reference Tables”Select tables to see the fields they contain

2. Select the required fields for the report

3. Plan the WHERE statement

4. Write a draft query in Notepad

5. Paste and run in Tools

6. Edit and modify until it runs properly

ALG1 Prediction by Teacher with Gr6 -8 Math Displayed

SELECT m.student_id as 'accmem id', h.student_id as 'histaudit id', m.school_code as 'school', m.statecourse as 'state course', m.localcourse as 'local course', m.coursetitle, m.teachername as 'teacher', m.semester as 'SEM', m.section as 'SEC', m.student_id, m.grade, m.last_name, m.first_name, m.sex, m.minority_code as 'ethn', h.ma06level as 'MA06 Level', h.ma06score as 'MA06 score', h.ma06cscore as 'MA06 Cscore', h.ma07level as 'MA07 Level', h.ma07score as 'MA07 score', h.ma07cscore as 'MA07 Cscore', h.ma08level as 'MA08 Level', h.ma08score as 'MA08 score', h.ma08cscore as 'MA08 Cscore', h.alg1level as 'ALG1 Level', h.alg1score as 'ALG1 score', h.alg1cscore as 'ALG1 Cscore', t.current_testid as 'test', t.predicted_score, t.predicted_cscore FROM (SELECT DISTINCT student_id, last_name, first_name, grade, school_code, statecourse, localcourse, coursetitle, semester, section, teachername, sex, minority_code FROM accmem WHERE collection_code = 'CUR' AND semester = '2' AND statecourse in ('2018X','2022X','2023X','2052X') ) m LEFT JOIN (SELECT student_id, subject, current_testid, predicted_score, predicted_cscore FROM abc_target WHERE collection_code = 'CUR' and subject = 'A1' and current_testid = 'ALG1') t ON (m.student_id = t.student_id) LEFT JOIN (SELECT student_id,last_name,first_name, ma06level, ma06score, ma06cscore, ma07level, ma07score, ma07cscore, ma08level, ma08score, ma08cscore, alg1level, alg1score, alg1cscore FROM histaudit WHERE collection_code = 'CUR') h ON (m.student_id = h.student_id) ORDER BY m.school_code, m.teachername, m.semester, m.localcourse, m.section, m.last_name, m.first_name

QUESTIONS?