Upload
vankien
View
244
Download
4
Embed Size (px)
Citation preview
MANAGING DATA(BASES) USING SQL
(NON-PROCEDURAL SQL, X401.9)
Professional Program: Data Administration and Management
Instructor: Michael Kremer, Ph.D.Technology & Information Management
Class 6
AGENDA
9. Aggregate Queries
9.1 Aggregate Functions
9.2 Grouping Data
9.3 Filtering Grouped/Aggregated Data
9.1 AGGREGATE FUNCTIONS
Aggregate function: Values of multiple rows are grouped together
as input on certain criteria to form a single value of more
significant meaning or measurement
Aggregate queries are still part of SELECT queries (DQL)
Following functions are
part of SQL standard:
Additional aggregate
functions include StDev, Var, First, Last (db platform dependent)
169
9.1 AGGREGATE FUNCTIONS
Create expressions
combining multiple
aggregate functions
together
Rules about aggregate function expressions:
170
9.1 AGGREGATE FUNCTIONS
Use of SQL
predicates (such as
DISTINCT) is allowed:
Use WHERE clause
to apply aggregation
over subset of data:
171
9.1 AGGREGATE FUNCTIONS
SELECT query using dynamic WHERE clause criteria based on
aggregate query
Hard-coded WHERE
clause, average
salary of ST_CLERK
(based on previous
example)
Use of subquery
in WHERE clause
to dynamically
calculate average
salary of ST_CLERK
172
9.1 AGGREGATE FUNCTIONS
Subqueries will be
covered in next chapter
in more detail
Basic rules of
subqueries:
173
9.2 GROUPING DATA
So far, aggregation took place over entire table or subset using WHERE clause
Grouping of data Partition table data into groups and aggregate data within each group
New Keyword: GROUP BY
Partition of a set divides the set into subsets such that the union of the subsets returns the originalset (pizza, pie)
GROUP BY clause takes the result ofthe FROM and WHERE clauses and arranges the rows into groups defined as having the same values for the columns listed in the Group By clause
Each group is reduced to a single row in the resulting set
174
9.2 GROUPING DATA
If the grouped fields contain null values, all null values are
arranged into one group as if they were all equal
SELECT syntax including
GROUP BY clause
Rules for aggregate
functions:
176
9.2 GROUPING DATA
NULL behavior of
aggregate functions
(Aggregate functions
do not consider NULL
values)
Rules for GROUP BY:
177
9.2 GROUPING DATA
Error message when not
using GROUP BY in
SELECT query
Same query using
GROUP BY
178
9.2 GROUPING DATA
Use multiple columns
in GROUP BY clause
Order in GROUP BY
clause has no bearing
on results
To achieve desired
order, use ORDER BY
clause
179
9.2 GROUPING DATA
Use multiple aggregate
functions
Group on expressions,
this allows you to
customize your query in a
very specific way based
on certain business rules
You have to list expression twice, in SELECT clause as well as in
GROUP BY clause
Due to the fact that GROUP BY clause is processed before
SELECT clause
Better solution is to use subquery (demonstrated later in this
course)
180
9.2 GROUPING DATA
When using multiple tables, make sure to aggregate data only
on lowest child table
When using multiple tables, you denormalize data and display
parent records multiple times aggregation takes place on
redundant parent data!
182
9.2 GROUPING DATA
Remove COUNT function
to demonstrate number
of records
To get the correct count,
use the DISTINCT
predicate
183
9.2 GROUPING DATA
Display distinct
departments
Only shows count of
departments that have
employees associated
with them (INNER JOIN)
To count all departments,
use OUTER JOIN!
184
9.2 GROUPING DATA
Nest aggregate functions up to a level of two in Oracle (this is
not according to the SQL standard)
Must use GROUP BY for inner aggregate function returns
multiple values
Error message when
omitting GROUP BY:
Add GROUP BY clause
to be used for inner
aggregate function:
185
9.3 FILTERING GROUPED/AGGREGATED DATA
3 different kind of columns in aggregate SELECT statement:
Group By column
Aggregate column
Non-Select column (only used for additional filtering)
Hence there 3 filtering situations:
Filtering on Non-Select
column WHERE clause:
188
9.3 FILTERING GROUPED/AGGREGATED DATA
Same query as before but without WHERE clause
Resulting data is different!
189
9.3 FILTERING GROUPED/AGGREGATED DATA
Filtering on Group By
column WHERE clause
Previous two examples: WHERE clause filters data before
aggregation takes place
Next example shows filtering on aggregate column HAVING
clause
HAVING clause is used to restrict the number of rows after the
aggregation of data used for aggregate columns
190
9.3 FILTERING GROUPED/AGGREGATED DATA
HAVING clause comes
after GROUP BY
HAVING clause is processed
after aggregation takes
place
WHERE clause, on the other
hand, is processed before!
You can place HAVING
clause before GROUP BY
This is just semantic, it does not represent processing order!
191
9.3 FILTERING GROUPED/AGGREGATED DATA
Filtering on aggregate
column HAVING clause
In order to filter average
salary greater than 10000,
we must know the average
salary first!
Aggregation must take
place first not very
efficient as the database must aggregate all rows first before
applying the HAVING clause to filter out rows
SQL processing order:
192