29
Analytic Functions Analytic functions compute an aggregate value based on a group of rows. They differ from aggregate functions in that they return multiple rows for each group. The group of rows is called a window and is defined by the analytic_clause. For each row, a sliding window of rows is defined. The window determines the range of rows used to perform the calculations for the current row. Window sizes can be based on either a physical number of rows or a logical interval such as time. Analytic functions are the last set of operations performed in a query except for the final ORDER BY clause. All joins and all WHERE, GROUP BY, and HAVING clauses are completed before the analytic functions are processed. Therefore, analytic functions can appear only in 1) the select list; 2) ORDER BY clause. Analytic functions are commonly used to compute cumulative, moving, centered and reporting aggregates. 1

Analytic Functions - Datu bāzes tehnoloģijas | Izglītība …€¦  · Web view · 2016-10-26SUM, AVG, MIN, MAX, COUNT (with/without DISTINCT), VARIANCE, STDDEV, ... A query

  • Upload
    phamanh

  • View
    214

  • Download
    1

Embed Size (px)

Citation preview

Page 1: Analytic Functions - Datu bāzes tehnoloģijas | Izglītība …€¦  · Web view · 2016-10-26SUM, AVG, MIN, MAX, COUNT (with/without DISTINCT), VARIANCE, STDDEV, ... A query

Analytic Functions

Analytic functions compute an aggregate value based on a group of rows. They differ from aggregate functions in that they return multiple rows for each group. The group of rows is called a window and is defined by the analytic_clause. For each row, a sliding window of rows is defined. The window determines the range of rows used to perform the calculations for the current row. Window sizes can be based on either a physical number of rows or a logical interval such as time.Analytic functions are the last set of operations performed in a query except for the final ORDER BY clause. All joins and all WHERE, GROUP BY, and HAVING clauses are completed before the analytic functions are processed. Therefore, analytic functions can appear only in 1) the select list;2) ORDER BY clause.Analytic functions are commonly used to compute cumulative, moving, centered and reporting aggregates.

1

Page 2: Analytic Functions - Datu bāzes tehnoloģijas | Izglītība …€¦  · Web view · 2016-10-26SUM, AVG, MIN, MAX, COUNT (with/without DISTINCT), VARIANCE, STDDEV, ... A query

Sliding Window Example

2

Page 4: Analytic Functions - Datu bāzes tehnoloģijas | Izglītība …€¦  · Web view · 2016-10-26SUM, AVG, MIN, MAX, COUNT (with/without DISTINCT), VARIANCE, STDDEV, ... A query

SQL papildinājumi datu analīzei

1. Kārtošanas (ranking), klona tabulas (windowing) un pārskatu

funkcijas.

2. Uzlabotās agregātfunkcijas datu analīzei.

3. Šķērstabulu (pivoting) darbības.

4. Datu sablīvēšana pārskatu veidošanai.

5. Laika sēriju aprēķini sablīvētiem datiem.

6. Dažādu analīžu (miscellaneous analysis) iespējas.

Type Used forRanking  Calculating ranks, percentiles, and n-tiles of the values

in a result set. Windowing  Calculating cumulative and moving aggregates. Works

with these functions: SUM, AVG, MIN, MAX, COUNT, VARIANCE, STDDEV, FIRST_VALUE, LAST_VALUE, and new statistical functions 

Reporting  Calculating shares (akcijas), for example, market share. Works with these functions: SUM, AVG, MIN, MAX, COUNT (with/without DISTINCT), VARIANCE, STDDEV, RATIO_TO_REPORT, and new statistical functions 

LAG/LEAD  Finding a value in a row a specified number of rows from a current row. 

FIRST/LAST  First or last value in an ordered group. Linear Regression  Calculating linear regression and other statistics (slope,

intercept, and so on). Inverse Percentile  The value in a data set that corresponds to a specified

percentile. Hypothetical Rank and Distribution 

The rank or percentile that a row would have if inserted into a specified data set. 

4

Page 5: Analytic Functions - Datu bāzes tehnoloģijas | Izglītība …€¦  · Web view · 2016-10-26SUM, AVG, MIN, MAX, COUNT (with/without DISTINCT), VARIANCE, STDDEV, ... A query

Analītisko funkciju pamatkoncepcijas

1. Processing order. Query processing using analytic functions takes place in three stages. First, all joins, WHERE, GROUP BY and HAVING clauses are performed. Second, the result set is made available to the analytic functions, and all their calculations take place. Third, if the query has an ORDER BY clause at its end, the ORDER BY is processed to allow for precise output ordering.

2. Result set partitions. The analytic functions allow users to divide query result sets into groups of rows called partitions. Note that the term partitions used with analytic functions is unrelated to the table partitions feature. Throughout this chapter, the term partitions refers to only the meaning related to analytic functions. Partitions are created after the groups defined with GROUP BY clauses, so they are available to any aggregate results such as sums and averages. Partition divisions may be based upon any desired columns or expressions. A query result set may be partitioned into just one partition holding all the rows, a few large partitions, or many small partitions holding just a few rows each.

3. Window. For each row in a partition, you can define a sliding window of data. This window determines the range of rows used to perform the calculations for the current row. Window sizes can be based on either a physical number of rows or a logical interval such as time. The window has a starting row and an ending row. Depending on its definition, the window may move at one or both ends. For instance, a window defined for a cumulative sum function would have its starting row fixed at the first row of its partition, and its ending row would slide from the starting point all the way to the last row of the partition. In contrast, a window defined for a moving average would have both its starting and end points slide so that they maintain a constant physical or logical range.A window can be set as large as all the rows in a partition or just a sliding window of one row within a partition. When a window is near a border, the function returns results for only the available rows, rather than warning you that the results are not what you want.

4. Current row. Each calculation performed with an analytic function is based on a current row within a partition. The current row serves as the reference point determining the start and end of the window. For instance, a centered moving average calculation could be defined with a window that holds the current row, the six preceding rows, and the following six rows.

5

Page 6: Analytic Functions - Datu bāzes tehnoloģijas | Izglītība …€¦  · Web view · 2016-10-26SUM, AVG, MIN, MAX, COUNT (with/without DISTINCT), VARIANCE, STDDEV, ... A query

Klona tabulas (windowing) izmantošana

Tabula DARBINIEKINODALAS_NUMURS UZVARDS ALGA1 Koks 2001 Zars 3502 Sakne 3002 Ozols 2502 Liepa 350

Vaicājuma rezultātsUZVARDS ALGA Kopeja_algaKoks 200 550Zars 350 550Sakne 300 900Ozols 250 900Liepa 350 900

select UZVARDS, ALGA, SUM(ALGA) OVER(order by NODALAS_NUM) Kopeja_algafrom DARBINIEKI;

6

Page 7: Analytic Functions - Datu bāzes tehnoloģijas | Izglītība …€¦  · Web view · 2016-10-26SUM, AVG, MIN, MAX, COUNT (with/without DISTINCT), VARIANCE, STDDEV, ... A query

Piemēri klona tabulas izmantošanai. Tabulas izveidošana

create table DARBINIEKI(D_NUM number Primary Key,UZV varchar2(10),NODALA number,AMATS varchar2(15),ALGA number(10,2));

begininsert into DARBINIEKI values(1001, 'Koks', 1, 'direktors', 500);insert into DARBINIEKI values(1002, 'Celms', 2, 'direktors', 550);insert into DARBINIEKI values(1003, 'Sakne', 3, 'direktors', 400);insert into DARBINIEKI values(1004, 'Lapa', 1, 'sekretāre', 300);insert into DARBINIEKI values(1005, 'Oga', 2, 'sekretāre', 250);insert into DARBINIEKI values(1006, 'Zars', 3, 'sekretāre', 150);insert into DARBINIEKI values(1007, 'Irbe', 1, 'galdnieks', 400);insert into DARBINIEKI values(1008, 'Cauna', 2, 'skārdnieks', 500);insert into DARBINIEKI values(1009, 'Sesks', 3, 'krāsotājs', 450);end;

select * from DARBINIEKI;

D_NUM UZV NODALA AMATS ALGA---------------------------------------------------------------------------------------------------- 1001 Koks 1 direktors 500 1002 Celms 2 direktors 550 1003 Sakne 3 direktors 400 1004 Lapa 1 sekretāre 300 1005 Oga 2 sekretāre 250 1006 Zars 3 sekretāre 150 1007 Irbe 1 galdnieks 400 1008 Cauna 2 skārdnieks 500 1009 Sesks 3 krāsotājs 450

7

Page 8: Analytic Functions - Datu bāzes tehnoloģijas | Izglītība …€¦  · Web view · 2016-10-26SUM, AVG, MIN, MAX, COUNT (with/without DISTINCT), VARIANCE, STDDEV, ... A query

Piemēri klona tabulas izmantošanaiselect NODALA, SUM(ALGA) ALGU_SUMMAfrom DARBINIEKIgroup by NODALA;

NODALA ALGU_SUMMA-------------------------------------------- 1 1200 2 1300 3 1000

select NODALA, ALGA, SUM(ALGA) OVER (PARTITION BY NODALA) ALGU_SUMMA from DARBINIEKI;

NODALA ALGA ALGU_SUMMA-------------------------------------------------------------------- 1 500 1200 1 300 1200 1 400 1200 2 550 1300 2 500 1300 2 250 1300 3 400 1000 3 450 1000 3 150 1000

8

Page 9: Analytic Functions - Datu bāzes tehnoloģijas | Izglītība …€¦  · Web view · 2016-10-26SUM, AVG, MIN, MAX, COUNT (with/without DISTINCT), VARIANCE, STDDEV, ... A query

Piemēri klona tabulas izmantošanai

select NODALA, ALGA, ALGA*100/SUM(ALGA) OVER (PARTITION BY NODALA) ALGAS_PROCENTSfrom DARBINIEKI;

NODALA ALGA ALGAS_PROCENTS---------------------------------------------------------------- 1 500 41,6666667 1 300 25 1 400 33,3333333 2 550 42,3076923 2 500 38,4615385 2 250 19,2307692 3 400 40 3 450 45 3 150 15

select UZV, NODALA, ALGA, RANK() OVER (PARTITION BY NODALA ORDER BY ALGA) ALGAS_RANGSfrom DARBINIEKI;

UZV NODALA ALGA ALGAS_RANGS----------------------------------------------------------------------------Lapa 1 300 1Irbe 1 300 1 Vienādās vērtībasKoks 1 500 3Oga 2 250 1Cauna 2 500 2Celms 2 550 3Zars 3 150 1Sakne 3 400 2Sesks 3 450 3

9

Page 10: Analytic Functions - Datu bāzes tehnoloģijas | Izglītība …€¦  · Web view · 2016-10-26SUM, AVG, MIN, MAX, COUNT (with/without DISTINCT), VARIANCE, STDDEV, ... A query

Piemēri klona tabulas izmantošanai

select UZV, NODALA, ALGA, DENSE_RANK() OVER (PARTITION BY NODALA ORDER BY ALGA) ALGAS_RANGSfrom DARBINIEKI;

UZV NODALA ALGA ALGAS_RANGS--------------------------------------------------------------------Lapa 1 300 1Irbe 1 300 1 Vienādās vērtībasKoks 1 500 2Oga 2 250 1Cauna 2 500 2Celms 2 550 3Zars 3 150 1Sakne 3 400 2Sesks 3 450 3

select UZV, NODALA, ALGA, SUM(ALGA) OVER (PARTITION BY NODALA ORDER BY ALGA RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) asALGU_SUMMAfrom DARBINIEKI;

UZV NODALA ALGA ALGU_SUMMA----------------------------------------------------------------------Lapa 1 300 300Irbe 1 400 700Koks 1 500 1200Oga 2 250 250Cauna 2 500 750Celms 2 550 1300Zars 3 150 150Sakne 3 400 550Sesks 3 450 1000

10

Page 11: Analytic Functions - Datu bāzes tehnoloģijas | Izglītība …€¦  · Web view · 2016-10-26SUM, AVG, MIN, MAX, COUNT (with/without DISTINCT), VARIANCE, STDDEV, ... A query

Piemēri klona tabulas izmantošanai

select UZV, NODALA, ALGA, MAX(ALGA) OVER (PARTITION BY NODALA ORDER BY ALGA DESC ROWS 1 PRECEDING) AS ALGA_MAXfrom DARBINIEKI;

UZV NODALA ALGA ALGA_MAX----------------------------------------------------------------------Koks 1 500 500Irbe 1 400 500Lapa 1 300 400Celms 2 550 550Cauna 2 500 550Oga 2 250 500Sesks 3 450 450Sakne 3 400 450Zars 3 150 400

11

Page 12: Analytic Functions - Datu bāzes tehnoloģijas | Izglītība …€¦  · Web view · 2016-10-26SUM, AVG, MIN, MAX, COUNT (with/without DISTINCT), VARIANCE, STDDEV, ... A query

Piemēri klona tabulas izmantošanai laika datu gadījumāselect * from ALGA_T;

UZV NODALA ALGA DATUMS------------------------------------------------------------Koks 1 150 2000.01.10Koks 1 160 2000.02.10Koks 1 170 2000.03.10Koks 1 180 2000.04.10Koks 1 190 2000.05.10Koks 1 200 2000.06.10Celms 2 150 2000.02.10Celms 2 160 2000.03.10Celms 2 160 2000.04.10Celms 2 170 2000.05.10Celms 2 190 2000.06.10Celms 2 200 2000.07.10

SELECT UZV, ALGA, SUM(ALGA) OVER (ORDER BY ALGA ROWS 1 PRECEDING) as PLUS_ALGA_IEPRIEKŠĒJĀ from ALGA_Twhere NODALA =1;

UZV ALGA PLUS_ALGA_IEPRIEKŠĒJĀ-----------------------------------------------------------Koks 150 150Koks 160 310Koks 170 330Koks 180 350Koks 190 370Koks 200 390

12

Page 13: Analytic Functions - Datu bāzes tehnoloģijas | Izglītība …€¦  · Web view · 2016-10-26SUM, AVG, MIN, MAX, COUNT (with/without DISTINCT), VARIANCE, STDDEV, ... A query

Piemēri klona tabulas izmantošanai laika datu gadījumā

select UZV, ALGA, SUM(ALGA) OVER (ORDER BY DATUMS RANGE BETWEEN CURRENT ROW AND INTERVAL '1' MONTH FOLLOWING) as PLUS_MĒNESIS_UZ_PRIEKŠUfrom ALGA_Twhere NODALA =1;

UZV ALGA PLUS_MĒNESIS_UZ_PRIEKŠU-----------------------------------------------------------------------------Koks 150 310Koks 160 330Koks 170 350Koks 180 370Koks 190 390Koks 200 200

select UZV, ALGA, LAST_VALUE(ALGA) OVER (ORDER BY ALGA ROWS BETWEEN CURRENT ROW AND 1 FOLLOWING) AS NĀKOŠĀfrom ALGA_Twhere NODALA =1;

UZV ALGA NĀKOŠĀ--------------------------------------------Koks 150 160Koks 160 170Koks 170 180Koks 180 190Koks 190 200Koks 200 200

13

Page 14: Analytic Functions - Datu bāzes tehnoloģijas | Izglītība …€¦  · Web view · 2016-10-26SUM, AVG, MIN, MAX, COUNT (with/without DISTINCT), VARIANCE, STDDEV, ... A query

Piemēri klona tabulas izmantošanai laika datu gadījumā

select UZV, ALGA, ALGA - LAST_VALUE(ALGA) OVER (ORDER BY ALGA ROWS BETWEEN CURRENT ROW AND 1 FOLLOWING) AS NĀKOŠĀfrom ALGA_Twhere NODALA =1;

UZV ALGA NĀKOŠĀ------------------------------------------------Koks 150 -10Koks 160 -10Koks 170 -10Koks 180 -10Koks 190 -10Koks 200 0

select A.UZV, A.ALGA, ABS(A.STARPĪBA)from (select UZV, ALGA, ALGA - LAST_VALUE(ALGA) OVER (ORDER BY ALGA ROWS BETWEEN CURRENT ROW AND 1 FOLLOWING) as STARPĪBA from ALGA_T where NODALA =1) A ;

UZV ALGA ABS(A.NAKOSA)--------------------------------------------------------Koks 150 10Koks 160 10Koks 170 10Koks 180 10Koks 190 10Koks 200 0

14

Page 15: Analytic Functions - Datu bāzes tehnoloģijas | Izglītība …€¦  · Web view · 2016-10-26SUM, AVG, MIN, MAX, COUNT (with/without DISTINCT), VARIANCE, STDDEV, ... A query

Piemēri klona tabulas izmantošanai laika datu gadījumā

select A.UZV, MAX(ABS(A.STARPĪBA))from (select UZV, ALGA, ALGA - LAST_VALUE(ALGA) OVER (ORDER BY ALGA ROWS BETWEEN CURRENT ROW AND 1 FOLLOWING) as STARPĪBA from ALGA_T where NODALA =1) Agroup by A.UZV;

UZV MAX(ABS(A.STARPĪBA))------------------------------------------------Koks 10

select CASE WHEN ABS(A.NAKOSA) <5 THEN 'Zem 5' WHEN ABS(A.NAKOSA) >= 5 THEN 'Virs 5' END, COUNT(ABS(A.NAKOSA))from (select UZV, ALGA, ALGA - LAST_VALUE(ALGA) OVER (ORDER BY ALGA ROWS BETWEEN CURRENT ROW AND 1 FOLLOWING) AS NAKOSA from ALGA_T where NODALA =1) A group by CASE WHEN ABS(A.NAKOSA) <5 THEN 'Zem 5' WHEN ABS(A.NAKOSA) >= 5 THEN 'Virs 5' END;

CASE WH COUNT(ABS(A.NAKOSA))-----------------------------------------------------------------Virs 5 5Zem 5 1

15

Page 16: Analytic Functions - Datu bāzes tehnoloģijas | Izglītība …€¦  · Web view · 2016-10-26SUM, AVG, MIN, MAX, COUNT (with/without DISTINCT), VARIANCE, STDDEV, ... A query

Piemēri klona tabulas izmantošanai laika datu gadījumā

create table D_ALGA (ID number Primary Key,UZV varchar2(10),DZIMUMS char(8),VIENIBA number,ALGA number(9,2), DATUMS date);

select * from D_ALGA;

ID UZV DZIMUMS VIENIBA ALGA DATUMS------------------------------------------------------------------------------ 1 Koks sieviete 1 150 2002.01.18 2 Koks sieviete 1 160 2002.02.18 3 Koks sieviete 1 160 2002.03.18 4 Koks sieviete 1 160 2002.04.18 5 Koks sieviete 1 170 2002.05.18 6 Celms vīrietis 2 160 2002.01.18 7 Celms vīrietis 2 160 2002.02.18 8 Celms vīrietis 2 170 2002.03.18 9 Celms vīrietis 2 170 2002.04.18 10 Celms vīrietis 2 180 2002.05.18 11 Celms vīrietis 2 180 2002.06.18

select ID, ALGA, 100*CUME_DIST() OVER (PARTITION BY UZV ORDER BY ALGA)from D_ALGAwhere VIENIBA = 1;

ID ALGA 100*CUME_DIST()OVER(PARTITION BY UZV ORDER BY ALGA)---------- ---------- ---------------------------------------------- 1 150 20 2 160 80 3 160 80 4 160 80 5 170 100

16

Page 17: Analytic Functions - Datu bāzes tehnoloģijas | Izglītība …€¦  · Web view · 2016-10-26SUM, AVG, MIN, MAX, COUNT (with/without DISTINCT), VARIANCE, STDDEV, ... A query

Piemēri klona tabulas izmantošanai laika datu gadījumā

select ID, ALGA, LAG(ALGA,1) OVER (PARTITION BY UZV ORDER BY ALGA)from D_ALGAwhere VIENIBA = 1;

ID ALGA LAG(ALGA,1)OVER(PARTITIONBYUZVORDERBYALGA)-------------------------------------------------------------------------------------------------- 1 150 2 160 150 3 160 160 4 160 160 5 170 160

17

Page 18: Analytic Functions - Datu bāzes tehnoloģijas | Izglītība …€¦  · Web view · 2016-10-26SUM, AVG, MIN, MAX, COUNT (with/without DISTINCT), VARIANCE, STDDEV, ... A query

Piemēri klona tabulas izmantošanai laika datu gadījumā

select ID, ALGA, LEAD(ALGA,1) OVER (PARTITION BY UZV ORDER BY ALGA)from D_ALGAwhere VIENIBA = 1;

ID ALGA LEAD(ALGA,1)OVER(PARTITIONBYUZVORDERBYALGA)-------------------------------------------------------------------------------------------------- 1 150 160 2 160 160 3 160 160 4 160 170 5 170

18