Hubble支持在选择查询返回的行的子集上进行函数操作。即为为窗口函数,它允许您通过一次操作多个行来计算值。窗口函数操作的行子集称为窗口框架。 有关受支持的窗口函数的完整列表,请参见函数内容。所有聚合函数也可以用作窗口函数。有关更多信息,请参见下面的示例。
窗口使用over
或 window
关键字定义,窗口函数,也叫OLAP函数(Online Anallytical Processing,联机分析处理),可以对数据库数据进行实时分析处理。
<窗口函数> over (
partition by <用于分组的列名>
order by <用于排序的列名>
)
因为窗口函数是对where
或者group by
子句处理后的结果进行操作,所以窗口函数原则上只能写在select
子句中。
语法
- windowFunction
参数介绍
参数 | 说明 |
---|---|
window_name | 新窗口的名称 |
existing_window_name | 现有窗口框架的可选名称,在不同的窗口定义中定义。 |
partition_clause | 可选的PARTITION BY 子句。 |
sort_clause | 可选的ORDER BY 子句。 |
frame_clause | 可选的框架子句。 |
使用窗口函数最重要的部分是理解窗口函数将在框架中操作哪些数据。默认情况下,窗口框架包含分区的所有行。如果对分区进行排序,默认框架将包含从分区的第一行到当前行的所有行。换句话说,在创建窗口框架时添加一个ORDER BY
子句(例如,PARTITION BY x ORDER by y
)会产生以下效果:
它使窗口框架内的行有序。
它改变了函数调用的行,不再是窗口框架中的所有行,而是'第一行'和当前行之间的一个子集。
您应该了解作为窗口函数使用的任何聚合函数的行为。如果您没有看到您期望从窗口函数中看到的结果,这种行为可以解释原因。您可能需要在窗口定义中显式地指定框架边界。
在日常工作中,经常会遇到需要在每组内排名,比如下面的业务需求:
面对这类需求,就需要使用sql的高级功能窗口函数了。
窗口功能通过以下方式工作:
OVER
子句中直接在窗口函数之后定义窗口框架,也可以在window
子句中作为选择查询的一部分定义窗口框架。例如,考虑一个查询,其中为每个窗口函数调用定义了窗口:
select *, rank() over (partition by class_no order by fenshu desc) as ranking from score;
其操作可以描述如下:
rank() OVER ()
在包含查询输出的所有行的窗口框架上进行操作。rank() OVER (PARTITION BY class_no order by fenshu)
轮流操作多个窗框,每个框架包含不同班级分区的fenshu
列rank() over()
在实际项目中有较多的应用,相比于dense_rank() over()
,row_number() over()
能够更准的把握分区后取值,以下有示例对比。
比较 rank() over()
,dense_rank() over()
,row_number() over()
create table score(study_no varchar(10),class_no int,fenshu int);
insert into score values('0002',1,95);
insert into score values('0008',1,84);
insert into score values('0001',1,86);
insert into score values('0004',1,86);
insert into score values('0003',2,89);
insert into score values('0005',2,86);
insert into score values('0006',3,92);
insert into score values('0007',3,86);
rank() over()
select *, rank() over (partition by class_no order by fenshu desc) as ranking from score;
study_no | class_no | fenshu | ranking
+----------+----------+--------+---------+
0002 | 1 | 95 | 1
0008 | 1 | 84 | 4
0001 | 1 | 86 | 2
0004 | 1 | 86 | 2
0003 | 2 | 89 | 1
0005 | 2 | 86 | 2
0006 | 3 | 92 | 1
0007 | 3 | 86 | 2
dense_rank() over()
select *,dense_rank() over (partition by class_no order by fenshu desc) as dese_rank from score;
study_no | class_no | fenshu | dese_rank
+----------+----------+--------+-----------+
0002 | 1 | 95 | 1
0008 | 1 | 84 | 3
0001 | 1 | 86 | 2
0004 | 1 | 86 | 2
0003 | 2 | 89 | 1
0005 | 2 | 86 | 2
0006 | 3 | 92 | 1
0007 | 3 | 86 | 2
row_number() over()
select *,row_number() over (partition by class_no order by fenshu desc) as row_num from score;
study_no | class_no | fenshu | row_num
+----------+----------+--------+---------+
0002 | 1 | 95 | 1
0008 | 1 | 84 | 4
0001 | 1 | 86 | 2
0004 | 1 | 86 | 3
0003 | 2 | 89 | 1
0005 | 2 | 86 | 2
0006 | 3 | 92 | 1
0007 | 3 | 86 | 2
select *,
avg(fenshu) over (order by study_no) as current_avg,
count(fenshu) over (order by study_no) as current_count,
max(fenshu) over (order by study_no) as current_max,
min(fenshu) over (order by study_no) as current_min
from score;
study_no | class_no | fenshu | current_avg | current_count | current_max | current_min
-----------+----------+--------+-----------------------+---------------+-------------+--------------
0002 | 1 | 95 | 90.5 | 2 | 95 | 86
0008 | 1 | 84 | 88 | 8 | 95 | 84
0001 | 1 | 86 | 86 | 1 | 86 | 86
0004 | 1 | 86 | 89 | 4 | 95 | 86
0003 | 2 | 89 | 90 | 3 | 95 | 86
0005 | 2 | 86 | 88.4 | 5 | 95 | 86
0006 | 3 | 92 | 89 | 6 | 95 | 86
0007 | 3 | 86 | 88.571428571428571429 | 7 | 95 | 86
以下示例根据emp
消费信息和dept
部门信息关系查询相应的数据
SELECT * FROM
(SELECT distinct(t.dname) as "dname",
COUNT(*) OVER (PARTITION BY t.dname) AS deptnum
FROM dept t JOIN emp e ON e.deptno = t.deptno)
ORDER BY deptnum desc limit 3;
dname | deptnum
-----------+-----------
RESEARCH | 10
ACCOUNTING | 9
SALES | 6
SELECT DISTINCT t.dname,
SUM(e.sal) OVER (PARTITION BY t.dname) AS "total revenue"
FROM dept t JOIN emp e ON e.deptno = t.deptno
ORDER BY "total revenue"
dname | total revenue
-----------+-----------
RESEARCH | 34904.55
ACCOUNTING | 66360.63
SALES | 18061.75
OPERATIONS | 5381.50
BOSS | 1000.00
SELECT DISTINCT t.dname,
COUNT(*) OVER w AS "number of all",
AVG(e.sal) OVER w AS "average revenue"
FROM dept t JOIN emp e ON e.deptno = t.deptno
WINDOW w AS (PARTITION BY t.dname)
ORDER BY "average revenue" DESC, "number of all" ASC
limit 3;
dname | number of all | average revenue
-----------+------------------+--------------
ACCOUNTING | 9 | 7373.40
OPERATIONS | 1 | 5381.50
RESEARCH | 10 | 3490.46
SELECT
COUNT(a.empno) AS "total people",
SUM("total rider revenue") AS "total revenue" FROM (
SELECT DISTINCT e.empno as empno,
SUM(e.sal) OVER (PARTITION BY e.empno) AS "total rider revenue"
FROM dept t JOIN emp e ON e.deptno = t.deptno
ORDER BY "total rider revenue" DESC) a;
total people | total revenue
-------------------+-----------------
27 | 125708.43
select distinct d.dname, COUNT(*) over (partition by d.dname) as dc
from emp e left join dept d on e.deptno=d.deptno order by dc desc;
dname | dc
-----------+-----------
RESEARCH | 10
ACCOUNTING | 9
SALES | 6
OPERATIONS | 1
BOSS | 1