MariaDB 的窗口函数有哪些?如何使用窗口函数进行数据分析?
MariaDB 的窗口函数(Window Functions)是强大的分析工具,允许在查询结果集上执行复杂的计算,同时保留原始行的详细信息。MariaDB 从 10.2 版本开始支持窗口函数。1. 窗口函数语法window_function_name(expression) OVER ( [PARTITION BY partition_expression] [ORDER BY sort_expression [ASC|DESC]] [FRAME_CLAUSE])2. 常用窗口函数排名函数-- ROW_NUMBER:为每一行分配唯一的序号SELECT employee_id, name, department, salary, ROW_NUMBER() OVER (PARTITION BY department ORDER BY salary DESC) AS rankFROM employees;-- RANK:相同值的行获得相同排名,可能有间隔SELECT employee_id, name, department, salary, RANK() OVER (PARTITION BY department ORDER BY salary DESC) AS rankFROM employees;-- DENSE_RANK:相同值的行获得相同排名,无间隔SELECT employee_id, name, department, salary, DENSE_RANK() OVER (PARTITION BY department ORDER BY salary DESC) AS rankFROM employees;聚合函数-- SUM:计算累计总和SELECT order_date, amount, SUM(amount) OVER (ORDER BY order_date) AS running_totalFROM orders;-- AVG:计算移动平均值SELECT order_date, amount, AVG(amount) OVER ( ORDER BY order_date ROWS BETWEEN 2 PRECEDING AND CURRENT ROW ) AS moving_avgFROM orders;-- COUNT:计算累计计数SELECT order_date, amount, COUNT(*) OVER (ORDER BY order_date) AS cumulative_countFROM orders;偏移函数-- LAG:访问前一行的值SELECT order_date, amount, LAG(amount, 1, 0) OVER (ORDER BY order_date) AS prev_amount, amount - LAG(amount, 1, 0) OVER (ORDER BY order_date) AS changeFROM orders;-- LEAD:访问后一行的值SELECT order_date, amount, LEAD(amount, 1, 0) OVER (ORDER BY order_date) AS next_amountFROM orders;-- FIRST_VALUE:获取分区的第一个值SELECT employee_id, name, department, salary, FIRST_VALUE(salary) OVER ( PARTITION BY department ORDER BY salary DESC ) AS highest_salaryFROM employees;-- LAST_VALUE:获取分区的最后一个值SELECT employee_id, name, department, salary, LAST_VALUE(salary) OVER ( PARTITION BY department ORDER BY salary DESC ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING ) AS lowest_salaryFROM employees;3. 窗口框架(Window Frame)-- ROWS:基于物理行SELECT order_date, amount, SUM(amount) OVER ( ORDER BY order_date ROWS BETWEEN 3 PRECEDING AND CURRENT ROW ) AS sum_4_rowsFROM orders;-- RANGE:基于逻辑值范围SELECT order_date, amount, SUM(amount) OVER ( ORDER BY order_date RANGE BETWEEN INTERVAL 7 DAY PRECEDING AND CURRENT ROW ) AS sum_7_daysFROM orders;-- GROUPS:基于分组SELECT order_date, amount, SUM(amount) OVER ( ORDER BY order_date GROUPS BETWEEN 1 PRECEDING AND CURRENT ROW ) AS sum_2_groupsFROM orders;4. 实际应用场景计算同比增长率SELECT YEAR(order_date) AS year, MONTH(order_date) AS month, SUM(amount) AS monthly_sales, LAG(SUM(amount), 12) OVER (ORDER BY YEAR(order_date), MONTH(order_date)) AS same_month_last_year, (SUM(amount) - LAG(SUM(amount), 12) OVER (ORDER BY YEAR(order_date), MONTH(order_date))) / LAG(SUM(amount), 12) OVER (ORDER BY YEAR(order_date), MONTH(order_date)) * 100 AS yoy_growthFROM ordersGROUP BY YEAR(order_date), MONTH(order_date);计算移动平均SELECT order_date, amount, AVG(amount) OVER ( ORDER BY order_date ROWS BETWEEN 6 PRECEDING AND CURRENT ROW ) AS moving_avg_7_daysFROM orders;找出前 N 名SELECT * FROM ( SELECT employee_id, name, department, salary, RANK() OVER (PARTITION BY department ORDER BY salary DESC) AS rank FROM employees) rankedWHERE rank <= 3;窗口函数提供了强大的数据分析能力,能够简化复杂的 SQL 查询,提高代码的可读性和维护性。