news 2026/5/29 19:44:30

SQL窗口函数(使用场景)

作者头像

张小明

前端开发工程师

1.2k 24
文章封面图
SQL窗口函数(使用场景)

1. 常见排名和排序

-- 行号 ROW_NUMBER() OVER (ORDER BY sales DESC) AS rank -- 排名(允许并列) RANK() OVER (ORDER BY sales DESC) AS rank -- 密集排名 DENSE_RANK() OVER (ORDER BY sales DESC) AS dense_rank -- 分位数 NTILE(4) OVER (ORDER BY salary DESC) AS quartile

2. 时间序列分析

-- 移动平均 SELECT date, sales, AVG(sales) OVER ( ORDER BY date ROWS BETWEEN 6 PRECEDING AND CURRENT ROW ) AS 7_day_avg FROM daily_sales; -- 环比增长 SELECT month, revenue, (revenue - LAG(revenue, 1) OVER (ORDER BY month)) / LAG(revenue, 1) OVER (ORDER BY month) * 100 AS growth_rate FROM monthly_revenue;

3. 累计计算

-- 累计求和 SELECT date, sales, SUM(sales) OVER ( ORDER BY date ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW ) AS running_total FROM sales; -- 累计百分比 SELECT customer, revenue, revenue / SUM(revenue) OVER () * 100 AS pct_total, SUM(revenue) OVER ( ORDER BY revenue DESC ROWS UNBOUNDED PRECEDING ) * 100.0 / SUM(revenue) OVER () AS cum_pct FROM customers;

4. 数据比较

-- 与前一行比较 SELECT date, temperature, temperature - LAG(temperature, 1) OVER (ORDER BY date) AS diff_prev, LEAD(temperature, 1) OVER (ORDER BY date) - temperature AS diff_next FROM weather_data; -- 与分组内第一行比较 SELECT department, employee, salary, FIRST_VALUE(salary) OVER (PARTITION BY department ORDER BY salary DESC) as top_salary, salary - FIRST_VALUE(salary) OVER (PARTITION BY department ORDER BY salary DESC) as gap_from_top FROM employees;

5. 高级分析场景

-- 会话分析(找出用户连续访问) SELECT user_id, login_time, LAG(logout_time) OVER (PARTITION BY user_id ORDER BY login_time) as prev_logout, CASE WHEN login_time <= LAG(logout_time) OVER (PARTITION BY user_id ORDER BY login_time) THEN 0 ELSE 1 END as is_new_session FROM user_sessions; -- 查找重复记录 SELECT *, ROW_NUMBER() OVER (PARTITION BY email, name ORDER BY created_at) as dup_count FROM users WHERE dup_count > 1; -- 计算留存率 WITH user_activity AS ( SELECT user_id, login_date, MIN(login_date) OVER (PARTITION BY user_id) as first_login, LEAD(login_date) OVER (PARTITION BY user_id ORDER BY login_date) as next_login FROM logins ) SELECT first_login as cohort_date, COUNT(DISTINCT user_id) as cohort_size, COUNT(DISTINCT CASE WHEN next_login = first_login + INTERVAL '1 day' THEN user_id END) as day1_retained FROM user_activity GROUP BY first_login;

6. 复杂业务场景

-- 漏斗分析 WITH funnel AS ( SELECT user_id, MAX(CASE WHEN event = 'view' THEN 1 ELSE 0 END) as viewed, MAX(CASE WHEN event = 'click' THEN 1 ELSE 0 END) as clicked, MAX(CASE WHEN event = 'purchase' THEN 1 ELSE 0 END) as purchased FROM events GROUP BY user_id ) SELECT COUNT(*) as total_users, SUM(viewed) as viewers, SUM(clicked) as clickers, SUM(purchased) as buyers, 100.0 * SUM(clicked) / NULLIF(SUM(viewed), 0) as click_rate FROM funnel; -- 间隔计算 SELECT user_id, event_time, EXTRACT(EPOCH FROM (event_time - LAG(event_time) OVER (PARTITION BY user_id ORDER BY event_time) )) as seconds_since_last_event FROM events;

7. 性能优化技巧

-- 避免自连接 -- 传统方式(需要自连接) SELECT a.id, a.value, MAX(b.value) as max_so_far FROM table a JOIN table b ON a.id >= b.id GROUP BY a.id, a.value; -- 使用窗口函数(更高效) SELECT id, value, MAX(value) OVER (ORDER BY id ROWS UNBOUNDED PRECEDING) as max_so_far FROM table;

最佳实践建议:

  1. 注意性能:窗口函数在大量数据上可能较慢,合理使用分区

  2. 结合索引:ORDER BY子句中的字段建议有索引

  3. 使用FILTER(如果数据库支持):

    AVG(salary) FILTER (WHERE department = 'Sales') OVER () as avg_sales_salary
  4. 明确窗口范围:明确指定ROWS或RANGE避免歧义

版权声明: 本文来自互联网用户投稿,该文观点仅代表作者本人,不代表本站立场。本站仅提供信息存储空间服务,不拥有所有权,不承担相关法律责任。如若内容造成侵权/违法违规/事实不符,请联系邮箱:809451989@qq.com进行投诉反馈,一经查实,立即删除!
网站建设 2026/5/22 13:52:49

LobeChat能否部署在边缘计算节点?低延迟场景应用

LobeChat能否部署在边缘计算节点&#xff1f;低延迟场景应用 在智能制造车间&#xff0c;一台工业机器人因传感器异常停机。现场工程师打开平板电脑&#xff0c;连接厂区本地网络中的AI助手&#xff0c;输入&#xff1a;“根据历史日志&#xff0c;分析最近三次类似报警的处理方…

作者头像 李华
网站建设 2026/5/30 9:13:40

金融领域语音识别技术的优化与应用

金融领域语音识别技术的优化与应用关键词&#xff1a;金融领域、语音识别技术、优化、应用、深度学习摘要&#xff1a;本文聚焦于金融领域语音识别技术的优化与应用。首先介绍了研究的背景、目的、预期读者等内容。接着阐述了语音识别的核心概念、联系及架构&#xff0c;详细讲…

作者头像 李华
网站建设 2026/5/26 5:32:50

Qwen3-14B Docker部署与Function Calling实战

Qwen3-14B Docker部署与Function Calling实战 在企业AI落地的今天&#xff0c;真正决定成败的早已不是“模型能不能生成一段漂亮的文案”&#xff0c;而是——它能不能读完一份20页的合同后指出风险条款&#xff1f;能不能看到发票就自动走报销流程&#xff1f;甚至&#xff0c…

作者头像 李华
网站建设 2026/5/29 14:21:01

宝塔面板下两个WordPress网站共用Memcached完整配置指南

宝塔面板下两个WordPress网站共用Memcached完整配置指南 在宝塔面板中部署多个WordPress&#xff08;简称WP&#xff09;网站后&#xff0c;通过Memcached实现内存缓存是提升站点性能的关键手段。Memcached可将WP的数据库查询结果、文章内容等常用数据暂存于内存&#xff0c;大…

作者头像 李华
网站建设 2026/5/27 11:53:05

【Maven】生命周期、依赖与继承

生命周期Maven的生命周期是对所有的构建过程进行抽象和统一。Maven的生命周期是抽象的&#xff0c;这意味着生命周期本身不做任何实际的工作&#xff0c;生命周期只是定义了一系列的阶段&#xff0c;并确定这些阶段的执行顺序。而在执行这些阶段时&#xff0c;实际的工作还是由…

作者头像 李华
网站建设 2026/5/27 22:40:05

Docker部署Qwen3-8B与vLLM推理加速实战

Docker部署Qwen3-8B与vLLM推理加速实战 在消费级显卡上跑通一个真正能用的大语言模型&#xff0c;曾是许多开发者遥不可及的梦想。但随着Qwen3-8B这类高性价比模型的出现&#xff0c;以及vLLM等高效推理框架的成熟&#xff0c;如今只需一块RTX 4090&#xff0c;就能搭建出响应迅…

作者头像 李华