hive常用需求写法
hive常用需求分析
会一直更新 值得收藏
1. hive中划分年龄段函数用法
第一种 case when
case
when age >= 0 and age <= 10 then '0-10'
when age > 10 and age <= 20 then '10-20'
when age > 20 and age <= 30 then '20-30'
when age > 30 and age <= 40 then '30-40'
when age > 40 and age <= 50 then '40-50'
when age > 50 and age <= 60 then '50-60'
when age > 60 and age <= 70 then '60-70'
end age_phase
第二种 数学函数拼接
select concat(floor(20 / 10) * 10, '-', (floor(20 / 10) + 1) * 10) as x;
2. 求连续登录的活跃用户
方法一:窗口函数按照用户id分组 日期减去出现的次数 相同日期则为连续登录 最后 hiving 筛选 >= 为连续登录两天的用户
select user_id,
ds,
max(age) as age,
count(2) as cnt
from (select *,
date_sub(t2.dt, rn) as ds
from (select t1.dt,
t1.user_id,
t1.age,
row_number() over ( partition by t1.user_id order by t1.dt asc ) as rn
from (select distinct dt, user_id, age
from test5) t1) t2) t3
group by ds, user_id
having cnt >= 2