# 第一阶段模块三四 **Repository Path**: old_he/bigdata_1_3 ## Basic Information - **Project Name**: 第一阶段模块三四 - **Description**: No description available - **Primary Language**: Java - **License**: Apache-2.0 - **Default Branch**: master - **Homepage**: None - **GVP Project**: No ## Statistics - **Stars**: 0 - **Forks**: 0 - **Created**: 2020-10-28 - **Last Updated**: 2020-12-19 ## Categories & Tags **Categories**: Uncategorized **Tags**: None ## README # 第一阶段模块三四 #### 模块三 第一步: (select user_id,click_time, if(nvl( unix_timestamp(click_time,'yyyy-MM-dd HH:mm:ss') - lag(unix_timestamp(click_time,'yyyy-MM-dd HH:mm:ss')) over(partition by user_id order by click_time asc),0)/60>30 , 1, 0) flag from it1) a 计算出每行数据与上一行的时间差,时间差大于30分钟的,用1标记。 第二步: (select user_id,click_time,sum(flag) over(partition by user_id order by click_time) s from (select user_id,click_time, if(nvl( unix_timestamp(click_time,'yyyy-MM-dd HH:mm:ss') - lag(unix_timestamp(click_time,'yyyy-MM-dd HH:mm:ss')) over(partition by user_id order by click_time asc),0)/60>30 , 1, 0) flag from it1) a order by user_id,click_time) b 利用sum函数的特性,找出用户不同会话 第三步: select user_id,click_time, rank() over(partition by user_id,s order by click_time) r from (select user_id,click_time,sum(flag) over(partition by user_id order by click_time) s from (select user_id,click_time, if(nvl( unix_timestamp(click_time,'yyyy-MM-dd HH:mm:ss') - lag(unix_timestamp(click_time,'yyyy-MM-dd HH:mm:ss')) over(partition by user_id order by click_time asc),0)/60>30 , 1, 0) flag from it1) a order by user_id,click_time) b; 对不同用户,不同会话进行排序。 #### 模块四 软件架构说明