列表

详情


1142. 过去30天的用户活动 II

Activity 表:

+---------------+---------+
| Column Name   | Type    |
+---------------+---------+
| user_id       | int     |
| session_id    | int     |
| activity_date | date    |
| activity_type | enum    |
+---------------+---------+
该表没有主键,它可能有重复的行。
activity_type 列是 ENUM 类型,可以取(“ open_session”,“ end_session”,“ scroll_down”,“ send_message”)四种活动类型之一。
该表显示了社交媒体网站的用户活动。
请注意,每个会话只属于一个用户。

 

编写解决方案,统计截至 2019-07-27(含)的 30 天内每个用户的平均会话数,四舍五入到小数点后两位。只统计那些会话期间用户至少进行一项活动的有效会话。

结果格式如下例所示。

 

示例:

输入:
Activity 表:
+---------+------------+---------------+---------------+
| user_id | session_id | activity_date | activity_type |
+---------+------------+---------------+---------------+
| 1       | 1          | 2019-07-20    | open_session  |
| 1       | 1          | 2019-07-20    | scroll_down   |
| 1       | 1          | 2019-07-20    | end_session   |
| 2       | 4          | 2019-07-20    | open_session  |
| 2       | 4          | 2019-07-21    | send_message  |
| 2       | 4          | 2019-07-21    | end_session   |
| 3       | 2          | 2019-07-21    | open_session  |
| 3       | 2          | 2019-07-21    | send_message  |
| 3       | 2          | 2019-07-21    | end_session   |
| 3       | 5          | 2019-07-21    | open_session  |
| 3       | 5          | 2019-07-21    | scroll_down   |
| 3       | 5          | 2019-07-21    | end_session   |
| 4       | 3          | 2019-06-25    | open_session  |
| 4       | 3          | 2019-06-25    | end_session   |
+---------+------------+---------------+---------------+
输出:
+---------------------------+ 
| average_sessions_per_user |
+---------------------------+ 
| 1.33                      |
+---------------------------+
解释:用户 1 和 2 每人在过去 30 天有 1 个会话,而用户 3 有 2 个会话。所以平均是 (1 + 1 + 2) / 3 = 1.33 。

原站题解

去查看

上次编辑到这里,代码来自缓存 点击恢复默认模板
# Write your MySQL query statement below

pythondata 解法, 执行用时: 312 ms, 内存消耗: 61.4 MB, 提交时间: 2023-10-15 12:58:03

import pandas as pd

def user_activity(activity: pd.DataFrame) -> pd.DataFrame:
    activity.loc[activity['activity_date'].between('2019-06-28','2019-07-27','both'),'final'] = 'true'
    activity = activity.loc[activity['final']=='true']    
    if len(activity) == 0:
        return pd.DataFrame(data=[0.00],columns=['average_sessions_per_user'])
    per = round(len(activity.groupby('session_id').agg('first'))/len(activity.groupby('user_id').agg('first')),2)
    return pd.DataFrame(data=[per],columns=['average_sessions_per_user'])

mysql 解法, 执行用时: 298 ms, 内存消耗: 0 B, 提交时间: 2023-10-15 12:57:33

# Write your MySQL query statement below
SELECT IFNULL(ROUND(COUNT(DISTINCT session_id) / COUNT(DISTINCT user_id), 2), 0) AS average_sessions_per_user
FROM Activity
WHERE DATEDIFF('2019-07-27', activity_date) < 30
-- WHERE activity_date BETWEEN '2019-06-28' AND '2019-07-27'

上一题