列表

详情


1141. 查询近30天活跃用户数

活动记录表:Activity

+---------------+---------+
| Column Name   | Type    |
+---------------+---------+
| user_id       | int     |
| session_id    | int     |
| activity_date | date    |
| activity_type | enum    |
+---------------+---------+
该表是用户在社交网站的活动记录。
该表没有主键,可能包含重复数据。
activity_type 字段为以下四种值 ('open_session', 'end_session', 'scroll_down', 'send_message')。
每个 session_id 只属于一个用户。

 

请写SQL查询出截至 2019-07-27(包含2019-07-27),近 30 天的每日活跃用户数(当天只要有一条活动记录,即为活跃用户)。

任意顺序 返回结果表。

查询结果示例如下。

 

示例 1:

输入:
Activity table:
+---------+------------+---------------+---------------+
| user_id | session_id | activity_date | activity_type |
+---------+------------+---------------+---------------+
| 1       | 1          | 2019-07-20    | open_session  |
| 1       | 1          | 2019-07-20    | scroll_down   |
| 1       | 1          | 2019-07-20    | end_session   |
| 2       | 4          | 2019-07-20    | open_session  |
| 2       | 4          | 2019-07-21    | send_message  |
| 2       | 4          | 2019-07-21    | end_session   |
| 3       | 2          | 2019-07-21    | open_session  |
| 3       | 2          | 2019-07-21    | send_message  |
| 3       | 2          | 2019-07-21    | end_session   |
| 4       | 3          | 2019-06-25    | open_session  |
| 4       | 3          | 2019-06-25    | end_session   |
+---------+------------+---------------+---------------+
输出:
+------------+--------------+ 
| day        | active_users |
+------------+--------------+ 
| 2019-07-20 | 2            |
| 2019-07-21 | 2            |
+------------+--------------+ 
解释:注意非活跃用户的记录不需要展示。

原站题解

去查看

上次编辑到这里,代码来自缓存 点击恢复默认模板
# Write your MySQL query statement below

pythondata 解法, 执行用时: 276 ms, 内存消耗: 60.8 MB, 提交时间: 2023-09-17 10:34:42

import pandas as pd

'''
先筛选出符合要求的时间,顺便改下列名
使用 groupby 按日期聚合, 使用 nunique 计算 unique 的数量
'''
def user_activity(activity: pd.DataFrame) -> pd.DataFrame:
    activity =  activity[(activity['activity_date']>'2019-06-27') & (activity['activity_date']<='2019-07-27')] \
    .rename(columns={'activity_date':'day','user_id':'active_users'})
    return activity.groupby('day')['active_users'].nunique().reset_index()
    


def user_activity2(activity: pd.DataFrame) -> pd.DataFrame:
    #一个用户在同一天只能算一个活跃用户
    activity.drop_duplicates(inplace = True,subset = ["user_id","activity_date"],keep = "first")
    #限制日期范围
    activity = activity[(activity['activity_date'] <= pd.to_datetime("2019-07-27")) & (activity['activity_date'] >= pd.to_datetime("2019-6-28"))]
    #统计活跃用户数
    date_count = activity.value_counts("activity_date").reset_index().rename(axis = 1,mapper = {"activity_date":"day","count":"active_users"})
    
    return date_count

mysql 解法, 执行用时: 539 ms, 内存消耗: 0 B, 提交时间: 2022-06-06 10:21:18

# Write your MySQL query statement below
select activity_date as day, count(distinct user_id) as active_users from activity where activity_date between '2019-06-28' and '2019-07-27' group by activity_date;

mysql 解法, 执行用时: 488 ms, 内存消耗: 0 B, 提交时间: 2022-05-27 15:05:45

# Write your MySQL query statement below
select activity_date as day, count(distinct user_id) as active_users from activity where activity_date between '2019-06-28' and '2019-07-27' group by activity_date;

上一题