# Write your MySQL query statement below
1141. 查询近30天活跃用户数
活动记录表:Activity
+---------------+---------+ | Column Name | Type | +---------------+---------+ | user_id | int | | session_id | int | | activity_date | date | | activity_type | enum | +---------------+---------+ 该表是用户在社交网站的活动记录。 该表没有主键,可能包含重复数据。 activity_type 字段为以下四种值 ('open_session', 'end_session', 'scroll_down', 'send_message')。 每个 session_id 只属于一个用户。
请写SQL查询出截至 2019-07-27
(包含2019-07-27),近 30
天的每日活跃用户数(当天只要有一条活动记录,即为活跃用户)。
以 任意顺序 返回结果表。
查询结果示例如下。
示例 1:
输入: Activity table: +---------+------------+---------------+---------------+ | user_id | session_id | activity_date | activity_type | +---------+------------+---------------+---------------+ | 1 | 1 | 2019-07-20 | open_session | | 1 | 1 | 2019-07-20 | scroll_down | | 1 | 1 | 2019-07-20 | end_session | | 2 | 4 | 2019-07-20 | open_session | | 2 | 4 | 2019-07-21 | send_message | | 2 | 4 | 2019-07-21 | end_session | | 3 | 2 | 2019-07-21 | open_session | | 3 | 2 | 2019-07-21 | send_message | | 3 | 2 | 2019-07-21 | end_session | | 4 | 3 | 2019-06-25 | open_session | | 4 | 3 | 2019-06-25 | end_session | +---------+------------+---------------+---------------+ 输出: +------------+--------------+ | day | active_users | +------------+--------------+ | 2019-07-20 | 2 | | 2019-07-21 | 2 | +------------+--------------+ 解释:注意非活跃用户的记录不需要展示。
原站题解
pythondata 解法, 执行用时: 276 ms, 内存消耗: 60.8 MB, 提交时间: 2023-09-17 10:34:42
import pandas as pd ''' 先筛选出符合要求的时间,顺便改下列名 使用 groupby 按日期聚合, 使用 nunique 计算 unique 的数量 ''' def user_activity(activity: pd.DataFrame) -> pd.DataFrame: activity = activity[(activity['activity_date']>'2019-06-27') & (activity['activity_date']<='2019-07-27')] \ .rename(columns={'activity_date':'day','user_id':'active_users'}) return activity.groupby('day')['active_users'].nunique().reset_index() def user_activity2(activity: pd.DataFrame) -> pd.DataFrame: #一个用户在同一天只能算一个活跃用户 activity.drop_duplicates(inplace = True,subset = ["user_id","activity_date"],keep = "first") #限制日期范围 activity = activity[(activity['activity_date'] <= pd.to_datetime("2019-07-27")) & (activity['activity_date'] >= pd.to_datetime("2019-6-28"))] #统计活跃用户数 date_count = activity.value_counts("activity_date").reset_index().rename(axis = 1,mapper = {"activity_date":"day","count":"active_users"}) return date_count
mysql 解法, 执行用时: 539 ms, 内存消耗: 0 B, 提交时间: 2022-06-06 10:21:18
# Write your MySQL query statement below select activity_date as day, count(distinct user_id) as active_users from activity where activity_date between '2019-06-28' and '2019-07-27' group by activity_date;
mysql 解法, 执行用时: 488 ms, 内存消耗: 0 B, 提交时间: 2022-05-27 15:05:45
# Write your MySQL query statement below select activity_date as day, count(distinct user_id) as active_users from activity where activity_date between '2019-06-28' and '2019-07-27' group by activity_date;