1661. 每台机器的进程平均运行时间
表: Activity
+----------------+---------+ | Column Name | Type | +----------------+---------+ | machine_id | int | | process_id | int | | activity_type | enum | | timestamp | float | +----------------+---------+ 该表展示了一家工厂网站的用户活动. (machine_id, process_id, activity_type) 是当前表的主键. machine_id 是一台机器的ID号. process_id 是运行在各机器上的进程ID号. activity_type 是枚举类型 ('start', 'end'). timestamp 是浮点类型,代表当前时间(以秒为单位). 'start' 代表该进程在这台机器上的开始运行时间戳 , 'end' 代表该进程在这台机器上的终止运行时间戳. 同一台机器,同一个进程都有一对开始时间戳和结束时间戳,而且开始时间戳永远在结束时间戳前面.
现在有一个工厂网站由几台机器运行,每台机器上运行着相同数量的进程. 请写出一条SQL计算每台机器各自完成一个进程任务的平均耗时.
完成一个进程任务的时间指进程的'end' 时间戳
减去 'start' 时间戳
. 平均耗时通过计算每台机器上所有进程任务的总耗费时间除以机器上的总进程数量获得.
结果表必须包含machine_id(机器ID)
和对应的 average time(平均耗时) 别名 processing_time
, 且四舍五入保留3位小数.
以 任意顺序 返回表。
具体参考例子如下。
示例 1:
输入: Activity table: +------------+------------+---------------+-----------+ | machine_id | process_id | activity_type | timestamp | +------------+------------+---------------+-----------+ | 0 | 0 | start | 0.712 | | 0 | 0 | end | 1.520 | | 0 | 1 | start | 3.140 | | 0 | 1 | end | 4.120 | | 1 | 0 | start | 0.550 | | 1 | 0 | end | 1.550 | | 1 | 1 | start | 0.430 | | 1 | 1 | end | 1.420 | | 2 | 0 | start | 4.100 | | 2 | 0 | end | 4.512 | | 2 | 1 | start | 2.500 | | 2 | 1 | end | 5.000 | +------------+------------+---------------+-----------+ 输出: +------------+-----------------+ | machine_id | processing_time | +------------+-----------------+ | 0 | 0.894 | | 1 | 0.995 | | 2 | 1.456 | +------------+-----------------+ 解释: 一共有3台机器,每台机器运行着两个进程. 机器 0 的平均耗时: ((1.520 - 0.712) + (4.120 - 3.140)) / 2 = 0.894 机器 1 的平均耗时: ((1.550 - 0.550) + (1.420 - 0.430)) / 2 = 0.995 机器 2 的平均耗时: ((4.512 - 4.100) + (5.000 - 2.500)) / 2 = 1.456
原站题解
pythondata 解法, 执行用时: 288 ms, 内存消耗: 60 MB, 提交时间: 2023-09-17 10:30:17
import pandas as pd def get_average_time(activity: pd.DataFrame) -> pd.DataFrame: activity["start_time"] = activity.groupby(["machine_id","process_id"])["timestamp"].transform('min') activity["processing_time"] = activity["timestamp"] -activity["start_time"] df = activity[activity["processing_time"]>0].groupby("machine_id")["processing_time"].agg("mean").reset_index() df["processing_time"] = round(df["processing_time"],3) return df ''' 使用 pivot 将 type 展开,即 start 变一列,end 变一列 算两列的时间差 按机器grouoby算平均值 四舍五入用 round ''' def get_average_time2(activity: pd.DataFrame) -> pd.DataFrame: pivot = pd.pivot(activity, index=['machine_id', 'process_id'], columns='activity_type', values='timestamp').reset_index() pivot['run_time'] = pivot['end'] - pivot['start'] ans = pivot.groupby('machine_id')['run_time'].mean().reset_index() ans['run_time'] = ans['run_time'].round(3) return ans.rename(columns={'run_time':'processing_time'})
mysql 解法, 执行用时: 157 ms, 内存消耗: 0 B, 提交时间: 2023-04-02 11:36:41
# Write your MySQL query statement below SELECT machine_id AS 'machine_id', ROUND( SUM(IF(activity_type = 'start', -timestamp, timestamp)) / COUNT(DISTINCT process_id) ,3) AS 'processing_time' FROM Activity GROUP BY machine_id;
mysql 解法, 执行用时: 181 ms, 内存消耗: 0 B, 提交时间: 2023-04-02 11:36:28
# Write your MySQL query statement below SELECT machine_id AS 'machine_id', ROUND( SUM(IF(activity_type = 'start', -timestamp, timestamp)) / COUNT(*) * 2 ,3) AS 'processing_time' FROM Activity GROUP BY machine_id;
mysql 解法, 执行用时: 164 ms, 内存消耗: 0 B, 提交时间: 2023-04-02 11:36:10
# Write your MySQL query statement below SELECT a1.machine_id AS 'machine_id', ROUND(AVG(a2.timestamp - a1.timestamp) ,3) AS 'processing_time' FROM Activity AS a1 INNER JOIN Activity AS a2 ON a1.machine_id = a2.machine_id AND a1.process_id = a2.process_id AND a1.activity_type = 'start' AND a2.activity_type = 'end' GROUP BY a1.machine_id;
mysql 解法, 执行用时: 223 ms, 内存消耗: 0 B, 提交时间: 2023-04-02 11:35:45
# Write your MySQL query statement below SELECT machine_id, round(avg(times) * 2, 3) processing_time FROM (SELECT machine_id, process_id, IF(activity_type = 'start', round(-`timestamp`, 3), round(`timestamp`, 3)) times from Activity ) t GROUP BY machine_id;
mysql 解法, 执行用时: 150 ms, 内存消耗: 0 B, 提交时间: 2023-04-02 11:35:25
# Write your MySQL query statement below select machine_id, round(sum(case when activity_type = 'end' then timestamp else -timestamp end)/count(distinct process_id), 3) as processing_time from activity group by machine_id;
mysql 解法, 执行用时: 157 ms, 内存消耗: 0 B, 提交时间: 2023-04-02 11:35:09
# Write your MySQL query statement below select machine_id, round(avg(if(activity_type='start', -timestamp, timestamp))*2, 3) processing_time from Activity group by machine_id;