每台机器的进程平均运行时间

1661. 每台机器的进程平均运行时间

表: Activity

+----------------+---------+
| Column Name    | Type    |
+----------------+---------+
| machine_id     | int     |
| process_id     | int     |
| activity_type  | enum    |
| timestamp      | float   |
+----------------+---------+
该表展示了一家工厂网站的用户活动.
(machine_id, process_id, activity_type) 是当前表的主键.
machine_id 是一台机器的ID号.
process_id 是运行在各机器上的进程ID号.
activity_type 是枚举类型 ('start', 'end').
timestamp 是浮点类型,代表当前时间(以秒为单位).
'start' 代表该进程在这台机器上的开始运行时间戳 , 'end' 代表该进程在这台机器上的终止运行时间戳.
同一台机器，同一个进程都有一对开始时间戳和结束时间戳，而且开始时间戳永远在结束时间戳前面.

现在有一个工厂网站由几台机器运行，每台机器上运行着相同数量的进程. 请写出一条SQL计算每台机器各自完成一个进程任务的平均耗时.

完成一个进程任务的时间指进程的'end' 时间戳 减去 'start' 时间戳. 平均耗时通过计算每台机器上所有进程任务的总耗费时间除以机器上的总进程数量获得.

结果表必须包含machine_id（机器ID） 和对应的 average time（平均耗时） 别名 processing_time, 且四舍五入保留3位小数.

以 任意顺序 返回表。

具体参考例子如下。

示例 1:

输入：
Activity table:
+------------+------------+---------------+-----------+
| machine_id | process_id | activity_type | timestamp |
+------------+------------+---------------+-----------+
| 0          | 0          | start         | 0.712     |
| 0          | 0          | end           | 1.520     |
| 0          | 1          | start         | 3.140     |
| 0          | 1          | end           | 4.120     |
| 1          | 0          | start         | 0.550     |
| 1          | 0          | end           | 1.550     |
| 1          | 1          | start         | 0.430     |
| 1          | 1          | end           | 1.420     |
| 2          | 0          | start         | 4.100     |
| 2          | 0          | end           | 4.512     |
| 2          | 1          | start         | 2.500     |
| 2          | 1          | end           | 5.000     |
+------------+------------+---------------+-----------+
输出：
+------------+-----------------+
| machine_id | processing_time |
+------------+-----------------+
| 0          | 0.894           |
| 1          | 0.995           |
| 2          | 1.456           |
+------------+-----------------+
解释：
一共有3台机器,每台机器运行着两个进程.
机器 0 的平均耗时: ((1.520 - 0.712) + (4.120 - 3.140)) / 2 = 0.894
机器 1 的平均耗时: ((1.550 - 0.550) + (1.420 - 0.430)) / 2 = 0.995
机器 2 的平均耗时: ((4.512 - 4.100) + (5.000 - 2.500)) / 2 = 1.456

原站题解

去查看

上次编辑到这里，代码来自缓存点击恢复默认模板

# Write your MySQL query statement below

pythondata 解法, 执行用时: 288 ms, 内存消耗: 60 MB, 提交时间: 2023-09-17 10:30:17

import pandas as pd

def get_average_time(activity: pd.DataFrame) -> pd.DataFrame:
    activity["start_time"] = activity.groupby(["machine_id","process_id"])["timestamp"].transform('min')
    activity["processing_time"] = activity["timestamp"] -activity["start_time"]
    df = activity[activity["processing_time"]>0].groupby("machine_id")["processing_time"].agg("mean").reset_index()
    df["processing_time"] = round(df["processing_time"],3)
    return df
   
'''
使用 pivot 将 type 展开，即 start 变一列，end 变一列
算两列的时间差
按机器grouoby算平均值
四舍五入用 round
'''
def get_average_time2(activity: pd.DataFrame) -> pd.DataFrame:
    pivot = pd.pivot(activity, index=['machine_id', 'process_id'], columns='activity_type', values='timestamp').reset_index()
    pivot['run_time'] = pivot['end'] - pivot['start']
    ans = pivot.groupby('machine_id')['run_time'].mean().reset_index()
    ans['run_time'] = ans['run_time'].round(3)
    return ans.rename(columns={'run_time':'processing_time'})

mysql 解法, 执行用时: 157 ms, 内存消耗: 0 B, 提交时间: 2023-04-02 11:36:41

# Write your MySQL query statement below
SELECT machine_id AS 'machine_id',
    ROUND(
        SUM(IF(activity_type = 'start', -timestamp, timestamp))
        / COUNT(DISTINCT process_id) 
        ,3) AS 'processing_time'
FROM Activity
GROUP BY machine_id;

mysql 解法, 执行用时: 181 ms, 内存消耗: 0 B, 提交时间: 2023-04-02 11:36:28

# Write your MySQL query statement below
SELECT machine_id AS 'machine_id',
    ROUND(
        SUM(IF(activity_type = 'start', -timestamp, timestamp))
        / COUNT(*) 
        * 2
        ,3) AS 'processing_time'
FROM Activity
GROUP BY machine_id;

mysql 解法, 执行用时: 164 ms, 内存消耗: 0 B, 提交时间: 2023-04-02 11:36:10

# Write your MySQL query statement below
SELECT a1.machine_id AS 'machine_id',
    ROUND(AVG(a2.timestamp - a1.timestamp) 
        ,3) AS 'processing_time'
FROM Activity AS a1
    INNER JOIN Activity AS a2
    ON a1.machine_id = a2.machine_id
        AND a1.process_id = a2.process_id  
        AND a1.activity_type = 'start' AND a2.activity_type = 'end'
GROUP BY a1.machine_id;

mysql 解法, 执行用时: 223 ms, 内存消耗: 0 B, 提交时间: 2023-04-02 11:35:45

# Write your MySQL query statement below
SELECT
machine_id, round(avg(times) * 2, 3) processing_time
FROM
(SELECT
    machine_id, process_id,
    IF(activity_type = 'start', round(-`timestamp`, 3), round(`timestamp`, 3)) times
    from
    Activity
) t
GROUP BY machine_id;

mysql 解法, 执行用时: 150 ms, 内存消耗: 0 B, 提交时间: 2023-04-02 11:35:25

# Write your MySQL query statement below
select machine_id,
    round(sum(case when activity_type = 'end' then timestamp else -timestamp end)/count(distinct process_id), 3) as processing_time
from activity
group by machine_id;

mysql 解法, 执行用时: 157 ms, 内存消耗: 0 B, 提交时间: 2023-04-02 11:35:09

# Write your MySQL query statement below
select
    machine_id,
    round(avg(if(activity_type='start', -timestamp, timestamp))*2, 3) processing_time 
from Activity 
group by machine_id;

上一题

下一题

详情