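DA33. Nowcoder users who practiced for 3 or more consecutive days
Description
The dataset nowcoder.csv records daily practice activity on Nowcoder for December. It contains the following fields (separated by commas):
Find all users who practiced on 3 or more consecutive days in December 2021.
Output each user with a streak of 3 or more days together with the length of that streak. For the dataset above, the output is as follows: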
Python 3 solution, run time: 783ms, memory usage: 524288KB, submitted: 2022-07-11
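import pandas as pd
from datetime import timedelta

nowcoder = pd.read_csv('nowcoder.csv')
# Counts all records per user. This equals the streak length only when
# every record of a user falls on a distinct day within one unbroken run.
df = nowcoder.groupby('user_id')["user_id"].count()
df = df[df >= 3]
print(df)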
Python 3 solution, run time: 792ms, memory usage: 524288KB, submitted: 2022-07-14
import pandas as pd
from datetime import timedelta

nowcoder = pd.read_csv('nowcoder.csv')
nowcoder = nowcoder.loc[:, ['user_id', 'date']]
# Keep only the calendar day so several records on one day count once.
nowcoder['date'] = pd.to_datetime(nowcoder['date']).dt.date
nowcoder.drop_duplicates(inplace=True)
# Rank each user's days in order, then convert the rank to a day offset.
nowcoder['rk'] = nowcoder.groupby('user_id')['date'].rank(method='first')
nowcoder['rk'] = pd.to_timedelta(nowcoder['rk'], unit='d')
# Date minus rank is constant within a run of consecutive days.
nowcoder['d'] = nowcoder['date'] - nowcoder['rk']
df = nowcoder.groupby(['user_id', 'd'])['date'].count()
df = df.groupby('user_id').max()
print(df[df >= 3])
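The date-minus-rank step above can be checked on a toy table. The sketch below is only an illustration under stated assumptions: the in-memory rows are made up, and the helper name `longest_streaks` and the column name `anchor` are hypothetical, not part of the dataset or of any submission.

```python
import pandas as pd


def longest_streaks(df: pd.DataFrame, min_days: int = 3) -> pd.Series:
    """Longest run of consecutive practice days per user (hypothetical helper)."""
    days = df[["user_id", "date"]].copy()
    # Normalize to midnight so several rows on one day collapse to one.
    days["date"] = pd.to_datetime(days["date"]).dt.normalize()
    days = days.drop_duplicates()
    # Within a user, the n-th distinct day gets rank n.
    rank = days.groupby("user_id")["date"].rank(method="first")
    # On consecutive days, date and rank both grow by one, so their
    # difference stays constant; a gap shifts it to a new value.
    days["anchor"] = days["date"] - pd.to_timedelta(rank, unit="D")
    run_lengths = days.groupby(["user_id", "anchor"])["date"].count()
    longest = run_lengths.groupby("user_id").max()
    return longest[longest >= min_days]


# Made-up rows for illustration only.
toy = pd.DataFrame(
    {
        "user_id": [1, 1, 1, 1, 2, 2, 2, 3, 3, 3, 3],
        "date": [
            "2021-12-01 09:00:00", "2021-12-02 10:00:00",
            "2021-12-03 11:00:00", "2021-12-03 20:00:00",
            "2021-12-01 08:00:00", "2021-12-03 08:00:00",
            "2021-12-05 08:00:00",
            "2021-12-10 08:00:00", "2021-12-11 08:00:00",
            "2021-12-12 08:00:00", "2021-12-13 08:00:00",
        ],
    }
)
result = longest_streaks(toy)
print(result)
```

In this toy data, user 1 has four rows but only three distinct days, and user 2 practices every other day, so only users 1 and 3 are kept.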
Python 3 solution, run time: 803ms, memory usage: 524288KB, submitted: 2022-07-10
import pandas as pd
from datetime import timedelta

nowcoder = pd.read_csv('nowcoder.csv')
# import pandas as pd
# from datetime import timedelta
# nowcoder = pd.read_csv('nowcoder.csv')
df = nowcoder.groupby('user_id').user_id.count()
df = df[df >= 3]
print(df)
Python 3 solution, run time: 804ms, memory usage: 524288KB, submitted: 2022-07-12
import pandas as pd
from datetime import timedelta

nowcoder = pd.read_csv('nowcoder.csv')
df = nowcoder.groupby('user_id')['user_id'].count()
print(df[df >= 3])
Python 3 solution, run time: 805ms, memory usage: 524288KB, submitted: 2022-07-26
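import pandas as pd

nowcoder = pd.read_csv('nowcoder.csv')
# Parse to calendar days; the year-month string is used only as a filter.
nowcoder['day'] = pd.to_datetime(nowcoder['date']).dt.normalize()
nowcoder['date1'] = nowcoder['day'].dt.strftime('%Y-%m')
data = nowcoder.loc[nowcoder['date1'] == '2021-12', ['user_id', 'day']].drop_duplicates().copy()
# Rank distinct days per user; day minus rank is constant within a run of consecutive days.
data['rk'] = pd.to_timedelta(data.groupby('user_id')['day'].rank(method='first'), unit='D')
data['cha'] = data['day'] - data['rk']
data1 = data.groupby(['user_id', 'cha'])['day'].count().groupby('user_id').max()
print(data1[data1 >= 3])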
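A different way to find the same streaks, shown here only as a sketch, is to mark where each run starts with `diff` and number the runs with `cumsum`. This is a swapped-in technique, not the one used by the submissions above; the rows and the column name `streak_id` are assumptions made for illustration, and the table is assumed to hold one row per user per practice day.

```python
import pandas as pd

# Made-up rows for illustration only; one row per user per practice day.
toy = pd.DataFrame(
    {
        "user_id": [7, 7, 7, 7, 7, 8, 8],
        "date": pd.to_datetime(
            ["2021-12-01", "2021-12-02", "2021-12-04",
             "2021-12-05", "2021-12-06", "2021-12-20", "2021-12-22"]
        ),
    }
).sort_values(["user_id", "date"])

# Gap to the same user's previous practice day (NaT on each user's first row).
gap = toy.groupby("user_id")["date"].diff()
# A new streak starts wherever the gap is not exactly one day.
toy["streak_id"] = (gap != pd.Timedelta(days=1)).cumsum()
streak_lengths = toy.groupby(["user_id", "streak_id"]).size()
longest = streak_lengths.groupby("user_id").max()
qualified = longest[longest >= 3]
print(qualified)
```

In this toy data, user 7 has five rows but a longest streak of three days, so a plain row count per user would report a different number than the streak length.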