Commit 72906dc (1 parent: d55bf23)
1 file changed: +3 -3 lines changed
```
df_user_register.sample(10)
```
- ![image](https://github.com/hellobilllee/ActiveUserPrediction/tree/master/photos/sample.JPG)
+ ![ScreenShot](photos/sample.JPG)
Use the pandas describe() function to get basic summary statistics of the data. For example:
``` python
des_user_register = df_user_register.describe(include="all")
```
- ![image](https://github.com/hellobilllee/ActiveUserPrediction/tree/master/photos/describe.JPG)
+ ![ScreenShot](photos/describe.JPG)
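As a rough illustration of the describe() call above, here is a minimal sketch on a tiny made-up frame. The data, the comma-separated layout, and the column names `user_id`, `register_type` and `device_type` are assumptions for demonstration only; only `register_day` appears in the original. It shows how columns read as str get count/unique/top/freq, while numeric columns get mean/std/min/max.

```python
import io

import pandas as pd

# Hypothetical stand-in for the registration log. The values are made up,
# and every column name except register_day is an assumption.
raw = io.StringIO(
    "user_id,register_day,register_type,device_type\n"
    "1,1,0,283\n"
    "2,1,1,12\n"
    "3,2,1,283\n"
    "4,30,3,7\n"
)

# Read the categorical columns as str so pandas does not treat them as numbers.
df_demo = pd.read_csv(
    raw,
    dtype={"user_id": str, "register_type": str, "device_type": str},
)

# include="all" reports categorical and numeric statistics in one table;
# cells that do not apply to a column (e.g. mean of a str column) are NaN.
des_demo = df_demo.describe(include="all")
print(des_demo)
```

If the categorical columns were left as integers, describe() would instead report a mean and standard deviation for them, which is rarely meaningful for type codes.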
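
The output shows that registration dates span 30 days, i.e. one month of data; there are 12 registration types and over a thousand device types. Note that for categorical features, the feature's dtype must be explicitly set to str when reading the data; then, with the include parameter of describe() set to all, you get summary statistics for both categorical and numeric features. The code for reading the registration log follows: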
``` python
@@ -56,7 +56,7 @@ df_user_register.sample(10)
```
df_user_register['register_day'].value_counts()
```
- ![image](https://github.com/hellobilllee/ActiveUserPrediction/tree/master/photos/value_count.JPG)
+ ![ScreenShot](photos/value_count.JPG)

For more visual analysis, seaborn is recommended: