[Basic Tools] S02E17 How to Use Pivot Tables

# 0. Episode overview

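1. Basic usage of pivot tables
2. Grouping rows and columns in higher dimensions with pivot tables
3. Observing multiple attributes and applying custom aggregation functions with pivot tables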

# 1. Background: why use pivot tables

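import numpy as np
import pandas as pd
import seaborn as sns

# Load the Titanic dataset that ships with seaborn (downloaded on first use).
titanic = sns.load_dataset('titanic')
print(titanic.head())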

   survived  pclass     sex   age  sibsp  parch     fare embarked  class  \
0         0       3    male  22.0      1      0   7.2500        S  Third
1         1       1  female  38.0      1      0  71.2833        C  First
2         1       3  female  26.0      0      0   7.9250        S  Third
3         1       1  female  35.0      1      0  53.1000        S  First
4         0       3    male  35.0      0      0   8.0500        S  Third

     who  adult_male deck  embark_town alive  alone
0    man        True  NaN  Southampton    no  False
1  woman       False    C    Cherbourg   yes  False
2  woman       False  NaN  Southampton   yes   True
3  woman       False    C  Southampton   yes  False
4    man        True  NaN  Southampton    no   True

# Survival rate grouped by sex
print(titanic.groupby('sex')['survived'].mean())

sex
female    0.742038
male      0.188908
Name: survived, dtype: float64

# Survival rate grouped by sex and passenger class (two-level index)
print(titanic.groupby(['sex','class'])['survived'].mean())

sex     class
female  First     0.968085
        Second    0.921053
        Third     0.500000
male    First     0.368852
        Second    0.157407
        Third     0.135447
Name: survived, dtype: float64

# Unstack the inner 'class' level into columns to get a two-dimensional table
print(titanic.groupby(['sex','class'])['survived'].mean().unstack())

class      First    Second     Third
sex
female  0.968085  0.921053  0.500000
male    0.368852  0.157407  0.135447

# 2. Grouping rows and columns in higher dimensions with pivot tables

# The same table in a single call; pivot_table aggregates with the mean by default
print(titanic.pivot_table('survived', index='sex', columns='class'))

class      First    Second     Third
sex
female  0.968085  0.921053  0.500000
male    0.368852  0.157407  0.135447
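
The pivot_table output above matches the earlier groupby + unstack result. As a minimal sketch of that equivalence, the snippet below compares the two routes on a small hand-made DataFrame. The toy data and the names `df`, `by_groupby`, and `by_pivot` are assumptions made for illustration, not part of the Titanic dataset, and were chosen so that the example runs without downloading anything.

```python
import pandas as pd

# Hypothetical toy data (NOT the Titanic dataset), small enough to check by hand.
df = pd.DataFrame({
    "sex":      ["female", "female", "male", "male", "male", "female"],
    "class":    ["First", "Third", "First", "Third", "Third", "First"],
    "survived": [1, 0, 1, 0, 1, 1],
})

# Route 1: group by two keys, take the mean, then move the inner key to columns.
by_groupby = df.groupby(["sex", "class"])["survived"].mean().unstack()

# Route 2: pivot_table with 'sex' as rows and 'class' as columns;
# the default aggregation function is the mean.
by_pivot = df.pivot_table("survived", index="sex", columns="class")

print(by_groupby)
print(by_pivot)
```

Both routes should print the same table of group means. pivot_table simply states the row key, the column key, and the value column in one call, which tends to read more easily once more grouping keys are involved.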