[Probability with Python] S05E07: The Law of Large Numbers and the Central Limit Theorem

### 2. The Law of Large Numbers

#### 2.1. The Principle

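Let $X_1, X_2, \dots, X_n$ be independent, identically distributed random variables with mean $\mu$ and variance $\sigma^2$, and let $M_n=\frac{X_1+X_2+\cdots+X_n}{n}$ be their sample mean. By linearity of expectation:

$E[M_n]=E\left[\frac{X_1+X_2+\cdots+X_n}{n}\right]$ $=\frac{1}{n}(E[X_1]+E[X_2]+\cdots+E[X_n])$ $=\frac{1}{n}\cdot n\cdot \mu=\mu=E[X_i]$

Because the $X_i$ are independent, the variance of their sum is the sum of their variances:

$\mathrm{var}[M_n]=\mathrm{var}\left[\frac{X_1+X_2+\cdots+X_n}{n}\right]$ $=\frac{1}{n^2}\mathrm{var}[X_1+X_2+\cdots+X_n]$ $=\frac{1}{n^2}(\mathrm{var}[X_1]+\mathrm{var}[X_2]+\cdots+\mathrm{var}[X_n])$ $=\frac{1}{n^2}\cdot n\cdot \sigma^2=\frac{\sigma^2}{n}$

So $M_n$ stays centered at $\mu$ while its variance shrinks toward 0 as $n$ grows.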
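The two identities for $E[M_n]$ and $var[M_n]$ can be checked numerically. The following is a minimal sketch, not part of the original post: the choice of a Binomial(10, 0.4) distribution, the sample size `n = 25`, the number of repetitions, and the seed are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed (assumption) so the run is repeatable

# Assumed setup: X_i ~ Binomial(10, 0.4), so mu = 4 and sigma^2 = 10 * 0.4 * 0.6 = 2.4
trials, p = 10, 0.4
mu = trials * p
sigma2 = trials * p * (1 - p)

n = 25             # number of draws averaged into each M_n
repeats = 200_000  # number of independent realisations of M_n

# Each row is one realisation of (X_1, ..., X_n); its row mean is one value of M_n
samples = rng.binomial(trials, p, size=(repeats, n))
m_n = samples.mean(axis=1)

print("E[M_n]   empirical:", m_n.mean(), " theory:", mu)
print("var[M_n] empirical:", m_n.var(), " theory:", sigma2 / n)
```

With these assumed parameters the theoretical values are $\mu = 4$ and $\sigma^2/n = 2.4/25 = 0.096$; the empirical values should land close to them, with small random deviations.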

#### 2.2. Two Important Inequalities

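For the sample mean $M_n$ and any $\epsilon > 0$:

$P(|M_n-\mu| \ge \epsilon) \le \frac{\sigma^2}{n \epsilon^2}$

This is Chebyshev's inequality applied to $M_n$. In its general form, for any random variable $X$ with mean $\mu$ and variance $\sigma^2$, and any $c > 0$:

$P(|X-\mu|\ge c) \le \frac{\sigma^2}{c^2}$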
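To get a feel for how loose or tight the bound on $P(|M_n-\mu| \ge \epsilon)$ is, one can compare it with a simulated tail probability. The sketch below is illustrative only: the Uniform(0, 1) distribution, the tolerance `eps = 0.05`, the values of `n`, and the helper name `tail_probability` are assumptions, not taken from the original post.

```python
import numpy as np

rng = np.random.default_rng(1)  # fixed seed (assumption)

# Assumed setup: X_i ~ Uniform(0, 1), so mu = 1/2 and sigma^2 = 1/12
mu, sigma2 = 0.5, 1.0 / 12.0
eps = 0.05
repeats = 5_000  # number of simulated sample means per value of n


def tail_probability(n):
    """Empirical estimate of P(|M_n - mu| >= eps) from `repeats` sample means."""
    m_n = rng.random(size=(repeats, n)).mean(axis=1)
    return float(np.mean(np.abs(m_n - mu) >= eps))


results = {}
for n in (10, 50, 200, 500):
    # The bound sigma^2 / (n * eps^2) can exceed 1, so cap it at 1 for display
    bound = min(1.0, sigma2 / (n * eps**2))
    results[n] = (tail_probability(n), bound)
    print(f"n={n:4d}  empirical={results[n][0]:.4f}  bound={bound:.4f}")
```

The empirical frequency should sit below the bound for every `n`, and both shrink as `n` grows. The bound is usually far from tight, which is expected: it uses only the mean and the variance of the distribution.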

#### 2.3. Simulating the Law of Large Numbers

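The first simulation draws three independent sequences from a Binomial(10, 0.4) distribution and plots the running sample mean of each against the true expected value $np$:

```python
import numpy as np
from scipy.stats import binom
import matplotlib.pyplot as plt
import seaborn
seaborn.set()

n = 10
p = 0.4
sample_size = 15000
expected_value = n * p  # mean of Binomial(n, p)
# Prefix lengths at which the running sample mean is evaluated
N_samples = range(1, sample_size, 10)

binom_rv = binom(n=n, p=p)
# Draw three independent sequences and plot the running average of each
for k in range(3):
    X = binom_rv.rvs(size=sample_size)
    sample_average = [X[:i].mean() for i in N_samples]
    plt.plot(N_samples, sample_average,
             label='average of sample {}'.format(k))

# Dashed reference line at the true expected value n*p
plt.plot(N_samples, expected_value * np.ones_like(sample_average),
         ls='--', label='true expected value: np={}'.format(n * p), c='k')

plt.legend()
plt.show()
```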
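The running average in the simulation above recomputes `X[:i].mean()` from scratch for every prefix length. A cumulative sum gives the same values in a single pass. The sketch below is one possible alternative, not the original author's code; it uses NumPy's `default_rng` instead of `scipy.stats.binom`, and the seed is an assumption.

```python
import numpy as np

rng = np.random.default_rng(2)  # fixed seed (assumption)

# Same parameters as the simulation above: Binomial(10, 0.4), 15000 draws
n, p = 10, 0.4
sample_size = 15000
X = rng.binomial(n, p, size=sample_size)

# running_mean[k - 1] is the mean of the first k draws
running_mean = np.cumsum(X) / np.arange(1, sample_size + 1)

# Keep every 10th prefix length, matching range(1, sample_size, 10)
N_samples = np.arange(1, sample_size, 10)
sample_average = running_mean[N_samples - 1]

print("first values:", sample_average[:3])
print("last value:  ", sample_average[-1], " expected:", n * p)
```

`sample_average` can be passed to `plt.plot(N_samples, sample_average)` exactly as in the original loop.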

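The second simulation takes a normal population with mean 0 and standard deviation 20, then overlays the distribution of sample means for sample sizes 5 and 50. The larger the sample, the more tightly the sample means concentrate around the population mean:

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import norm
import seaborn
seaborn.set()

# Population: one million draws from a normal distribution with mean 0, std 20
norm_rvs = norm(loc=0, scale=20).rvs(size=1000000)
# density=True replaces the `normed` argument, which current Matplotlib no longer accepts
plt.hist(norm_rvs, density=True, alpha=0.3, color='b', bins=100, label='original')

# For each sample size, collect 10000 sample means in a fresh list
# (reusing one list would mix the size-5 means into the size-50 histogram)
for size, color in [(5, 'r'), (50, 'g')]:
    mean_array = []
    for i in range(10000):
        sample = np.random.choice(norm_rvs, size=size, replace=False)
        mean_array.append(np.mean(sample))
    plt.hist(mean_array, density=True, alpha=0.3, color=color, bins=100,
             label='sample size={}'.format(size))

plt.xlim(-60, 60)
plt.legend(loc='best')
plt.show()
```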