three-distributions
Prerequisites
The \(\Gamma\) function
\[ \Gamma(x) = \int_0^{\infty}e^{-t}t^{x-1}\mathrm{d}t \]
The \(\mathrm{B}\) (Beta) function
\[ \mathrm{B}(x, y) = \int_0^1 t^{x-1}(1-t)^{y-1}\mathrm{d}t \]
For \(x, y > 0\) we have \(\mathrm{B}(x, y) = \frac{\Gamma(x)\Gamma(y)}{\Gamma(x+y)}\).
Proof
Omitted.
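Since the proof is omitted, a quick numerical sanity check of the identity may be useful. The sketch below approximates the Beta integral with a midpoint rule and compares it with the Gamma-function expression. The helper names `beta_integral` and `beta_from_gamma` are made up for this illustration, and the check is only reliable for \(x, y \geq 1\), where the integrand has no singularity at the endpoints.

```python
import math


def beta_integral(x, y, steps=100_000):
    """Midpoint-rule approximation of the integral of t^(x-1) (1-t)^(y-1) on [0, 1]."""
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * h  # midpoint of the i-th subinterval
        total += t ** (x - 1) * (1 - t) ** (y - 1)
    return total * h


def beta_from_gamma(x, y):
    """Right-hand side of the identity: Gamma(x) Gamma(y) / Gamma(x + y)."""
    return math.gamma(x) * math.gamma(y) / math.gamma(x + y)


if __name__ == "__main__":
    # The two printed numbers should agree to many decimal places.
    print(beta_integral(2.5, 3.5), beta_from_gamma(2.5, 3.5))
```

This is a numerical check, not a proof: agreement at a few points only makes the identity plausible.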
Density of functions of random variables
Let \((X_1, X_2)\) have joint density \(f(x_1, x_2)\), and let \(Y_1, Y_2\) be functions of \((X_1, X_2)\): \[ Y_1 = g_1(X_1, X_2), \quad Y_2 = g_2(X_1, X_2) \] We want the joint density \(l(y_1, y_2)\) of \((Y_1, Y_2)\).
Suppose the transformation has an inverse: \[ X_1 = h_1(Y_1, Y_2), \quad X_2 = h_2(Y_1, Y_2) \] Its Jacobian determinant is: \[ J(y_1, y_2) = \begin{vmatrix} \frac{\partial h_1}{\partial y_1}& \frac{\partial h_1}{\partial y_2} \\ \frac{\partial h_2}{\partial y_1}& \frac{\partial h_2}{\partial y_2} \\ \end{vmatrix} \] By [1] p. 81, (4.11), we obtain \[ l(y_1, y_2) = f(h_1(y_1, y_2), h_2(y_1, y_2))|J(y_1, y_2)| \]
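Density of a sum of random variables
The derivation is omitted. The result used later is the convolution formula: if \(X_1, X_2\) are independent with densities \(f_1, f_2\), then \(Y = X_1 + X_2\) has density \[ l(y) = \int_{-\infty}^{\infty} f_1(x) f_2(y - x) \mathrm{d}x \]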
Density of a quotient of random variables
Let \((X_1, X_2)\) have joint density \(f(x_1, x_2)\). We want the density of \(Y = X_2/X_1\).
Set \(Y_1 = X_2/X_1, Y_2 = X_1\), and solve for \(X_1 = Y_2\) and \(X_2 = Y_1Y_2\).
The Jacobian determinant is: \[ J(y_1, y_2) = \begin{vmatrix} \frac{\partial h_1}{\partial y_1}& \frac{\partial h_1}{\partial y_2} \\ \frac{\partial h_2}{\partial y_1}& \frac{\partial h_2}{\partial y_2} \\ \end{vmatrix} = \begin{vmatrix} 0 & 1 \\ y_2 & y_1 \\ \end{vmatrix} = -y_2 \]
\[ l(y_1, y_2) = f(y_2, y_1y_2)|-y_2| = f(y_2, y_1y_2)|y_2| \]
\[ l(y) = l(y_1) = \int_{-\infty}^{\infty} l(y_1, y_2) \mathrm{d}y_2 = \int_{-\infty}^{\infty} f(y_2, y_1y_2)|y_2| \mathrm{d}y_2 = \int_{-\infty}^{\infty} f(x, yx)|x| \mathrm{d}x \]
If \(X_1, X_2\) are independent, then \(f(x_1, x_2) = f_1(x_1) f_2(x_2)\), and the formula becomes: \[ \int_{-\infty}^{\infty} f(x, yx)|x| \mathrm{d}x = \int_{-\infty}^{\infty} f_1(x) f_2(yx)|x| \mathrm{d}x \] Alternatively, set \(Y_1 = X_2/X_1, Y_2 = X_2\), and solve for \(X_1 = Y_2/Y_1\) and \(X_2 = Y_2\).
The Jacobian determinant is: \[ J(y_1, y_2) = \begin{vmatrix} \frac{\partial h_1}{\partial y_1}& \frac{\partial h_1}{\partial y_2} \\ \frac{\partial h_2}{\partial y_1}& \frac{\partial h_2}{\partial y_2} \\ \end{vmatrix} = \begin{vmatrix} -\frac{y_2}{y_1^2} & \frac{1}{y_1} \\ 0 & 1 \\ \end{vmatrix} = -\frac{y_2}{y_1^2} \]
\[ l(y_1, y_2) = f(y_2/y_1, y_2)|-\frac{y_2}{y_1^2}| = f(y_2/y_1, y_2)|\frac{y_2}{y_1^2}| \]
\[ l(y) = l(y_1) = \int_{-\infty}^{\infty} l(y_1, y_2) \mathrm{d}y_2 = \int_{-\infty}^{\infty} f(y_2/y_1, y_2)|\frac{y_2}{y_1^2}| \mathrm{d}y_2 = \int_{-\infty}^{\infty} f(x/y, x)|\frac{x}{y^2}| \mathrm{d}x \]
Reference [2] uses \(X,Y,Z\) to denote the random variables; the result here differs from the one on p. 79 of [2] only in notation.
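As an illustration of the quotient formula in the independent case, take \(X_1, X_2\) both standard normal. It is a standard fact that \(X_2/X_1\) is then standard Cauchy, with density \(\frac{1}{\pi(1+y^2)}\). The sketch below evaluates \(\int f_1(x) f_2(yx)|x| \mathrm{d}x\) numerically and compares it with that closed form. The function names, the truncation of the integral to \([-12, 12]\), and the step count are choices made for this example only.

```python
import math


def std_normal_pdf(x):
    """Density of N(0, 1)."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)


def quotient_density(f1, f2, y, half_width=12.0, steps=240_000):
    """Midpoint approximation of the integral of f1(x) f2(y x) |x|
    over [-half_width, half_width] (a truncation of the real line)."""
    h = 2 * half_width / steps
    total = 0.0
    for i in range(steps):
        x = -half_width + (i + 0.5) * h
        total += f1(x) * f2(y * x) * abs(x)
    return total * h


def cauchy_pdf(y):
    """Density of the standard Cauchy distribution."""
    return 1 / (math.pi * (1 + y * y))


if __name__ == "__main__":
    for y in (0.0, 0.5, -2.0):
        print(y, quotient_density(std_normal_pdf, std_normal_pdf, y), cauchy_pdf(y))
```

The truncation is harmless here because the normal density is negligible beyond a few standard deviations; for heavy-tailed inputs a wider interval would be needed.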
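The three distributions
The \(\chi^2\) distribution
The \(\chi^2\) distribution with \(n\) degrees of freedom is defined by the density: \[ k_n(x) = \begin{cases} \frac{1}{\Gamma(\frac{n}{2})2^{n/2}}e^{-x/2}x^{(n-2)/2}, & x > 0\\ 0, & x \leq 0 \end{cases} \] First we verify that this function is a valid density. It is nonnegative, and \[ \int_0^{\infty}e^{-x/2}x^{(n-2)/2}\mathrm{d}x \overset{x = 2t}{=} \int_0^{\infty}e^{-t}(2t)^{(n-2)/2}\mathrm{d}(2t) = 2^{n/2}\int_0^{\infty}e^{-t}t^{(n-2)/2}\mathrm{d}t = \Gamma(\frac{n}{2})2^{n/2} \] Multiplying by the leading coefficient \(\frac{1}{\Gamma(\frac{n}{2})2^{n/2}}\) gives 1.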
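The normalization can also be checked numerically. The following sketch integrates \(k_n\) with a midpoint rule on \([0, 100]\); the names `chi2_pdf` and `total_mass`, the upper limit, and the step count are assumptions of this example. The case \(n = 1\) is left out because \(k_1\) is unbounded near 0 and a plain midpoint rule converges slowly there.

```python
import math


def chi2_pdf(x, n):
    """The density k_n(x) defined above."""
    if x <= 0:
        return 0.0
    return math.exp(-x / 2) * x ** ((n - 2) / 2) / (math.gamma(n / 2) * 2 ** (n / 2))


def total_mass(n, upper=100.0, steps=200_000):
    """Midpoint approximation of the integral of k_n over [0, upper]."""
    h = upper / steps
    return h * sum(chi2_pdf((i + 0.5) * h, n) for i in range(steps))


if __name__ == "__main__":
    for n in (2, 3, 4, 7):
        print(n, total_mass(n))  # each value should be very close to 1
```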
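Theorem
If \(X_1,\cdots,X_n\) are independent and each follows the normal distribution \(N(0,1)\), then \(Y = X_1^2 + \cdots + X_n^2\) follows the \(\chi^2\) distribution with \(n\) degrees of freedom.
Equivalently, we must show that \(Y\) has density \(k_n(x)\). We proceed by induction on \(n\).
Base case \(n=1\):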
Let \(Y = X_1^2\). We claim that \(Y\) has density \(k_1(x)\). Using \(\Gamma(\frac{1}{2}) = \sqrt{\pi}\), \[ k_1(x) = \begin{cases} \frac{1}{\Gamma(\frac{1}{2})2^{1/2}}e^{-x/2}x^{(1-2)/2} = \frac{1}{\sqrt{2\pi x}}e^{-x/2}, & x > 0\\ 0, & x \leq 0 \end{cases} \] This is the density obtained in [1] p. 80, Example 4.5, so the base case holds.
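For readers without [1] at hand, the base case can be checked directly. For \(c > 0\), \(P(X_1^2 \leq c) = P(-\sqrt{c} \leq X_1 \leq \sqrt{c}) = \operatorname{erf}(\sqrt{c/2})\), and the derivative of this distribution function should equal \(k_1(c)\). The sketch below compares a central finite difference of the distribution function with \(k_1\); the function names and the step size `h` are choices made for this illustration.

```python
import math


def k1(x):
    """The density k_1(x) = exp(-x/2) / sqrt(2 pi x) for x > 0."""
    if x <= 0:
        return 0.0
    return math.exp(-x / 2) / math.sqrt(2 * math.pi * x)


def cdf_of_square(c):
    """P(X^2 <= c) for X ~ N(0, 1), written with the error function."""
    if c <= 0:
        return 0.0
    return math.erf(math.sqrt(c / 2))


def numerical_derivative(func, x, h=1e-5):
    """Central finite-difference approximation of func'(x)."""
    return (func(x + h) - func(x - h)) / (2 * h)


if __name__ == "__main__":
    for c in (0.3, 1.0, 4.0):
        print(c, numerical_derivative(cdf_of_square, c), k1(c))
```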
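Inductive step: assuming the claim holds for \(n-1\), we show it holds for \(n\).
Write \(Y = Z + X_n^2\), where \(Z = X_1^2 + \cdots + X_{n-1}^2\). By the induction hypothesis, \(Z\) has density \(k_{n-1}(x)\). We need to show that \(Y\) has density \(k_n(x)\).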
Since \(Z\) and \(X_n^2\) are independent, and \(X_n^2\) has density \(k_1\) by the base case, the formula for the density of a sum gives, for \(y > 0\): \[ l(y) = \int_{-\infty}^{\infty}k_{n-1}(x)k_1(y-x)\mathrm{d}x = \int_{0}^{y}k_{n-1}(x)k_1(y-x)\mathrm{d}x \] The second equality holds because \(k_{n-1}(x)\) is nonzero only for \(x > 0\) and \(k_1(y-x)\) is nonzero only for \(x < y\). \[ \begin{split} l(y) &= \int_{0}^{y} k_{n-1}(x)k_1(y-x) \mathrm{d}x \\ &= \int_{0}^{y} \frac{1}{\Gamma(\frac{n-1}{2})2^{(n-1)/2}}e^{-x/2}x^{(n-1-2)/2} \frac{1}{\Gamma(\frac{1}{2})2^{1/2}}e^{-(y-x)/2}(y-x)^{(1-2)/2} \mathrm{d}x \\ &= \frac{1}{\Gamma(\frac{n-1}{2})2^{(n-1)/2}\Gamma(\frac{1}{2})2^{1/2}} e^{-y/2} \int_{0}^{y} x^{(n-3)/2} (y-x)^{-1/2}\mathrm{d}x \end{split} \] Now evaluate the remaining integral: \[ \begin{split} \int_{0}^{y} x^{(n-3)/2} (y-x)^{-1/2}\mathrm{d}x &\overset{x = yt}{=} \int_{0}^{1} (yt)^{(n-3)/2} (y-(yt))^{-1/2}\mathrm{d}(yt) \\ &= y^{(n-3)/2} y^{-1/2} y\int_{0}^{1} t^{(n-3)/2} (1-t)^{-1/2}\mathrm{d}t \\ &= y^{(n-2)/2}\int_{0}^{1} t^{(n-3)/2} (1-t)^{-1/2}\mathrm{d}t \\ &= y^{(n-2)/2}\int_{0}^{1} t^{\frac{n-1}{2} - 1} (1-t)^{\frac{1}{2} - 1}\mathrm{d}t \\ &= y^{(n-2)/2} \mathrm{B}(\frac{n-1}{2}, \frac{1}{2}) \\ &= y^{(n-2)/2} \frac{\Gamma(\frac{n-1}{2})\Gamma(\frac{1}{2})}{\Gamma(\frac{n}{2})} \end{split} \] Substituting this back gives:
\[ \begin{split} l(y) &= \frac{1}{\Gamma(\frac{n-1}{2})2^{(n-1)/2}\Gamma(\frac{1}{2})2^{1/2}} e^{-y/2} \int_{0}^{y} x^{(n-3)/2} (y-x)^{-1/2}\mathrm{d}x \\ &= \frac{1}{\Gamma(\frac{n-1}{2})2^{(n-1)/2}\Gamma(\frac{1}{2})2^{1/2}} e^{-y/2} y^{(n-2)/2} \frac{\Gamma(\frac{n-1}{2})\Gamma(\frac{1}{2})}{\Gamma(\frac{n}{2})} \\ &= \frac{1}{\Gamma(\frac{n}{2})2^{n/2}} e^{-y/2} y^{(n-2)/2} \\ &= k_n(y) \end{split} \] This completes the proof.
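The theorem can be illustrated by simulation: draw many samples of \(X_1^2 + \cdots + X_n^2\) and compare the empirical value of \(P(Y \leq c)\) with the integral of \(k_n\) over \([0, c]\). This is a Monte Carlo sketch, not part of the proof; the function names, sample size, seed, and the midpoint integration are all choices made for this example.

```python
import math
import random


def chi2_pdf(x, n):
    """The density k_n(x) of the chi-square distribution with n degrees of freedom."""
    if x <= 0:
        return 0.0
    return math.exp(-x / 2) * x ** ((n - 2) / 2) / (math.gamma(n / 2) * 2 ** (n / 2))


def chi2_cdf(c, n, steps=20_000):
    """Midpoint approximation of the integral of k_n over [0, c] (intended for n >= 2)."""
    h = c / steps
    return h * sum(chi2_pdf((i + 0.5) * h, n) for i in range(steps))


def empirical_cdf(c, n, samples=100_000, seed=0):
    """Fraction of simulated sums of n squared N(0, 1) draws that are <= c."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    hits = 0
    for _ in range(samples):
        y = sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(n))
        if y <= c:
            hits += 1
    return hits / samples


if __name__ == "__main__":
    # The two numbers should agree to about two decimal places.
    print(chi2_cdf(3.0, 3), empirical_cdf(3.0, 3))
```

With 100,000 samples the standard error of the empirical frequency is on the order of 0.002, so agreement beyond two or three decimals should not be expected.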
The t distribution
Let \(X_1, X_2\) be independent with \(X_1 \sim \chi_n^2\) and \(X_2 \sim N(0,1)\), and let \(Y = \frac{X_2}{\sqrt{X_1/n}}\). We want the density of \(Y\).
First find the density \(g(z)\) of \(Z = \sqrt{X_1 /n }\). For \(z > 0\), \[ P(Z \leq z) = P(\sqrt{X_1 /n } \leq z) = P(X_1 \leq nz^2) = \int_0^{nz^2}k_n(x) \mathrm{d}x \] Differentiating with respect to \(z\) gives: \[ g(z) = 2nz\,k_n(nz^2) \] So the denominator \(Z\) and the numerator \(X_2\) have densities \(f_1\) and \(f_2\) respectively, where \(f_1(x_1) = 0\) for \(x_1 \leq 0\) and otherwise: \[ f_1(x_1) = 2nx_1k_n(nx_1^2), \quad f_2(x_2) = \frac{1}{\sqrt{2 \pi}}e^{-\frac{x_2^2}{2}} \] Applying the quotient density formula gives the density of \(Y\), denoted \(t_n(y)\): \[ t_n(y) = \int_{-\infty}^{\infty} f(x, yx)|x| \mathrm{d}x = \int_{-\infty}^{\infty} f_1(x) f_2(yx)|x| \mathrm{d}x \] Since \(f_1(x)\) is nonzero only for \(x > 0\): \[ \begin{split} t_n(y) & = \int_0^{\infty} {\color{red}2nxk_n(nx^2)} {\color{green} \frac{1}{\sqrt{2 \pi}}e^{-\frac{(yx)^2}{2}}} {\color{blue} x} \mathrm{d}x \\ & = \int_0^{\infty} {\color{red}2nx \frac{1}{\Gamma(\frac{n}{2})2^{\frac{n}{2} }}e^{-\frac{nx^2}{2}}(nx^2)^{\frac{(n-2)}{2} }} {\color{green} \frac{1}{\sqrt{2 \pi}}e^{-\frac{(yx)^2}{2}}} {\color{blue} x} \mathrm{d}x \\ & = \frac{1}{\Gamma(\frac{n}{2})2^{\frac{n}{2}}} \frac{1}{\sqrt{2 \pi}} \int_0^{\infty} {\color{red}2nx e^{-\frac{nx^2}{2}}(nx^2)^{\frac{(n-2)}{2} }} {\color{green} e^{-\frac{(yx)^2}{2}}} {\color{blue} x}\mathrm{d}x \\ & = \frac{1}{\Gamma(\frac{n}{2})2^{\frac{n}{2}}} \frac{1}{\sqrt{2 \pi}} 2n^{\frac{n}{2}} \int_0^{\infty} {\color{red}x e^{-\frac{nx^2}{2}}(x^2)^{\frac{(n-2)}{2} }} {\color{green} e^{-\frac{(yx)^2}{2}}} {\color{blue} x}\mathrm{d}x \\ & = \frac{1}{\Gamma(\frac{n}{2})2^{\frac{n}{2}}} \frac{1}{\sqrt{2 \pi}} 2n^{\frac{n}{2}} \int_0^{\infty} x^n {\color{red} e^{-\frac{nx^2}{2}}} {\color{green} e^{-\frac{(yx)^2}{2}}} \mathrm{d}x \\ & = \frac{1}{\Gamma(\frac{n}{2})2^{\frac{n}{2}}} \frac{1}{\sqrt{2 \pi}} 2n^{\frac{n}{2}} \int_0^{\infty} x^n { e^{-\frac{nx^2 + (xy)^2}{2}}} \mathrm{d}x \\ & = \frac{1}{\Gamma(\frac{n}{2})2^{\frac{n}{2}}} \frac{1}{\sqrt{2 \pi}} 2n^{\frac{n}{2}} \int_0^{\infty} x^n { e^{-\frac{1}{2}(n + y^2)x^2}} \mathrm{d}x \end{split} \] Make the change of variable: \[ x = (\frac{2}{n+y^2})^{\frac{1}{2}}t^{\frac{1}{2}} \] Substituting into the integral gives: \[ \begin{split} \int_0^{\infty} {\color{red} x^n} {\color{green} e^{-\frac{1}{2}(n + y^2)x^2}} \mathrm{d}x &= \int_0^{\infty} {\color{red} (\frac{2}{n+y^2})^{\frac{n}{2}}t^{\frac{n}{2}}} {\color{green} \exp{(-\frac{1}{2}(n + y^2) (\frac{2}{n+y^2})t)}} \mathrm{d} \left[ (\frac{2}{n+y^2})^{\frac{1}{2}}t^{\frac{1}{2}} \right] \\ &= (\frac{2}{n+y^2})^{\frac{n+1}{2}} \int_0^{\infty} {\color{red} t^{\frac{n}{2}}} {\color{green} e^{-t}} \mathrm{d}(t^{\frac{1}{2}}) \\ &= \frac{1}{2} (\frac{2}{n+y^2})^{\frac{n+1}{2}} \int_0^{\infty} {\color{red} t^{\frac{n-1}{2}}} {\color{green} e^{-t}} \mathrm{d}t \\ &= \frac{1}{2} (\frac{2}{n+y^2})^{\frac{n+1}{2}} \Gamma{(\frac{n+1}{2})} \end{split} \] Substituting back into the expression for \(t_n(y)\): \[ \begin{split} t_n(y) & = \frac{1}{\Gamma(\frac{n}{2})2^{\frac{n}{2}}} \frac{1}{\sqrt{2 \pi}} 2n^{\frac{n}{2}} \int_0^{\infty} x^n { e^{-\frac{1}{2}(n + y^2)x^2}} \mathrm{d}x \\ & = \frac{1}{\Gamma(\frac{n}{2})2^{\frac{n}{2}}} \frac{1}{\sqrt{2 \pi}} 2n^{\frac{n}{2}} {\color{red} \frac{1}{2} (\frac{2}{n+y^2})^{\frac{n+1}{2}} \Gamma{(\frac{n+1}{2})} } \\ &= \frac{\Gamma{(\frac{n+1}{2})}}{\Gamma(\frac{n}{2})(n\pi)^{\frac{1}{2}}}(1 + \frac{y^2}{n})^{-\frac{n+1}{2}} \end{split} \]
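The closed form for \(t_n(y)\) can be cross-checked against the quotient integral it was derived from. The sketch below evaluates \(\int_0^{\infty} 2nx\,k_n(nx^2)\, f_2(yx)\, x\, \mathrm{d}x\) with a midpoint rule and compares it with the final expression. The function names, the truncation of the integral at 12, and the step count are assumptions of this example.

```python
import math


def chi2_pdf(x, n):
    """The density k_n(x) of the chi-square distribution with n degrees of freedom."""
    if x <= 0:
        return 0.0
    return math.exp(-x / 2) * x ** ((n - 2) / 2) / (math.gamma(n / 2) * 2 ** (n / 2))


def std_normal_pdf(x):
    """Density of N(0, 1)."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)


def t_density_by_integral(y, n, upper=12.0, steps=120_000):
    """Midpoint approximation of the quotient integral over (0, upper),
    with f1(x) = 2 n x k_n(n x^2) and f2 the standard normal density."""
    h = upper / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        f1 = 2 * n * x * chi2_pdf(n * x * x, n)
        total += f1 * std_normal_pdf(y * x) * x
    return total * h


def t_density_closed_form(y, n):
    """The final expression for t_n(y) derived above."""
    coef = math.gamma((n + 1) / 2) / (math.gamma(n / 2) * math.sqrt(n * math.pi))
    return coef * (1 + y * y / n) ** (-(n + 1) / 2)


if __name__ == "__main__":
    for y in (0.0, 0.7, -1.5):
        print(y, t_density_by_integral(y, 5), t_density_closed_form(y, 5))
```

For \(n = 1\) the closed form reduces to \(\frac{1}{\pi(1+y^2)}\), the standard Cauchy density.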
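The F distribution
Let \(X_1, X_2\) be independent with \(X_1 \sim \chi_n^2\) and \(X_2 \sim \chi_m^2\), and let \(Y = \frac{X_2 m^{-1}}{X_1 n^{-1}}\). We want the density of \(Y\).
Note the independence condition involved: the quotient formula for independent variables applies because \(n^{-1}X_1\) and \(m^{-1}X_2\) are independent, which follows from the independence of \(X_1\) and \(X_2\).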
First find the density of \(Z = n^{-1}X_1\). For \(z > 0\), \[ P(Z \leq z) = P(n^{-1}X_1 \leq z ) = P(X_1 \leq nz) = \int_{0}^{nz}k_n(x) \mathrm{d}x \] Differentiating with respect to \(z\) gives the density of \(Z\): \[ g(z) = nk_n(nz) \] Hence \(n^{-1}X_1\) and \(m^{-1} X_2\) have densities \(nk_n(nx_1)\) and \(mk_m(mx_2)\) respectively. Applying the quotient density formula, we obtain the density of \(Y\), denoted \(f_{mn}(y)\); \(m\) comes first because it is the number of degrees of freedom of the numerator \(X_2\). For \(y > 0\): \[ \begin{split} f_{mn}(y) &= \int_{-\infty}^{\infty} f_1(x) f_2(yx)|x| \mathrm{d}x \\ &= \int_{0}^{\infty} f_1(x) f_2(yx)x \mathrm{d}x \\ &= \int_{0}^{\infty} {\color{red} nk_n(nx)} {\color{green} mk_m(mxy)} x\mathrm{d}x \\ &= mn\frac{1}{\Gamma(\frac{m}{2})2^{\frac{m}{2}} \Gamma(\frac{n}{2})2^{\frac{n}{2}}}\int_{0}^{\infty} {\color{red} e^{-\frac{nx}{2}} (nx)^{\frac{n-2}{2}}} {\color{green} e^{-\frac{mxy}{2}} (mxy)^{\frac{m-2}{2}}} x\mathrm{d}x \\ &= mn\frac{1}{\Gamma(\frac{m}{2})2^{\frac{m}{2}} \Gamma(\frac{n}{2})2^{\frac{n}{2}}} n^{\frac{n-2}{2}} (my)^{\frac{m-2}{2}} \int_{0}^{\infty} {\color{red} e^{-\frac{nx}{2}} x^{\frac{n-2}{2}}} {\color{green} e^{-\frac{mxy}{2}} x^{\frac{m-2}{2}}} x\mathrm{d}x \\ &= \frac{1}{\Gamma(\frac{m}{2})2^{\frac{m}{2}} \Gamma(\frac{n}{2})2^{\frac{n}{2}}} n^{\frac{n}{2}} m^{\frac{m}{2}} y^{\frac{m}{2}-1} \int_{0}^{\infty} {\color{red} e^{-\frac{nx}{2}} x^{\frac{n-2}{2}}} {\color{green} e^{-\frac{mxy}{2}} x^{\frac{m-2}{2}}} x\mathrm{d}x \\ &= \frac{1}{\Gamma(\frac{m}{2})2^{\frac{m}{2}} \Gamma(\frac{n}{2})2^{\frac{n}{2}}} n^{\frac{n}{2}} m^{\frac{m}{2}} y^{\frac{m}{2}-1} \int_{0}^{\infty} { e^{-\frac{(n + my)x}{2}} x^{\frac{m+n}{2}-1}} \mathrm{d}x \end{split} \] Make the change of variable: \[ t = \frac{(my+n)x}{2}, \quad x = \frac{2t}{my + n} \]
\[ \begin{split} \int_{0}^{\infty} { e^{-\frac{(n + my)x}{2}} x^{\frac{m+n}{2}-1}} \mathrm{d}x &= \int_{0}^{\infty} e^{- {\color{red} t}} {\color{red} (\frac{2t}{my + n})}^{\frac{m+n}{2}-1} \mathrm{d} {\color{red} (\frac{2t}{my + n})}\\ &= {\color{red} (\frac{2}{my + n})}^{\frac{m+n}{2}} \int_{0}^{\infty} e^{- {\color{red} t}} t^{\frac{m+n}{2}-1} \mathrm{d} t\\ &= {\color{red} (\frac{2}{my + n})}^{\frac{m+n}{2}} \Gamma(\frac{m+n}{2}) \end{split} \]
Substituting this into the expression for \(f_{mn}(y)\) gives: \[ \begin{split} f_{mn}(y) &= \frac{1}{\Gamma(\frac{m}{2})2^{\frac{m}{2}} \Gamma(\frac{n}{2})2^{\frac{n}{2}}} n^{\frac{n}{2}} m^{\frac{m}{2}} y^{\frac{m}{2}-1} \int_{0}^{\infty} { e^{-\frac{(n + my)x}{2}} x^{\frac{m+n}{2}-1}} \mathrm{d}x \\ &= \frac{1}{\Gamma(\frac{m}{2})2^{\frac{m}{2}} \Gamma(\frac{n}{2})2^{\frac{n}{2}}} n^{\frac{n}{2}} m^{\frac{m}{2}} y^{\frac{m}{2}-1} {\color{red} (\frac{2}{my + n})^{\frac{m+n}{2}} \Gamma(\frac{m+n}{2})} \\ &= \frac{\Gamma(\frac{m+n}{2})}{\Gamma(\frac{m}{2}) \Gamma(\frac{n}{2})} n^{\frac{n}{2}} m^{\frac{m}{2}} y^{\frac{m}{2}-1} {\color{red} (\frac{1}{my + n})^{\frac{m+n}{2}} } \end{split} \] This holds for \(y > 0\); for \(y \leq 0\), \(f_{mn}(y) = 0\).
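As with the t distribution, the closed form for \(f_{mn}(y)\) can be compared numerically with the quotient integral \(\int_0^{\infty} nk_n(nx)\, mk_m(mxy)\, x\, \mathrm{d}x\). The sketch below does this with a midpoint rule; the function names, the truncation at 60, and the step count are choices made for this illustration.

```python
import math


def chi2_pdf(x, n):
    """The density k_n(x) of the chi-square distribution with n degrees of freedom."""
    if x <= 0:
        return 0.0
    return math.exp(-x / 2) * x ** ((n - 2) / 2) / (math.gamma(n / 2) * 2 ** (n / 2))


def f_density_by_integral(y, m, n, upper=60.0, steps=200_000):
    """Midpoint approximation of the integral of n k_n(n x) * m k_m(m x y) * x
    over (0, upper); m belongs to the numerator, n to the denominator."""
    h = upper / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        total += n * chi2_pdf(n * x, n) * m * chi2_pdf(m * x * y, m) * x
    return total * h


def f_density_closed_form(y, m, n):
    """The final expression for f_mn(y) derived above (0 for y <= 0)."""
    if y <= 0:
        return 0.0
    coef = math.gamma((m + n) / 2) / (math.gamma(m / 2) * math.gamma(n / 2))
    return coef * n ** (n / 2) * m ** (m / 2) * y ** (m / 2 - 1) * (m * y + n) ** (-(m + n) / 2)


if __name__ == "__main__":
    for y in (0.5, 1.3):
        print(y, f_density_by_integral(y, 4, 6), f_density_closed_form(y, 4, 6))
```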
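References
[1] 陈希孺 (Chen Xiru), 概率论与数理统计 (Probability Theory and Mathematical Statistics), 中国科学技术大学出版社 (University of Science and Technology of China Press), Section 2.4.
[2] 概率论与数理统计 (Probability Theory and Mathematical Statistics), 浙大 (Zhejiang University), 4th edition.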