
Exercise 4.6
We consider the GLM
$$Y_i \stackrel{\text{ind}}{\sim} \operatorname{Poisson}\left(\mu_i\right),$$
where
$$g\left(\mu_i\right)=g\left(\mathbb{E}\left(Y_i\right)\right)=\beta_0+\beta_1 x_i$$
$\mu_i=\mathbb{E} Y_i$, with $g(\cdot)=\log (\cdot)$, i.e. the canonical link, for $i \in\left[n_A+n_B\right]$, and $x_i=0$ for $i=1, \ldots, n_A$ and $x_i=1$ for $i=n_A+1, \ldots, n_A+n_B$. We shall show that the likelihood equations (that give us the maximum likelihood estimators for $\beta_0$ and $\beta_1$ ) imply that
$$\begin{aligned} \hat{\mu}_i &= \exp\left(\hat{\beta}_0+\hat{\beta}_1 x_i\right)=\bar{Y}_A:=\frac{1}{n_A} \sum_{i=1}^{n_A} Y_i, \quad \text{for } i \leq n_A, \text{ and} \\ \hat{\mu}_i &= \exp\left(\hat{\beta}_0+\hat{\beta}_1 x_i\right)=\bar{Y}_B:=\frac{1}{n_B} \sum_{i=n_A+1}^{n_A+n_B} Y_i, \quad \text{for } i>n_A. \end{aligned}$$
Recall the likelihood equations (4.10), which say that the MLEs $\hat{\beta}$ of $\beta$ satisfy
$$\begin{aligned} \frac{\partial \ell}{\partial \beta_j}(\hat{\beta}) &= \sum_{i=1}^n \frac{Y_i-\mu_i(\hat{\beta})}{\operatorname{Var}\left(Y_i\right)} \frac{\partial \mu_i}{\partial \eta_i}(\hat{\beta})\, x_{i, j} \\ &= 0, \end{aligned}$$
for $j=0,1$. Note that, in the notation of the book, $x_{i, 0}=1$ and $x_{i, 1}=x_i$, where $x_i$ is as above.
We have previously seen that, for the Poisson distribution, $\mathbb{E}\left(Y_i\right)=\operatorname{Var}\left(Y_i\right)=\mu_i$. Furthermore, since $g(\cdot)=\log (\cdot)$, we have $\mu_i=g^{-1}\left(\eta_i\right)=\exp \left(\eta_i\right)$, and hence $\frac{\partial \mu_i}{\partial \eta_i}=\exp \left(\eta_i\right)=\mu_i$. Inserting this into the likelihood equations, we get that the fitted values $\hat{\mu}_i=\mu_i(\hat{\beta})$ satisfy
$$\begin{aligned} \frac{\partial \ell}{\partial \beta_0}(\hat{\beta}) &= \sum_{i=1}^{n_A+n_B} \frac{Y_i-\hat{\mu}_i}{\operatorname{Var}\left(Y_i\right)} \frac{\partial \mu_i}{\partial \eta_i}(\hat{\beta}) \\ &= \sum_{i=1}^{n_A+n_B} \left(Y_i-\hat{\mu}_i\right) \\ &= 0. \end{aligned}$$
Hence, since the fitted values must be the same for all $i \leq n_A$ and for all $i>n_A$ (the values of the $x_i$'s are the same within each group), we obtain
$$n_A \hat{\mu}_A+n_B \hat{\mu}_B=n_A \bar{Y}_A+n_B \bar{Y}_B,$$
where $\hat{\mu}_A$ is the common value of $\hat{\mu}_i$ for $i \leq n_A$, $\hat{\mu}_B$ is the common value of $\hat{\mu}_i$ for $i>n_A$, $\bar{Y}_A:=\frac{1}{n_A} \sum_{i=1}^{n_A} Y_i$, and $\bar{Y}_B:=\frac{1}{n_B} \sum_{i=n_A+1}^{n_A+n_B} Y_i$.
Similarly, we obtain
$$\begin{aligned} \frac{\partial \ell}{\partial \beta_1}(\hat{\beta}) &= \sum_{i=1}^{n_A+n_B} \frac{Y_i-\hat{\mu}_i}{\operatorname{Var}\left(Y_i\right)} \frac{\partial \mu_i}{\partial \eta_i}(\hat{\beta})\, x_i \\ &= \sum_{i=1}^{n_A+n_B}\left(Y_i-\hat{\mu}_i\right) x_i \\ &= \sum_{i=n_A+1}^{n_A+n_B}\left(Y_i-\hat{\mu}_i\right) \\ &= 0, \end{aligned}$$
where the last equality before setting the derivative to zero uses that $x_i=0$ for $i \leq n_A$ and $x_i=1$ for $i>n_A$. This equation implies that $\hat{\mu}_B=\bar{Y}_B$. Combining this with the previous equation, we obtain that $\hat{\mu}_A=\bar{Y}_A$ as well.
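As a quick numerical sanity check, the closed-form solution can be verified directly against the likelihood equations. The following is a minimal sketch; the data and group sizes are invented for illustration:

```python
import numpy as np

# Invented two-group Poisson counts (group A has x_i = 0, group B has x_i = 1).
y_A = np.array([2, 4, 3, 1, 5])
y_B = np.array([7, 9, 8, 6, 10, 8, 9])
n_A, n_B = len(y_A), len(y_B)

# The derivation gives closed-form MLEs: exp(b0) = mean(y_A), exp(b0 + b1) = mean(y_B).
b0 = np.log(y_A.mean())
b1 = np.log(y_B.mean()) - b0

# Fitted means mu_i = exp(b0 + b1 * x_i) are constant within each group.
mu_A, mu_B = np.exp(b0), np.exp(b0 + b1)

# Both likelihood equations sum_i (y_i - mu_i) * x_{i,j} should be (numerically) zero.
y = np.concatenate([y_A, y_B])
mu = np.concatenate([np.full(n_A, mu_A), np.full(n_B, mu_B)])
x = np.concatenate([np.zeros(n_A), np.ones(n_B)])
score0 = np.sum(y - mu)        # d ell / d beta_0 at the MLE
score1 = np.sum((y - mu) * x)  # d ell / d beta_1 at the MLE
```

Both score components vanish, and the fitted means coincide with the group sample means, as derived above.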

Exercise 4.9
Recall that, when we fit a GLM via maximum likelihood estimation,
$$\hat{\beta} \stackrel{d}{\approx} \mathrm{N}\left(\beta,\left(X^{\mathrm{\top}} W X\right)^{-1}\right)$$
where $\stackrel{d}{\approx}$ means “approximately distributed as”, $\beta$ is the true underlying regression coefficient, $X$ is the model matrix, and $W$ is a diagonal matrix with the $(i, i)$-th element given by $W_{i, i}=\left(\frac{\partial \mu_i}{\partial \eta_i}\right)^2 / \operatorname{Var}\left(Y_i\right)$. This follows from general Maximum Likelihood theory, which we will discuss some more later in the solutions (for those who are curious).
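To make the role of $W$ concrete, here is a small sketch computing the approximate covariance matrix $\left(X^{\top} W X\right)^{-1}$ for a Poisson log-link model at a given coefficient vector; the design matrix and coefficients are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 40
# Invented design: intercept column plus one continuous covariate.
X = np.column_stack([np.ones(n), rng.normal(size=n)])
beta = np.array([0.5, 1.0])       # illustrative coefficient vector

eta = X @ beta                    # linear predictor eta_i = x_i^T beta
mu = np.exp(eta)                  # inverse log link: mu_i = exp(eta_i)

# Poisson with canonical link: Var(Y_i) = mu_i and d mu_i / d eta_i = mu_i,
# so W_ii = (d mu_i / d eta_i)^2 / Var(Y_i) = mu_i.
W = np.diag(mu)

cov = np.linalg.inv(X.T @ W @ X)  # approximate covariance of beta-hat
se = np.sqrt(np.diag(cov))        # approximate standard errors
```

In practice $W$ is evaluated at the fitted $\hat{\beta}$; the square roots of the diagonal of the resulting matrix are the standard errors reported by GLM software.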

Suppose that our GLM is a Normal Linear Model, i.e., $Y_i \stackrel{\text { ind }}{\sim} \mathrm{N}\left(\mu_i, \sigma^2\right), \mu_i=\eta_i=x_i^{\mathrm{T}} \beta$, for $i \in[n]$, and $\varphi=\sigma^2$. We shall find the asymptotic variance $\left(X^{\mathrm{T}} W X\right)^{-1}$ in this particular case.

Since $\mu_i=\eta_i$, we have that $\frac{\partial \mu_i}{\partial \eta_i}=1$. Also, $\operatorname{Var}\left(Y_i\right)=\sigma^2(=\varphi)$, and so we get that $W=\frac{1}{\sigma^2} I_{n \times n}$. Hence,
$$\left(X^{\top} W X\right)^{-1}=\sigma^2\left(X^{\top} X\right)^{-1}$$
We have previously shown that the MLE $\hat{\beta}$ of $\beta$ for the Normal Linear Model is equal to the least squares estimate of $\beta$, and that the variance of the least squares fit is $\sigma^2\left(X^{\mathrm{T}} X\right)^{-1}$. Thus, our finding above is in agreement with what we have shown earlier.
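The identity $\left(X^{\top} W X\right)^{-1}=\sigma^2\left(X^{\top} X\right)^{-1}$ is also easy to confirm numerically; a minimal sketch with an invented design matrix and an illustrative value of $\sigma^2$:

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 30, 3
sigma2 = 2.5                                  # illustrative value of sigma^2
# Invented design: intercept plus two continuous covariates.
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])

# Identity link: d mu_i / d eta_i = 1, and Var(Y_i) = sigma^2,
# so W = (1 / sigma^2) I_n.
W = np.eye(n) / sigma2

cov_glm = np.linalg.inv(X.T @ W @ X)          # general GLM formula
cov_ols = sigma2 * np.linalg.inv(X.T @ X)     # least-squares covariance
```

The two matrices agree exactly, matching the algebraic argument above.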
