Multivariate Statistical Analysis

created : Fri, 24 Sep 2021 12:38:16 +0900
modified : Sun, 12 Jan 2025 23:33:56 +0900

Chapter 0. Introduction

0.1 Visualization of Multivariate Data

head(USArrests)
library(psych) # Scatter Plot Matrix
pairs.panels(USArrests)

# Chernoff's Faces
library(aplpack)
faces(USArrests, face.type=1, cex=0.5)

# Star plot
stars(USArrests)

# 3-D scatter plot
library(scatterplot3d)
scatterplot3d(USArrests[, -1], type="h", highlight.3d=TRUE, angle=55, scale.y=0.7, pch=16, main="USArrests")

# 3-D rotated plot
library(rgl)
plot3d(USArrests[,-3])

# Profile plot
library(MASS)
parcoord(USArrests, col=c(1+(1:50)), var.label=T)

# Growth curves for longitudinal data
library(nlme)
head(Orthodont)
library(ggplot2)
p <- ggplot(data = Orthodont, aes(x = age, y = distance, group = Subject, colour=Subject))
p + geom_line()

p + geom_line() + facet_grid(. ~ Sex)

Summary of Introduction

Chapter 1. Linear algebra

1.1 Scalars, vectors, matrices

1.2 Operations of matrices

1.3 Trace and determinant for square matrices

1.4 Rank of a matrix

1.5 Inverse matrix

1.6 Partitioned matrices

Example 1.6.2

1.7 Positive definite matrix

1.8 Orthogonal vectors and matrices

1.9 Eigenvalues and eigenvectors

1.10 Spectral decomposition

1.11 Cauchy-Schwarz inequality

1.12 Differentiation in Vectors and Matrices

1.13 Some useful quantities

1.14 Random vectors and matrices

1.14.1 Parameter vectors and matrices

Chapter 2. Multivariate Normal Distribution

2.1 Definitions

2.2 Properties of multivariate normal distribution

2.3 Estimation for sampling from a multivariate normal distribution

2.3.1 Likelihood function of a sample from a multivariate normal distribution

2.3.2 Maximum likelihood estimations (MLEs) from a multivariate normal distribution

2.4 Sampling distributions of $\bar X$ and $S$

2.5 Definition and Properties of the Wishart Distribution

2.6 Large sample distributions for $\bar X$ and $S$

2.7 Assessing the assumption of multivariate normality


2.8 Transformations to near normality


  1. Theoretical transformations
  2. Power transformations: When all observations are nonnegative, we may consider a family of power transformations. If some measurements are negative, we first add a constant to all measurements and then apply a power transformation.
  3. Box-Cox transformations: The Box-Cox transformation family is similar to the power transformation family; it connects continuously to the logarithmic transform as the power $\lambda$ approaches zero.
  4. Note that we should not expect that a transformation can always make the data close to normality.

Chapter 3. Hypothesis tests

  1. The use of $p$ univariate tests inflates the Type I error rate, $\alpha$, whereas the multivariate test preserves the exact $\alpha$ level.
  2. The univariate tests completely ignore the correlations among the variables, whereas the multivariate tests make direct use of the correlations.
  3. The multivariate tests are more powerful than univariate tests in many cases.

3.1 Review of hypothesis tests for a univariate normal mean

3.1.1 When $\sigma^2$ is known

3.1.2 When $\sigma^2$ is unknown

3.2 Hypothesis test on one sample multivariate normal mean vector

3.2.1 When the covariance matrix $\Sigma$ is known

3.2.2 Hotelling’s $T^2$ Statistic: when $\Sigma$ is unknown

  1. Note $T^2 = Z'(\frac{W}{v})^{-1}Z \sim \frac{vp}{v + 1 - p}F_{p, v + 1 - p}$, where $Z \sim N_p(0, \Sigma)$ and $W \sim Wishart(p, v, \Sigma)$ are independent.
  2. Note that $pF_{p, n-p} \rightarrow \chi_p^2$ as $n \rightarrow \infty$, so that $T^2 \sim \chi_p^2$ for a large sample under $H_0$.
  3. The $T^2$ statistic is invariant under linear transformations; that is, Hotelling's $T^2$ statistic does not depend on the measurement units.

3.3 Hotelling’s $T^2$ and likelihood ratio tests

3.4 Confidence regions and multiple testing

3.4.1 Simultaneous confidence intervals


Chapter 0. Introduction

0.1. Visualization of Multivariate Data

  1. Scatter Plot Matrix
    library(psych)
    pairs.panels(USArrests)
    
  2. Chernoff’s Faces
    library(aplpack)
    faces(USArrests, face.type=1, cex=0.5)
    
  3. Star plot
    stars(USArrests)
    
  4. 3-D scatter plot
    library(scatterplot3d)
    scatterplot3d(USArrests[,-3], type="h", highlight.3d=TRUE,
    			  angle=55, scale.y=0.7, pch=16, main="USArrests")
    
  5. 3-D rotated plot
    library(rgl)
    plot3d(USArrests[, -3])
    
  6. Profile plot
    library(MASS)
    parcoord(USArrests, col=c(1+(1:50)), var.label=T)
    
  7. Growth curves for longitudinal data
    library(nlme)
    library(ggplot2)
    p <- ggplot(data = Orthodont, aes(x = age, y = distance, group = Subject, colour = Subject))
    p + geom_line()
    p + geom_line() + facet_grid(.~Sex)
    

Summary of Introduction

Chapter 1. Linear algebra

1.1 Scalars, vectors, matrices

1.2 Operations of matrices

1.3 Trace and determinant for square matrices

1.4 Rank of a matrix

1.5 Inverse matrix

1.6 Partitioned matrices

1.8 Orthogonal vectors and matrices

1.9 Eigenvalues and eigenvectors

1.10 Spectral decomposition

1.11 Cauchy-Schwarz inequality

1.12 Differentiation in Vectors and Matrices

1.13 Some useful quantities

1.14 Random vectors and matrices

1.14.1 Parameter vectors and matrices

1.14.2 Numerical summarization of multivariate data

  1. Sample mean vector: The sample mean vector is defined by $$ \begin{aligned} \bar X &= \frac{1}{n}X^T 1_{n \times 1} \\ & = \frac{1}{n} \sum_{i=1}^n x_i = \frac{1}{n}\left(\sum_{i=1}^n x_{i1}, \cdots, \sum_{i=1}^n x_{ip}\right)^T \end{aligned}$$

  2. Sample variance-covariance matrix and sample correlation matrix: The sample covariance matrix is defined by $$ S = \frac{1}{n - 1} \sum_{i=1}^n (X_i - \bar X)(X_i - \bar X)^T, $$ which is an unbiased estimate of the population covariance matrix. Another (biased) estimate of the covariance matrix is $$ \hat \Sigma = \frac{1}{n} \sum_{i=1}^n (X_i - \bar X)(X_i - \bar X)^T, $$ which is the maximum likelihood estimate under the normality assumption. The sample correlation matrix is defined by $$ R = \begin{pmatrix} 1 & \cdots & \frac{\sum_i (x_{i1} - \bar x_1)(x_{ip} - \bar x_p)}{\sqrt{\sum_i (x_{i1} - \bar x_1)^2 \sum_i (x_{ip} - \bar x_p)^2}} \\ \vdots & \ddots & \vdots \\ \frac{\sum_i (x_{ip} - \bar x_p)(x_{i1} - \bar x_1)}{\sqrt{\sum_i (x_{ip} - \bar x_p)^2 \sum_i (x_{i1} - \bar x_1)^2}} & \cdots & 1 \end{pmatrix} = \begin{pmatrix} 1 & \cdots & r_{1p} \\ \vdots & \ddots & \vdots \\ r_{p1} & \cdots & 1 \end{pmatrix}. $$ In matrix form, $R = D^{-1/2} S D^{-1/2}$, where $D = diag(S)$. (A short R sketch of these summaries is given below.)
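A minimal R sketch of these numerical summaries, using the built-in USArrests data; the last line checks the relation $R = D^{-1/2} S D^{-1/2}$:

    xbar <- colMeans(USArrests)          # sample mean vector
    S <- cov(USArrests)                  # unbiased sample covariance (divisor n - 1)
    n <- nrow(USArrests)
    Sigma.hat <- (n - 1) / n * S         # biased (maximum likelihood) estimate
    R <- cor(USArrests)                  # sample correlation matrix
    D.inv.sqrt <- diag(1 / sqrt(diag(S)))
    all.equal(unname(R), unname(D.inv.sqrt %*% S %*% D.inv.sqrt))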

Chapter 2. Multivariate Normal Distribution

    • A generalization of the well-known mound-shaped normal density to multiple dimensions plays a fundamental role in multivariate analysis. Most of the techniques in this multivariate statistics course are based on the assumption that the data were randomly drawn from a multivariate normal distribution. While real data are never exactly multivariate normal, the normal density is often a useful approximation to the “true” (but unknown) population distribution. One advantage of the multivariate normal distribution stems from the fact that it is mathematically tractable and “beautiful” results can be obtained. This is frequently not the case for other distributions. Of course, mathematical attractiveness is of little use to the practitioner. It turns out, however, that normal distributions are useful in practice for two reasons:
      1. the normal distribution serves as a bona fide population model in some instances;
      2. the sampling distributions of many multivariate statistics are approximately normal, regardless of the form of the population, because of the (multivariate) central limit theorem for large samples. To summarize, many real-world problems fall naturally within the framework of normal theory, or we may transform the data to satisfy (approximate) normality. The importance of the normal distribution rests on its dual role as both a population model for certain natural phenomena and an approximate sampling distribution for many statistics.

2.1 Definitions

2.2 Properties of multivariate normal distribution

2.3 Estimation for sampling from a multivariate normal distribution

2.3.1 Likelihood function of a sample from a multivariate normal distribution

2.3.2 Maximum likelihood estimations (MLEs) from a multivariate normal distribution

2.4 Sampling distributions of $\bar X$ and $S$

2.5 Definition and Properties of the Wishart Distribution

2.6 Large sample distributions for $\bar X$ and $S$

2.7 Assessing the assumption of multivariate normality

  1. Marginal normality check for each variable (we use the univariate methods with each variable).
  2. Chi-square plot: Use $(X - \mu)^T \Sigma^{-1} (X - \mu) \sim \chi_m^2$ if $X \sim N_m(\mu, \Sigma)$.
    1. Calculate $d_j^2 = (X_j - \bar X)^T S^{-1}(X_j - \bar X)$.
    2. Rearrange $d_j^2$ in ascending order: $d_{(1)}^2 \le d_{(2)}^2 \le \cdots \le d_{(n)}^2$.
    3. Find $q_j$ such that $P(\chi_m^2 \le q_j) = \frac{j - \frac{1}{2}}{n}$.
    4. Plot $(q_j, d_{(j)}^2)$
    5. Check whether the points are approximately on a straight line.
  3. Formal hypothesis test:
    • Mardia’s test is based on multivariate extensions of skewness and kurtosis measures: $$MS = \frac{1}{6n} \sum_{i=1}^n \sum_{j=1}^n [ (x_i - \bar x)^T \hat \Sigma^{-1} (x_j - \bar x)]^3$$ $$MK = \sqrt{\frac{n}{8m(m+2)}} \left\{\frac{1}{n} \sum_{i=1}^n [(x_i - \bar x)^T \hat \Sigma^{-1} (x_i - \bar x)]^2 - m(m+2)\right\}$$ Under the null hypothesis of multivariate normality, the statistic MS has approximately a chi-squared distribution with $\frac{1}{6}m(m+1)(m+2)$ degrees of freedom, and MK is approximately standard normal, $N(0,1)$.
    • Henze-Zirkler’s test is based on the empirical characteristic function: $$HZ_\beta = \frac{1}{n^2} \sum_{i=1}^n \sum_{j=1}^n e^{-\frac{\beta^2}{2}(x_i -x_j)^T \hat \Sigma^{-1} (x_i - x_j)} - \frac{2}{n(1 + \beta^2)^{m/2}} \sum_{i=1}^n e^{- \frac{\beta^2}{2(1 + \beta^2)} (x_i - \bar x)^T \hat \Sigma^{-1}(x_i - \bar x)} + \frac{1}{(1 + 2 \beta^2)^{m/2}}$$ where $\beta = \frac{1}{\sqrt{2}} [\frac{(2m+1)n}{4}]^{1 /(m+4)}$ is a common choice. The HZ test rejects normality if $HZ_\beta$ is too large.
    • There are many other tests, such as Royston’s test, Doornik-Hansen’s test, and the energy test.
    • The R package MVN implements the above tests; a short R sketch of the chi-square plot and these formal tests follows below.
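A minimal R sketch of the chi-square plot and the formal tests above, using USArrests. The mvn() interface is from the MVN package; argument names may differ across package versions:

    # Chi-square plot for assessing multivariate normality
    x <- as.matrix(USArrests)
    n <- nrow(x); m <- ncol(x)
    d2 <- mahalanobis(x, center = colMeans(x), cov = cov(x))  # d_j^2 = (x_j - xbar)' S^{-1} (x_j - xbar)
    q <- qchisq((1:n - 0.5) / n, df = m)                      # chi-square quantiles
    plot(q, sort(d2), xlab = "Chi-square quantiles", ylab = "Ordered d^2")
    abline(0, 1)

    # Formal tests via the MVN package (Mardia, Henze-Zirkler, Royston, ...)
    library(MVN)
    mvn(USArrests, mvnTest = "mardia")
    mvn(USArrests, mvnTest = "hz")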

2.8 Transformations to near normality


  1. Theoretical transformations:

     | Variable | Transformation |
     | --- | --- |
     | Count $y$ | $\sqrt{y}$ |
     | Proportion $\hat p$ | $logit(\hat p) = \log \frac{\hat p}{1 - \hat p}$ |
     | Correlation $r$ | Fisher's $z$ transform $z = \frac{1}{2}\log\frac{1 + r}{1 - r}$ |
  2. Power transformations: When all observations are nonnegative, we may consider a family of power transformations. If some measurements are negative, we first add a constant to all measurements and then apply a power transformation: $$ x_i + c \rightarrow (x_i + c)^{\lambda}$$
  3. Box-Cox transformations: The Box-Cox transformation family is similar to the power transformation family; it connects continuously to the logarithmic transform as the power $\lambda$ approaches zero: $$ x^{(\lambda)} = \begin{cases} \frac{x^\lambda - 1}{\lambda} & \text{ for } \lambda \neq 0 \\ \log x & \text{ for } \lambda = 0\end{cases}$$ for $x > 0$. We choose $\lambda$ by maximizing the log-likelihood function $$l(\lambda) = -\frac{n}{2} \log \left[\frac{1}{n} \sum_{i=1}^n (x_i^{(\lambda)} - \bar x^{(\lambda)})^2\right] + (\lambda - 1) \sum_{i=1}^n \log x_i.$$ A short R sketch is given after this list.
  4. Note that we should not expect that a transformation can always make the data close to normality.
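A minimal R sketch of choosing $\lambda$ by the Box-Cox profile log-likelihood, using MASS::boxcox() on a single positive variable; the Assault column of USArrests is used only for illustration:

    library(MASS)
    y <- USArrests$Assault                              # a positive-valued variable
    bc <- boxcox(lm(y ~ 1), lambda = seq(-2, 2, 0.1))   # profile log-likelihood l(lambda)
    lambda.hat <- bc$x[which.max(bc$y)]                 # lambda maximizing the log-likelihood
    lambda.hat
    y.trans <- if (abs(lambda.hat) < 1e-8) log(y) else (y^lambda.hat - 1) / lambda.hat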

Chapter 3. Hypothesis tests

3.1 Review of hypothesis tests for a univariate normal mean

3.1.1 When $\sigma^2$ is known

3.1.2 When $\sigma^2$ is unknown

3.2 Hypothesis test on one sample multivariate normal mean vector

3.2.1 When the covariance matrix $\Sigma$ is known

3.2.2 Hotelling’s $T^2$ Statistic: when $\Sigma$ is unknown


  1. Note $T^2 = Z’(\frac{W}{v})^{-1}Z \sim \frac{vp}{v + 1 - p}F_{p, v + 1 - p}$, where $Z \sim N_p(0, \Sigma)$ and $W \sim Wishart(p, v, \Sigma)$ are independent.
  2. Note that $pF_{p, n-p} \rightarrow \chi_p^2$ as $n \rightarrow \infty$ so that $T^2 \sim \chi_p^2$ for a large sample under $H_0$.
  3. The $T^2$ statistic is invariant under linear transformations; that is, Hotelling’s $T^2$ statistic does not depend on the measurement units.

3.3 Hotelling’s $T^2$ and likelihood ratio tests

3.4 Confidence regions and multiple testing

3.4.1 Simultaneous confidence intervals

3.5 Large sample inferences


Summary

  1. Hotelling’s $T^2$ test statistic for one sample normal mean vector
    • Reject $H_0: \mu = \mu_0$ if $T^2 = n (\bar x - \mu_0)'S^{-1}(\bar x - \mu_0) \ge \frac{p(n-1)}{n - p}F_{p, n-p, \alpha}$
  2. Confidence region and simultaneous confidence intervals with confidence level $1 - \alpha$ (an R sketch follows the table).

| Test name | CI |
| --- | --- |
| Hotelling's $T^2$ confidence region | $n(\bar x - \mu)'S^{-1} (\bar x - \mu) \le \frac{(n-1)p}{n-p} F_{p, n-p, \alpha}$ |
| Scheffe's simultaneous CIs | $\bar x_i \pm \sqrt{\frac{p(n-1)}{(n-p)} F_{p, n-p, \alpha}} \sqrt{\frac{s_{ii}}{n}}$ |
| Bonferroni's simultaneous CIs | $\bar x_i \pm t_{n-1, \frac{\alpha}{2p}} \sqrt{\frac{s_{ii}}{n}}$ |
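A minimal R sketch of the one-sample Hotelling $T^2$ test and the Bonferroni simultaneous CIs, written directly from the formulas above; the hypothesized mean vector mu0 is purely illustrative:

    x <- as.matrix(USArrests)
    n <- nrow(x); p <- ncol(x)
    xbar <- colMeans(x); S <- cov(x)
    mu0 <- c(7, 170, 65, 20)                               # hypothetical H0 mean vector (illustrative only)
    T2 <- drop(n * t(xbar - mu0) %*% solve(S) %*% (xbar - mu0))
    crit <- (n - 1) * p / (n - p) * qf(0.95, p, n - p)     # reject H0 if T2 >= crit
    c(T2 = T2, critical = crit)

    # Bonferroni simultaneous 95% CIs for the individual means
    alpha <- 0.05
    tq <- qt(1 - alpha / (2 * p), df = n - 1)
    cbind(lower = xbar - tq * sqrt(diag(S) / n),
          upper = xbar + tq * sqrt(diag(S) / n))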

Chapter 4. Two Sample Comparison and MANOVA

4.1 Paired Comparisons and a Repeated Measures Design

| Test name | CI |
| --- | --- |
| Hotelling's $T^2$ confidence region | $n(\delta - \bar D)'S_d^{-1} (\delta - \bar D) \le \frac{(n-1)p}{n-p} F_{p, n-p, \alpha}$ |
| Scheffe's simultaneous CIs | $\bar d_i \pm \sqrt{\frac{p(n-1)}{(n-p)} F_{p, n-p, \alpha}} \sqrt{\frac{s_{d_i}^2}{n}}$ |
| Bonferroni's simultaneous CIs | $\bar d_i \pm t_{n-1, \frac{\alpha}{2p}} \sqrt{\frac{s_{d_i}^2}{n}}$ |
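A minimal sketch of the paired Hotelling $T^2$ comparison, computed from the within-pair differences $D_j$; the pre/post matrices below are simulated placeholders, not data from the course:

    set.seed(1)
    n <- 30; p <- 2
    pre  <- matrix(rnorm(n * p, mean = 10), n, p)   # hypothetical pre-treatment measurements
    post <- matrix(rnorm(n * p, mean = 11), n, p)   # hypothetical post-treatment measurements
    D <- post - pre                                 # within-pair differences
    Dbar <- colMeans(D); Sd <- cov(D)
    T2 <- drop(n * t(Dbar) %*% solve(Sd) %*% Dbar)  # test H0: delta = 0
    crit <- (n - 1) * p / (n - p) * qf(0.95, p, n - p)
    c(T2 = T2, critical = crit)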

4.2 Comparing Mean Vectors from Independent Two Samples


4.2.1 When $\Sigma = \Sigma_1 = \Sigma_2$



4.4 Simultaneous Confidence Intervals for Treatment Effects

4.5 Testing for Equality of Covariance Matrices

Chapter 5. Discriminant analysis and classification

5.1 Discriminant function

5.2 Discriminant functions for two groups

5.3 Classification analysis

5.4 Classification for multivariate normal distributions

5.5 Discriminant analysis for several groups

5.5.1 Discriminant functions

5.6 Stepwise Discriminant Analysis

Chapter 6. Principal Component Analysis (PCA)

6.1 Introduction


6.2 Method

6.3 PCA from the correlation matrix

6.4 Plotting of principal components

6.5 How many components to retain?

Chapter 7. Factor Analysis (FA)

7.1 Orthogonal factor model

7.2 Estimations

  1. (Principal component method) $$\begin{aligned} \Sigma &= \sum_{i=1}^p \lambda_i e_i e_i' = \sum_{i=1}^p (\sqrt{\lambda_i} e_i)(\sqrt{\lambda_i}e_i)' \\ & = (\sqrt{\lambda_1} e_1 : \cdots : \sqrt{\lambda_p} e_p) \begin{pmatrix}\sqrt{\lambda_1}e_1' \\ \vdots \\ \sqrt{\lambda_p} e_p'\end{pmatrix} \end{aligned}$$ If $\lambda_{m+1}, \cdots, \lambda_p$ are small, then we can approximate the covariance matrix by $$\Sigma \approx ( \sqrt{\lambda_1} e_1 : \cdots : \sqrt{\lambda_m} e_m) \begin{pmatrix} \sqrt{\lambda_1} e_1' \\ \vdots \\ \sqrt{\lambda_m} e_m' \end{pmatrix} + \begin{pmatrix} \psi_1 & 0 & \cdots & 0 \\ 0 & \psi_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \psi_p\end{pmatrix}$$ where $\psi_i = \Sigma_{ii} - \sum_{j=1}^m l_{ij}^2$. The communalities are $$h_i^2 = l_{i1}^2 + \cdots + l_{im}^2.$$
  2. (Principal factors) We initially estimate $\Psi^{(0)}$ and apply the principal component solution to $S - \Psi^{(r)}$: $$\begin{aligned} S - \Psi^{(r)} &= \sum_{j=1}^m \lambda_j^{(r)}e_j^{(r)}e_j^{(r)T} + \sum_{j=m+1}^p \lambda_j^{(r)} e_j^{(r)} e_j^{(r)T} \\ \Psi^{(r+1)} & = diag(S - L^{(r)}L^{(r)T}) \end{aligned}$$
    • Repeat these steps until convergence. The common initial diagonal matrix $\Psi^{(0)}$ is chosen as $diag(S^{-1})$ for factoring the sample covariance matrix and $diag(R^{-1})$ for factoring the sample correlation matrix.
  3. (Maximum likelihood method) Assume $X_j - \mu = LF_j + \epsilon_j$ has a multivariate normal distribution. The likelihood function is given by $$\begin{aligned} L(\mu, \Sigma) &= \prod_{i=1}^N \left[\frac{1}{(2 \pi)^{p/2} \vert \Sigma \vert ^{1/2}} e^{-\frac{1}{2}(x_i - \mu)' \Sigma^{-1} (x_i - \mu)}\right] \\ &= \prod_{i=1}^N \left[\frac{1}{(2 \pi)^{p/2} \vert LL^T + \Psi \vert ^{1/2}} e^{-\frac{1}{2}(x_i - \mu)' (LL^T + \Psi)^{-1} (x_i - \mu)}\right] \end{aligned}$$ Since $LQQ^TL^T = LL^T$ for any $m \times m$ orthogonal matrix $Q$, it is necessary to impose a condition to obtain a unique maximum likelihood solution: we need $m(m-1)/2$ constraints. Note $$(LL^T + \Psi)^{-1} = \Psi^{-1} - \Psi^{-1}L(I + L^T \Psi^{-1}L)^{-1}L^T \Psi^{-1}.$$ If we impose the condition that $L^T \Psi^{-1}L$ is a diagonal matrix (exactly $m(m-1)/2$ constraints), then we can numerically find the MLEs. Hence, we assume $$L^T\Psi^{-1}L = \Delta, \text{ a diagonal matrix,}$$ and numerically obtain $\hat L$ and $\hat \Psi$ assuming $L^T \Psi^{-1}L$ is diagonal. (A short R sketch of these estimation methods follows this list.)
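A minimal R sketch of the principal component and maximum likelihood estimates of the loadings, factoring the correlation matrix of USArrests; the one-factor choice m = 1 is illustrative, not from the notes:

    R <- cor(USArrests)
    m <- 1                                         # illustrative number of factors
    eig <- eigen(R)
    L.pc <- eig$vectors[, 1:m, drop = FALSE] %*% diag(sqrt(eig$values[1:m]), m)  # sqrt(lambda_i) e_i
    psi.pc <- diag(R - L.pc %*% t(L.pc))           # specific variances psi_i
    h2 <- rowSums(L.pc^2)                          # communalities

    # Maximum likelihood solution (factanal factors the correlation matrix)
    fa.ml <- factanal(USArrests, factors = m, rotation = "none")
    fa.ml$loadings
    fa.ml$uniquenesses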

7.3 Hypothesis Testing on the Number of Factors

7.4 Factor Rotation

7.5 Factor scores

  1. (Weighted Least Squares Method) Bartlett suggested that weighted least squares be used to estimate the common factor values: $$x - \mu = Lf + \epsilon, \qquad Var(\epsilon_i) = \psi_i$$ $$\text{Minimize } \sum_{i=1}^p \frac{\epsilon_i^2}{\psi_i} = \epsilon' \Psi^{-1} \epsilon = (x - \mu - Lf)' \Psi^{-1}(x - \mu - Lf)$$ $$\hat f = (L' \Psi^{-1} L)^{-1} L' \Psi^{-1} (x - \mu)$$ Hence, the estimated factor score is $$\begin{aligned} \hat f_j &= (\hat L' \hat \Psi^{-1} \hat L)^{-1} \hat L' \hat \Psi^{-1} (x_j - \bar x) \\ & = \hat \Delta ^{-1} \hat L' \hat \Psi^{-1} (x_j - \bar x) \end{aligned}$$ When the correlation matrix is factored, $$\hat f_j = (\hat L_z' \hat \Psi_z ^{-1} \hat L_z)^{-1} \hat L_z' \hat \Psi_z^{-1} z_j = \hat \Delta_z ^{-1} \hat L_z' \hat \Psi_z^{-1} z_j$$ where $z_j = D^{-1/2}(x_j - \bar x)$ and $\hat \rho = \hat L_z \hat L_z' + \hat \Psi_z$. When $\hat L$ and $\hat \Psi$ are determined by the maximum likelihood method, these estimates must satisfy the uniqueness condition $\hat L' \hat \Psi^{-1} \hat L = \hat \Delta$, a diagonal matrix.

  2. (Regression Method) Since $X - \mu = LF + \epsilon \sim N_p(0, LL' + \Psi)$ and $F \sim N_m(0, I)$, they have a joint normal distribution $N_{p+m} (0, \Sigma^*)$, where $$\Sigma^* = \begin{pmatrix} \Sigma = LL' + \Psi & L \\ L' & I\end{pmatrix}$$ From the conditional mean vector of one partition of a normal random vector given the other partition, $$E(F \vert x) = L'(LL' + \Psi)^{-1} (x - \mu)$$ $$\hat f_j = \hat L'(\hat L \hat L' + \hat \Psi)^{-1} (x_j - \bar x)$$ $$\hat f_j = \hat L' S^{-1} (x_j - \bar x)$$ To reduce the effects of a (possibly) incorrect determination of the number of factors, practitioners tend to calculate the factor scores by using $S$ (the original sample covariance matrix) instead of $\hat \Sigma$. If a correlation matrix is factored, $$\hat f_j = \hat L_z' R^{-1} z_j$$

    • Remark 1. If rotated loadings $\hat L^* = \hat L T$ are used in place of the original loadings, the subsequent factor scores $\hat f_j^*$ are obtained by $\hat f_j^* = T' \hat f_j$.
  3. (Principal component method) When the principal component solution is used, it is common to estimate the factor scores by an ordinary least squares method: $$F = (L'L)^{-1} L'(X - \mu)$$ $$\hat f_j = (\hat L' \hat L)^{-1} \hat L'(x_j - \bar x)$$ Since $\hat L = (\sqrt{\hat\lambda_1} \hat e_1 : \cdots : \sqrt{\hat\lambda_m} \hat e_m)$, we have $\hat L' \hat L = diag(\hat \lambda_1, \cdots, \hat \lambda_m)$ and $$\begin{aligned} \hat f_j & = (\hat L' \hat L)^{-1} \hat L' (x_j - \bar x) \\ & = \begin{pmatrix} \frac{1}{\hat\lambda_1} & 0 & \cdots & 0 \\ 0 & \frac{1}{\hat\lambda_2} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \frac{1}{\hat\lambda_m} \end{pmatrix} \begin{pmatrix} \sqrt{\hat \lambda_1} \hat e_1' \\ \sqrt{\hat \lambda_2} \hat e_2' \\ \vdots \\ \sqrt{\hat \lambda_m} \hat e_m' \end{pmatrix} (x_j - \bar x) \\ &= \begin{pmatrix} \frac{1}{\sqrt{\hat\lambda_1}} \hat e_1' (x_j - \bar x) \\ \frac{1}{\sqrt{\hat\lambda_2}} \hat e_2' (x_j - \bar x) \\ \vdots \\ \frac{1}{\sqrt{\hat\lambda_m}} \hat e_m' (x_j - \bar x) \end{pmatrix} \end{aligned}$$ A short R sketch of computing factor scores is given after this list.
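A minimal R sketch of obtaining factor scores from factanal(), which supports both the regression and the Bartlett (weighted least squares) methods; USArrests with one factor is used purely for illustration:

    fa.reg <- factanal(USArrests, factors = 1, scores = "regression")
    fa.bart <- factanal(USArrests, factors = 1, scores = "Bartlett")
    head(fa.reg$scores)    # regression-method factor scores
    head(fa.bart$scores)   # Bartlett (weighted least squares) factor scores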

7.6 Strategy for Factor Analysis

  1. Perform a principal component factor analysis, including a varimax rotation
  2. Perform a maximum likelihood factor analysis, including a varimax rotation
  3. Compare the solutions
  4. Repeat 1-3 for other numbers of common factors $m$ (see the sketch below).
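A minimal sketch of this strategy in R, assuming the psych package's principal() for the principal component solution and factanal() for the maximum likelihood solution; the mtcars data and m = 2 are illustrative choices only:

    library(psych)
    m <- 2                                                        # illustrative number of factors
    fa.pc <- principal(mtcars, nfactors = m, rotate = "varimax")  # step 1: PC factor analysis
    fa.ml <- factanal(mtcars, factors = m, rotation = "varimax")  # step 2: ML factor analysis
    print(fa.pc$loadings, cutoff = 0.3)
    print(fa.ml$loadings, cutoff = 0.3)                           # step 3: compare; repeat for other m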

Chapter 8. Multivariate regression

8.1 The Classical (Univariate) Linear Regression Model

8.2 Least Squares Estimation

8.3 Sum of Squares Decomposition

8.4 Inferences About the Regression Model

8.4.1 Likelihood ratio tests (LRTs)

8.5 Inferences from the Estimated Regression Function

8.6 Model Checking and Other Aspects of Regression

8.7 Multivariate Multiple Regression