# Multivariate Analysis

Multivariate analysis is concerned with two or more dependent variables (Y1, Y2, etc.) considered simultaneously in relation to multiple independent variables (X1, X2, etc.). Advances in computer software and hardware have made it practical to solve more problems using multivariate analysis. Software packages available for multivariate problems include SPSS, S-Plus, SAS, and Minitab. Multivariate analysis is widely used in the social sciences, psychology, and education. Applications can also be found in engineering, technology, and the scientific disciplines.

Multivariate analysis concepts and techniques:

• Principal components analysis

• Factor analysis

• Discriminant function analysis

• Cluster analysis

• Canonical correlation analysis

• Multivariate analysis of variance

Principal Components Analysis

Principal components analysis (PCA) and factor analysis (FA) are two related techniques used to find patterns of correlation among many possible variables or subsets of data, and to reduce them to a smaller, more manageable number of components or factors. The researcher attempts to find the primary components, or factors, that account for most of the variance. PCA refers to these subsets as components, while FA uses the term factors. A minimum of 100 observations should be used for PCA, and the ratio is usually set at approximately 5 observations per variable.
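To make the idea of a component "accounting for most of the variance" concrete, here is a minimal sketch for the special case of two variables, where the eigenvalues of the covariance matrix have a closed form. The function names and sample data are illustrative assumptions, and the sketch works on the raw covariance matrix without standardizing the variables; a statistics package would normally handle the general case.

```python
from math import sqrt
from statistics import mean


def covariance(xs, ys):
    """Sample covariance with an n - 1 denominator."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)


def pca_two_variables(xs, ys):
    """Return the share of total variance explained by each of the
    two principal components (largest first).

    For a symmetric 2x2 covariance matrix [[a, b], [b, c]] the
    eigenvalues are (a + c)/2 +/- sqrt(((a - c)/2)**2 + b**2).
    """
    a = covariance(xs, xs)   # variance of x
    b = covariance(xs, ys)   # covariance of x and y
    c = covariance(ys, ys)   # variance of y
    mid = (a + c) / 2
    spread = sqrt(((a - c) / 2) ** 2 + b ** 2)
    eigenvalues = (mid + spread, mid - spread)
    total = a + c            # equals the sum of the eigenvalues
    return [ev / total for ev in eigenvalues]


# Illustrative (made-up) data: y is strongly but not perfectly tied to x.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
shares = pca_two_variables(x, y)
print(shares)  # the first component carries nearly all of the variance
```

When the two variables are highly correlated, the first component captures almost all of the variance, which is why the second could be dropped with little loss.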

If there are 25 variables, the 5:1 ratio requires 5 observations/variable × 25 variables = 125 observations. Perhaps two principal components will explain 95% of the variance, while the remaining components may contribute only 5%.
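The sample-size arithmetic above can be sketched as a small helper. Combining the two guidelines by taking the larger of the 100-observation minimum and the 5:1 ratio is an assumption of this sketch, and the function names and the proportions passed to `components_needed` are hypothetical.

```python
def required_observations(n_variables, per_variable=5, minimum=100):
    """Larger of the fixed minimum and the observations-per-variable rule."""
    return max(minimum, per_variable * n_variables)


def components_needed(explained, target=0.95):
    """Count how many leading components are needed for the cumulative
    proportion of explained variance to reach the target."""
    cumulative = 0.0
    for count, proportion in enumerate(explained, start=1):
        cumulative += proportion
        if cumulative >= target:
            return count
    return len(explained)


print(required_observations(25))  # 125
print(required_observations(10))  # 100, since the minimum applies

# Hypothetical proportions of variance for five components.
print(components_needed([0.80, 0.16, 0.02, 0.01, 0.01]))  # 2
```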

Factor Analysis

Factor analysis is a data reduction technique used to identify factors that explain variation. It is very similar to principal components analysis. That is, factor analysis attempts to simplify complex sets of data, reducing many variables to a smaller set of factors. However, there is some subjective judgment involved in describing the factors in this method of analysis. The output variables are linearly related to the input factors. The variables under investigation should be measurable, have a range of measurements, and be symmetrically distributed. There should be four or more input factors for each dependent variable. Factor analysis proceeds in two stages:

• Factor extraction

• Factor rotation
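The rotation stage can be illustrated with a minimal sketch for two factors. It assumes the extraction stage has already produced a table of loadings (the values below are made up), applies an orthogonal rotation, and picks the angle that maximizes the raw varimax criterion by a simple grid search. Varimax is only one of several rotation methods, and real software uses faster iterative algorithms; all names here are hypothetical.

```python
from math import cos, sin, pi


def rotate(loadings, theta):
    """Orthogonally rotate two-factor loadings by angle theta (radians)."""
    c, s = cos(theta), sin(theta)
    return [(a * c + b * s, -a * s + b * c) for a, b in loadings]


def varimax_criterion(loadings):
    """Sum over factors of the variance of the squared loadings."""
    p = len(loadings)
    total = 0.0
    for j in range(2):
        squares = [row[j] ** 2 for row in loadings]
        mean_sq = sum(squares) / p
        total += sum((q - mean_sq) ** 2 for q in squares) / p
    return total


def best_rotation(loadings, steps=900):
    """Grid-search angles in [0, pi/2) for the largest varimax criterion."""
    angles = [k * (pi / 2) / steps for k in range(steps)]
    return max(angles, key=lambda t: varimax_criterion(rotate(loadings, t)))


def communalities(loadings):
    """Variance of each variable explained by the two factors together."""
    return [a * a + b * b for a, b in loadings]


# Hypothetical unrotated loadings for four variables on two factors.
unrotated = [(0.7, 0.5), (0.8, 0.4), (0.6, -0.5), (0.7, -0.6)]
theta = best_rotation(unrotated)
rotated = rotate(unrotated, theta)
for row in rotated:
    print(tuple(round(v, 2) for v in row))
```

Rotation leaves each variable's communality unchanged. It only redistributes the loadings so that each variable tends to load strongly on one factor, which makes the factors easier to describe.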

Continuous Data – Large Samples

Use the normal distribution to calculate the confidence interval for the mean:

$$\bar{X} \pm Z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$$

Where:

$\bar{X}$ = sample average

$Z_{\alpha/2}$ = the standard normal critical value for the chosen confidence level

σ = the population standard deviation

n = sample size
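The large-sample interval can be sketched with Python's standard library, where `NormalDist().inv_cdf` supplies the Z critical value. The function name and the sample figures are illustrative assumptions.

```python
from math import sqrt
from statistics import NormalDist


def z_confidence_interval(x_bar, sigma, n, confidence=0.95):
    """Confidence interval for the mean when sigma is known:
    x_bar +/- Z_{alpha/2} * sigma / sqrt(n)."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)  # Z_{alpha/2}
    margin = z * sigma / sqrt(n)
    return x_bar - margin, x_bar + margin


# Illustrative figures: sample average 10, sigma 3, 36 observations.
low, high = z_confidence_interval(x_bar=10.0, sigma=3.0, n=36)
print(round(low, 2), round(high, 2))
```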

Continuous Data – Small Samples

Use the t distribution to calculate the confidence interval for the mean:

$$\bar{X} \pm t_{\alpha/2} \frac{s}{\sqrt{n}}$$

Where:

$\bar{X}$ = sample average

$t_{\alpha/2}$ = the critical value of the t distribution with n − 1 degrees of freedom

s = the sample standard deviation

n = sample size
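A minimal sketch of the small-sample interval follows. Because the Python standard library has no inverse t distribution, the sketch takes the critical value as an argument, to be looked up in a t table for n − 1 degrees of freedom. The function name and the data are illustrative assumptions.

```python
from math import sqrt
from statistics import mean, stdev


def t_confidence_interval(data, t_critical):
    """Confidence interval for the mean when sigma is unknown:
    x_bar +/- t_{alpha/2} * s / sqrt(n).

    t_critical must be looked up for len(data) - 1 degrees of freedom.
    """
    n = len(data)
    x_bar = mean(data)
    s = stdev(data)  # sample standard deviation (n - 1 denominator)
    margin = t_critical * s / sqrt(n)
    return x_bar - margin, x_bar + margin


# Five made-up measurements; t_{0.025} with 4 degrees of freedom is 2.776.
sample = [1.0, 2.0, 3.0, 4.0, 5.0]
low, high = t_confidence_interval(sample, t_critical=2.776)
print(round(low, 3), round(high, 3))
```

The t critical value is larger than the corresponding Z value (2.776 versus about 1.96 at 95% confidence here), so the interval is wider to reflect the extra uncertainty from estimating the standard deviation from a small sample.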