Katana VentraIP

Factor analysis

Factor analysis is a statistical method used to describe variability among observed, correlated variables in terms of a potentially lower number of unobserved variables called factors. For example, it is possible that variations in six observed variables mainly reflect the variations in two unobserved (underlying) variables. Factor analysis searches for such joint variations in response to unobserved latent variables. The observed variables are modelled as linear combinations of the potential factors plus "error" terms, hence factor analysis can be thought of as a special case of errors-in-variables models.[1]

This article is about factor loadings. For factorial design, see Factorial experiment.

Simply put, the factor loading of a variable quantifies the extent to which the variable is related to a given factor.[2]
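To make the idea of a loading concrete, the following minimal sketch simulates six observed variables driven by two unobserved factors plus independent error. The loading values, sample size, and variable names are illustrative assumptions, not taken from any dataset. Under these assumptions (standardized variables, uncorrelated factors), each loading should approximately equal the correlation between the variable and the factor, and the correlation matrix of the observed variables should approximately equal the loadings times their transpose plus the diagonal matrix of unique variances.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical loading matrix: 6 observed variables (rows), 2 factors (columns).
loadings = np.array([
    [0.8, 0.0],
    [0.7, 0.1],
    [0.6, 0.2],
    [0.1, 0.7],
    [0.0, 0.8],
    [0.2, 0.6],
])

# Unique ("error") variances chosen so each observed variable has unit variance.
communalities = (loadings ** 2).sum(axis=1)
unique_var = 1.0 - communalities

n = 50_000
factors = rng.standard_normal((n, 2))                       # uncorrelated, unit variance
errors = rng.standard_normal((n, 6)) * np.sqrt(unique_var)  # independent error terms
observed = factors @ loadings.T + errors                    # x = L f + error

# Correlation between each observed variable and each factor (6 x 2 block).
corr = np.corrcoef(observed.T, factors.T)[:6, 6:]

# Model-implied correlation matrix: L L^T + diag(unique variances).
implied = loadings @ loadings.T + np.diag(unique_var)
sample = np.corrcoef(observed, rowvar=False)

print(np.round(corr, 2))                # should be close to `loadings`
print(np.abs(sample - implied).max())   # should be small (sampling error only)
```

In real applications the factors are not observed, so the loadings have to be estimated from the correlation matrix alone; the simulation only illustrates what the estimated quantities mean.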


A common rationale behind factor analytic methods is that the information gained about the interdependencies between observed variables can be used later to reduce the set of variables in a dataset. Factor analysis is commonly used in psychometrics, personality psychology, biology, marketing, product management, operations research, finance, and machine learning. It may help to deal with data sets where there are large numbers of observed variables that are thought to reflect a smaller number of underlying/latent variables. It is one of the most commonly used inter-dependency techniques and is used when the relevant set of variables shows a systematic inter-dependence and the objective is to find out the latent factors that create a commonality.


Statistical model

Definition

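The model attempts to explain a set of $p$ observations in each of $n$ individuals with a set of $k$ common factors ($f_{i,j}$), where there are fewer factors per unit than observations per unit ($k < p$). Each individual has $k$ of their own common factors, and these are related to the observations via the factor loading matrix ($L \in \mathbb{R}^{p \times k}$), for a single observation, according to

$$x_{i,m} - \mu_i = l_{i,1} f_{1,m} + \dots + l_{i,k} f_{k,m} + \varepsilon_{i,m}$$

where $x_{i,m}$ is the value of the $i$th observation of the $m$th individual, $\mu_i$ is the mean of the $i$th observation, $l_{i,j}$ is the loading of the $i$th observation on the $j$th factor, $f_{j,m}$ is the value of the $j$th factor for the $m$th individual, and $\varepsilon_{i,m}$ is the unobserved error term, with mean zero and finite variance.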


Simple factors: these rotations try to explain all factors by using only a few important variables. This effect can be achieved by using Varimax (the most common rotation).

Simple variables: these rotations try to explain all variables using only a few important factors. This effect can be achieved using either Quartimax or the unrotated components of PCA.

Both: these rotations try to compromise between both of the above goals, but in the process, may achieve a fit that is poor at both tasks; as such, they are unpopular compared to the above methods. Equamax is one such rotation.
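The three rotation goals above can be sketched as one orthomax family controlled by a single parameter: `gamma = 1` corresponds to Varimax, `gamma = 0` to Quartimax, and `gamma = k/2` (with `k` factors) to Equamax. The code below is a minimal sketch of the common SVD-based iteration, without Kaiser normalization; the function names, the example loadings, and the 30-degree mixing angle are assumptions made for illustration and do not reproduce any particular package's implementation.

```python
import numpy as np

def orthomax(loadings, gamma=1.0, max_iter=500, tol=1e-10):
    """Orthogonally rotate a (variables x factors) loading matrix.

    gamma=1 gives varimax, gamma=0 quartimax, gamma=k/2 equamax
    (k = number of factors). Kaiser normalization is not applied.
    """
    p, k = loadings.shape
    rotation = np.eye(k)
    objective = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        # Target matrix derived from the gradient of the orthomax criterion.
        target = rotated ** 3 - (gamma / p) * rotated * (rotated ** 2).sum(axis=0)
        u, s, vt = np.linalg.svd(loadings.T @ target)
        rotation = u @ vt                      # nearest orthogonal matrix
        new_objective = s.sum()
        if objective != 0.0 and new_objective < objective * (1.0 + tol):
            break
        objective = new_objective
    return loadings @ rotation, rotation

def varimax_criterion(loadings):
    """Sum over factors of the variance of the squared loadings."""
    return (loadings ** 2).var(axis=0).sum()

# Hypothetical simple-structure loadings, hidden by a 30-degree rotation.
simple = np.array([[0.8, 0.0], [0.7, 0.0], [0.6, 0.0],
                   [0.0, 0.8], [0.0, 0.7], [0.0, 0.6]])
theta = np.deg2rad(30)
mix = np.array([[np.cos(theta), -np.sin(theta)],
                [np.sin(theta),  np.cos(theta)]])
unrotated = simple @ mix

rotated, rotation = orthomax(unrotated, gamma=1.0)   # varimax
print(np.round(rotated, 3))
print(varimax_criterion(unrotated), varimax_criterion(rotated))
```

Because the rotation is orthogonal, each variable's communality (the row sum of squared loadings) is unchanged; only the distribution of the loadings across factors changes.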

PCA results in principal components that account for a maximal amount of variance for observed variables; FA accounts for common variance in the data.

PCA inserts ones on the diagonals of the correlation matrix; FA adjusts the diagonals of the correlation matrix with the unique factors.

PCA minimizes the sum of squared perpendicular distance to the component axis; FA estimates factors that influence responses on observed variables.

The component scores in PCA represent a linear combination of the observed variables weighted by eigenvectors; the observed variables in FA are linear combinations of the underlying and unique factors.

In PCA, the components yielded are uninterpretable, i.e. they do not represent underlying ‘constructs’; in FA, the underlying constructs can be labelled and readily interpreted, given an accurate model specification.
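The difference in how the two methods treat the diagonal of the correlation matrix can be illustrated with a small numerical sketch. It assumes a hypothetical one-factor population with known loadings, and it uses iterated principal-axis factoring (starting from squared multiple correlations) as one possible way of estimating communalities; other extraction methods, such as maximum likelihood, would proceed differently.

```python
import numpy as np

# Hypothetical one-factor population: loadings and the implied correlation matrix.
true_loadings = np.array([0.8, 0.7, 0.6, 0.5])
R = np.outer(true_loadings, true_loadings)
np.fill_diagonal(R, 1.0)

def first_axis(matrix):
    """Loadings on the leading eigenvector, scaled by sqrt(eigenvalue)."""
    eigvals, eigvecs = np.linalg.eigh(matrix)     # eigenvalues in ascending order
    vec = eigvecs[:, -1]
    vec = vec if vec.sum() >= 0 else -vec         # fix the arbitrary sign
    return np.sqrt(eigvals[-1]) * vec

# PCA: analyse R as is, with ones on the diagonal (total variance).
pca_loadings = first_axis(R)

# Principal-axis factoring: replace the diagonal with communality estimates,
# starting from squared multiple correlations and iterating.
communalities = 1.0 - 1.0 / np.diag(np.linalg.inv(R))
for _ in range(200):
    reduced = R.copy()
    np.fill_diagonal(reduced, communalities)
    fa_loadings = first_axis(reduced)
    communalities = fa_loadings ** 2

print("PCA:", np.round(pca_loadings, 3))
print("FA: ", np.round(fa_loadings, 3))
```

In this constructed example the factor-analytic loadings should approach the generating values, while the first principal component absorbs some unique variance as well, so its squared loadings sum to more than the common variance.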

In psychometrics

History

Charles Spearman was the first psychologist to discuss common factor analysis[39] and did so in his 1904 paper.[40] It provided few details about his methods and was concerned with single-factor models.[41] He discovered that school children's scores on a wide variety of seemingly unrelated subjects were positively correlated, which led him to postulate that a single general mental ability, or g, underlies and shapes human cognitive performance.


The initial development of common factor analysis with multiple factors was given by Louis Thurstone in two papers in the early 1930s,[42][43] summarized in his 1935 book, The Vectors of Mind.[44] Thurstone introduced several important factor analysis concepts, including communality, uniqueness, and rotation.[45] He advocated for "simple structure" and developed methods of rotation that could be used to achieve such structure.[39]


In Q methodology, William Stephenson, a student of Spearman, distinguished between R factor analysis, oriented toward the study of inter-individual differences, and Q factor analysis, oriented toward subjective intra-individual differences.[46][47]


Raymond Cattell was a strong advocate of factor analysis and psychometrics and used Thurstone's multi-factor theory to explain intelligence. Cattell also developed the scree test and similarity coefficients.

Applications in psychology

Factor analysis is used to identify "factors" that explain a variety of results on different tests. For example, intelligence research found that people who score highly on a test of verbal ability also tend to do well on other tests that require verbal abilities. Researchers explained this by using factor analysis to isolate one factor, often called verbal intelligence, which represents the degree to which someone is able to solve problems involving verbal skills.


Factor analysis in psychology is most often associated with intelligence research. However, it also has been used to find factors in a broad range of domains such as personality, attitudes, beliefs, etc. It is linked to psychometrics, as it can assess the validity of an instrument by finding if the instrument indeed measures the postulated factors.

In cross-cultural research

Factor analysis is frequently used in cross-cultural research, where it serves to extract cultural dimensions. The best-known cultural dimensions models are those elaborated by Geert Hofstede, Ronald Inglehart, Christian Welzel, Shalom Schwartz and Michael Minkov. A popular visualization is Inglehart and Welzel's cultural map of the world.[25]

In political science

In an early 1965 study, political systems around the world are examined via factor analysis to construct related theoretical models and research, compare political systems, and create typological categories.[50] For these purposes, in this study seven basic political dimensions are identified, which are related to a wide variety of political behaviour: these dimensions are Access, Differentiation, Consensus, Sectionalism, Legitimation, Interest, and Leadership Theory and Research.


Other political scientists explore the measurement of internal political efficacy using four new questions added to the 1988 National Election Study. Factor analysis is here used to find that these items measure a single concept distinct from external efficacy and political trust, and that these four questions provided the best measure of internal political efficacy up to that point in time.[51]

Identify the salient attributes consumers use to evaluate products in this category.

Use quantitative marketing research techniques (such as surveys) to collect data from a sample of potential customers concerning their ratings of all the product attributes.

Input the data into a statistical program and run the factor analysis procedure. The computer will yield a set of underlying attributes (or factors).

Use these factors to construct perceptual maps and other product positioning devices.
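The steps above can be sketched end to end on simulated data. Everything in this example is an assumption made for illustration: the attribute and product names, the simulated ratings that stand in for survey responses, the choice of two factors, and the use of principal-component extraction with regression-method scores in place of whatever procedure a given statistics package would run.

```python
import numpy as np

rng = np.random.default_rng(1)

# Step 1 (assumed): six attributes rated for three products.
attributes = ["price", "value", "discounts", "design", "style", "finish"]
products = ["A", "B", "C"]

# Step 2 (simulated stand-in for survey data): 300 respondents per product.
# Two hypothetical latent dimensions drive the ratings.
loadings = np.array([[0.8, 0.0], [0.7, 0.0], [0.6, 0.0],
                     [0.0, 0.8], [0.0, 0.7], [0.0, 0.6]])
product_means = np.array([[1.0, -1.0], [-1.0, 1.0], [0.0, 0.0]])
labels = np.repeat(np.arange(3), 300)
latent = product_means[labels] + rng.standard_normal((900, 2))
ratings = latent @ loadings.T + 0.5 * rng.standard_normal((900, 6))

# Step 3: factor the correlation matrix of the standardized ratings
# (principal-component extraction, a deliberate simplification).
z = (ratings - ratings.mean(axis=0)) / ratings.std(axis=0)
R = np.corrcoef(z, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(R)
top = np.argsort(eigvals)[::-1][:2]              # two largest eigenvalues
est_loadings = eigvecs[:, top] * np.sqrt(eigvals[top])

# Regression-method factor scores: Z R^{-1} L.
scores = z @ np.linalg.solve(R, est_loadings)

# Step 4: average scores per product give perceptual-map coordinates.
coords = np.array([scores[labels == i].mean(axis=0) for i in range(3)])
for name, (x, y) in zip(products, coords):
    print(f"product {name}: ({x:+.2f}, {y:+.2f})")
```

Plotting `coords` on two axes would give a basic perceptual map; in practice the factors are usually rotated and named before the map is interpreted.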


In physical and biological sciences

Factor analysis has also been widely used in physical sciences such as geochemistry, hydrochemistry,[53] astrophysics and cosmology, as well as biological sciences, such as ecology, molecular biology, neuroscience and biochemistry.


In groundwater quality management, it is important to relate the spatial distribution of different chemical parameters to different possible sources, which have different chemical signatures. For example, a sulfide mine is likely to be associated with high levels of acidity, dissolved sulfates and transition metals. These signatures can be identified as factors through R-mode factor analysis, and the location of possible sources can be suggested by contouring the factor scores.[54]


In geochemistry, different factors can correspond to different mineral associations, and thus to mineralisation.[55]

In microarray analysis

Factor analysis can be used for summarizing high-density oligonucleotide DNA microarray data at probe level for Affymetrix GeneChips. In this case, the latent variable corresponds to the RNA concentration in a sample.[56]


BMDP

JMP (statistical software)

Mplus (statistical software)

Python: module scikit-learn[57]

R (with the base function factanal or the fa function in package psych). Rotations are implemented in the GPArotation R package.

SAS (using PROC FACTOR or PROC CALIS)

SPSS[58]

Stata

Child, Dennis (2006), The Essentials of Factor Analysis (3rd ed.), Continuum International, ISBN 978-0-8264-8000-2.

Fabrigar, L.R.; Wegener, D.T.; MacCallum, R.C.; Strahan, E.J. (September 1999). "Evaluating the use of exploratory factor analysis in psychological research". Psychological Methods. 4 (3): 272–299. doi:10.1037/1082-989X.4.3.272.

B.T. Gray (1997) Higher-Order Factor Analysis (Conference paper)

Jennrich, Robert I., "Rotation to Simple Loadings Using Component Loss Function: The Oblique Case," Psychometrika, Vol. 71, No. 1, pp. 173–191, March 2006.

Katz, Jeffrey Owen, and Rohlf, F. James. Primary product functionplane: An oblique rotation to simple structure. Multivariate Behavioral Research, April 1975, Vol. 10, pp. 219–232.

Katz, Jeffrey Owen, and Rohlf, F. James. Functionplane: A new approach to simple structure rotation. Psychometrika, March 1974, Vol. 39, No. 1, pp. 37–51.

Katz, Jeffrey Owen, and Rohlf, F. James. Function-point cluster analysis. Systematic Zoology, September 1973, Vol. 22, No. 3, pp. 295–301.

Mulaik, S. A. (2010), Foundations of Factor Analysis, Chapman & Hall.

Preacher, K.J.; MacCallum, R.C. (2003). "Repairing Tom Swift's Electric Factor Analysis Machine" (PDF). Understanding Statistics. 2 (1): 13–43. doi:10.1207/S15328031US0201_02. hdl:1808/1492.

J. Schmid and J. M. Leiman (1957). The development of hierarchical factor solutions. Psychometrika, 22(1), 53–61.

Thompson, B. (2004), Exploratory and Confirmatory Factor Analysis: Understanding concepts and applications, Washington DC: American Psychological Association, ISBN 978-1591470939.

A Beginner's Guide to Factor Analysis

Exploratory Factor Analysis. A Book Manuscript by Tucker, L. & MacCallum R. (1993). Retrieved June 8, 2006, from: [2]. Archived 2013-05-23 at the Wayback Machine

Garson, G. David, "Factor Analysis," from Statnotes: Topics in Multivariate Analysis. Retrieved on April 13, 2009, from StatNotes: Topics in Multivariate Analysis, from G. David Garson at North Carolina State University, Public Administration Program

Factor Analysis at 100 (conference material)

FARMS — Factor Analysis for Robust Microarray Summarization, an R package