Factor analysis is a statistical technique for reducing a large number of variables to a small number of factors; this reduction is known as factoring the data. Deciding which variables should remain in a data set also falls under factor analysis. It is used to describe the variability among observed, correlated variables in terms of a potentially lower number of unobserved variables called factors.
The factor analysis technique extracts the maximum common variance from all the variables and puts it into a common score. It is used when training machine learning models, so it is closely related to data mining. The idea behind factor-analytic techniques is that the information gained about the interdependencies between observed variables can be used later to reduce the set of variables in a dataset.
Factor analysis is an effective tool for inspecting variable relationships for complex concepts such as social status, economic status, dietary patterns, and psychological scales, and it is applied in biology, psychometrics, personality theories, marketing, product management, operations research, finance, etc. It helps a researcher investigate concepts that are not easily measured directly, in a much easier and quicker way, by collapsing a large number of variables into a few easily interpretable underlying factors.
Types of factor analysis:
Exploratory factor analysis (EFA) :
It is used to identify complex inter-relationships among items and to group items that are part of unified concepts. The analyst makes no prior assumptions about the relationships among factors. It is also used to find the underlying structure of a large set of variables, reducing the data to a much smaller set of summary variables. It is similar to Confirmatory Factor Analysis (CFA) in that both can be used to:
- Evaluate the internal reliability of a measure.
- Examine the factors represented by item sets, presuming that the factors aren’t correlated.
- Investigate the quality of each item.
However, there are some differences between the two, most of which concern how the factors are used. EFA is a data-driven approach that allows all items to load on all factors, while in CFA you need to specify which items load on which factors. EFA is a good choice if you have no idea what common factors might exist. It can generate a large number of possible models for your data, which is not possible if a researcher has to specify the factors. If you already have some idea of what the model looks like and want to test your hypotheses about the data structure, CFA is the better approach.
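The data-driven side of EFA can be illustrated with a small sketch. It assumes NumPy is available and uses principal-component extraction from the correlation matrix, which is only one of several extraction methods and is chosen here for brevity. The synthetic data, the two hidden drivers `f1` and `f2`, and the helper name `extract_loadings` are all hypothetical.

```python
import numpy as np

def extract_loadings(data, n_factors):
    """Principal-component extraction of loadings from the correlation matrix."""
    corr = np.corrcoef(data, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)      # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]            # sort factors by eigenvalue, largest first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Scale each eigenvector by the square root of its eigenvalue to get loadings.
    loadings = eigvecs[:, :n_factors] * np.sqrt(eigvals[:n_factors])
    return eigvals, loadings

# Hypothetical survey: items 0-3 are driven by f1, items 4-5 by f2, plus noise.
rng = np.random.default_rng(0)
n = 500
f1 = rng.normal(size=n)
f2 = rng.normal(size=n)
noise = rng.normal(scale=0.5, size=(n, 6))
data = np.column_stack([f1, f1, f1, f1, f2, f2]) + noise

eigvals, loadings = extract_loadings(data, n_factors=2)

# No item is assigned to a factor in advance: every item gets a loading on
# every factor, and the grouping is read off the largest absolute loading.
dominant = np.argmax(np.abs(loadings), axis=1)
print(np.round(loadings, 2))
print(dominant)
```

Nothing in the code says which items belong together; the grouping emerges from the correlations. In CFA, by contrast, that assignment would be specified up front and then tested.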
Confirmatory factor analysis (CFA) :
It is a more complex approach that tests the hypothesis that the items are associated with specific factors. Confirmatory Factor Analysis uses a structural equation model to test a measurement model, in which the loadings on the factors allow the relationships between observed variables and unobserved variables to be evaluated.
Structural equation modelling approaches can accommodate measurement error easily, and they are much less restrictive than least-squares estimation. Hypothesized models are tested against actual data, and the analysis shows the loadings of observed variables on the latent variables (factors), as well as the correlations between the latent variables.
Confirmatory Factor Analysis allows a researcher to determine whether a relationship exists between a set of observed variables (also known as manifest variables) and their underlying constructs. It is similar to Exploratory Factor Analysis.
The main difference between the two is:
- Simply use Exploratory Factor Analysis to explore the pattern.
- Use Confirmatory Factor Analysis to perform hypothesis testing.
Confirmatory Factor Analysis provides information about the quality of the number of factors required to represent the data set. Using Confirmatory Factor Analysis, you can define the total number of factors required. For example, it can answer questions like "Does my thousand-question survey accurately measure this one specific factor?" Even though it is technically applicable to any discipline, it is typically used in the social sciences.
Multiple Factor Analysis :
This type of factor analysis is used when your variables are structured in groups. For example, you may have a teenager’s health questionnaire with several sections such as sleeping patterns, addictions, psychological health, mobile phone addiction, or learning disabilities.
Multiple Factor Analysis is performed in two steps:
- First, Principal Component Analysis is performed on each section of the data. This yields an eigenvalue for each section, which is used to normalize that data set.
- The normalized data sets are then merged into a single matrix, and a global PCA is performed on it.
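The two steps above can be sketched as follows. This is a minimal illustration that assumes NumPy, assumes the normalizing eigenvalue is each group's largest one (the usual formulation), and uses random data in place of a real questionnaire. The group names `sleep` and `mood` and all helper names are hypothetical.

```python
import numpy as np

def standardize(block):
    """Center each column and scale it to unit variance."""
    return (block - block.mean(axis=0)) / block.std(axis=0, ddof=1)

def first_eigenvalue(block):
    """Largest eigenvalue from a PCA (covariance eigendecomposition) of one block."""
    cov = np.cov(block, rowvar=False)
    return np.linalg.eigvalsh(cov)[-1]   # eigvalsh returns ascending order

def normalize_block(block):
    """Step 1: run PCA on one group and divide by the square root of its first eigenvalue."""
    z = standardize(block)
    return z / np.sqrt(first_eigenvalue(z))

def global_pca(blocks, n_components=2):
    """Step 2: merge the normalized groups into one matrix and run a global PCA."""
    merged = np.hstack([normalize_block(b) for b in blocks])
    cov = np.cov(merged, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)
    top = np.argsort(eigvals)[::-1][:n_components]
    return eigvals[top], merged @ eigvecs[:, top]

# Hypothetical questionnaire sections: 3 sleep items and 4 mood items, 100 respondents.
rng = np.random.default_rng(1)
sleep = rng.normal(size=(100, 3))
mood = rng.normal(size=(100, 4))

global_eigvals, scores = global_pca([sleep, mood])
print(global_eigvals)
print(scores.shape)
```

After the first step, each group has a leading eigenvalue of 1, so a group with many items cannot dominate the global PCA simply because it is larger.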
Generalized Procrustes Analysis (GPA) :
Procrustes analysis is a way to compare two approximate sets of configurations or shapes, and it was originally developed to match two solutions from factor analysis. Generalized Procrustes Analysis extends the technique so that more than two shapes can be compared. The shapes are aligned to a target shape, and GPA does this mainly through geometric transformations.
The geometric transformations are :
- Isotropic rescaling,
- Reflection,
- Rotation,
- Translation of matrices to compare the sets of data.
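A minimal sketch of how such transformations align one shape to a target is given below, for 2D shapes and in plain Python. It covers only the two-shape building block (translation, isotropic rescaling, and rotation, with reflection left out), not the full generalized procedure over many shapes. The function names and the triangle data are hypothetical.

```python
import math

def center(shape):
    """Translation: move the centroid of the shape to the origin."""
    n = len(shape)
    cx = sum(x for x, _ in shape) / n
    cy = sum(y for _, y in shape) / n
    return [(x - cx, y - cy) for x, y in shape]

def rescale(shape):
    """Isotropic rescaling: divide by the root-mean-square distance from the origin."""
    size = math.sqrt(sum(x * x + y * y for x, y in shape) / len(shape))
    return [(x / size, y / size) for x, y in shape]

def rotate_onto(shape, target):
    """Rotation: apply the angle that minimizes the summed squared distance to target."""
    num = sum(x * ty - y * tx for (x, y), (tx, ty) in zip(shape, target))
    den = sum(x * tx + y * ty for (x, y), (tx, ty) in zip(shape, target))
    theta = math.atan2(num, den)
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y, s * x + c * y) for x, y in shape]

def align(shape, target):
    """Return the shape aligned to the normalized target, plus that normalized target."""
    reference = rescale(center(target))
    return rotate_onto(rescale(center(shape)), reference), reference

# Hypothetical data: the same triangle rotated by 90 degrees, scaled by 3, and shifted.
triangle = [(0.0, 0.0), (2.0, 0.0), (0.0, 1.0)]
moved = [(5.0, -2.0), (5.0, 4.0), (2.0, -2.0)]

aligned, reference = align(moved, triangle)
residual = max(math.hypot(ax - rx, ay - ry)
               for (ax, ay), (rx, ry) in zip(aligned, reference))
print(residual)
```

Because `moved` is an exact transformed copy of `triangle`, the residual after alignment is essentially zero. With real data the residual measures how much the two shapes genuinely differ.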
When factor analysis generates the factors, each factor has an associated eigenvalue, which gives the total variance explained by that factor.
Usually, factors with eigenvalues greater than 1 are considered useful. The share of variation each factor explains is :
Percentage of variation explained by F1 = (Eigenvalue of Factor 1 / No. of Variables) × 100
Percentage of variation explained by F2 = (Eigenvalue of Factor 2 / No. of Variables) × 100
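As a small worked example of this arithmetic, the sketch below turns a list of eigenvalues into percentages and applies the "greater than 1" rule (often called the Kaiser criterion). The eigenvalues are made up for illustration, for a hypothetical analysis of five standardized variables, so they sum to 5.

```python
def explained_variance(eigenvalues):
    """Percentage of total variation explained by each factor."""
    n_variables = len(eigenvalues)
    return [100.0 * ev / n_variables for ev in eigenvalues]

def kaiser_retained(eigenvalues):
    """1-based indices of the factors whose eigenvalue is greater than 1."""
    return [i for i, ev in enumerate(eigenvalues, start=1) if ev > 1.0]

# Hypothetical eigenvalues for five variables (illustrative, not from real data).
eigenvalues = [2.5, 1.3, 0.6, 0.4, 0.2]

shares = explained_variance(eigenvalues)
retained = kaiser_retained(eigenvalues)

for i, share in enumerate(shares, start=1):
    print(f"F{i}: {share:.1f}% of variation")
print("Factors kept:", retained)
```

Here F1 explains 2.5 / 5 = 50% of the variation and F2 explains 1.3 / 5 = 26%, and only these two factors pass the eigenvalue threshold.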
X · vector = λ · vector
X is a general matrix as before, which is multiplied by some vector, and λ is a characteristic value. Look at the equation and notice that when you multiply the matrix by the vector, the effect is to reproduce the same vector just multiplied by the value λ. This is unusual behaviour and earns the vector and the quantity λ special names: the eigenvector and eigenvalue.
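The defining property can be checked by hand on a small case. The sketch below uses a hypothetical 2×2 matrix chosen so the arithmetic is easy to follow; the helper names are illustrative.

```python
def mat_vec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]

def is_eigenpair(matrix, vector, value, tol=1e-9):
    """True when matrix * vector equals value * vector, component by component."""
    lhs = mat_vec(matrix, vector)
    rhs = [value * component for component in vector]
    return all(abs(a - b) <= tol for a, b in zip(lhs, rhs))

# Hypothetical symmetric matrix chosen for easy arithmetic.
X = [[2.0, 1.0],
     [1.0, 2.0]]

print(mat_vec(X, [1.0, 1.0]))             # [3.0, 3.0], which is 3 times the input vector
print(is_eigenpair(X, [1.0, 1.0], 3.0))   # True: eigenvector with eigenvalue 3
print(is_eigenpair(X, [1.0, -1.0], 1.0))  # True: eigenvector with eigenvalue 1
print(is_eigenpair(X, [1.0, 0.0], 2.0))   # False: X maps [1, 0] to [2, 1], a new direction
```

Most vectors, like `[1, 0]` here, change direction when multiplied by the matrix; only the eigenvectors come back as scaled copies of themselves.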
In addition, factors are not created equal; some carry more weight than others. As a simple example, imagine a car company, say Maruti Suzuki, conducts a customer satisfaction survey (by telephone, in person, through Google Forms, etc.) and the results show the following factor loadings:
VARIABLE | F1 | F2 | F3
--- | --- | --- | ---
Problem 1 | 0.985 | 0.111 | -0.032
Problem 2 | 0.724 | 0.008 | 0.167
Problem 3 | 0.798 | 0.180 | 0.345
F1 – Factor 1
F2 – Factor 2
F3 – Factor 3
The factor that affects each question the most is the one with the highest factor loading; in this table, that is F1 for all three problems. Factor loadings are similar to correlation coefficients in that they can vary from -1 to 1. The closer a loading is to -1 or 1, the more strongly that factor affects the variable.
Note: A factor loading of 0 indicates no effect.
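Reading the table this way can be automated. The sketch below stores the loadings from the example and picks, for each variable, the factor with the largest absolute loading; the dictionary layout and the function name are just one illustrative choice.

```python
# Factor loadings copied from the example table.
loadings = {
    "Problem 1": {"F1": 0.985, "F2": 0.111, "F3": -0.032},
    "Problem 2": {"F1": 0.724, "F2": 0.008, "F3": 0.167},
    "Problem 3": {"F1": 0.798, "F2": 0.180, "F3": 0.345},
}

def dominant_factor(row):
    """Factor whose loading is furthest from 0, so a strong negative loading counts too."""
    return max(row, key=lambda factor: abs(row[factor]))

dominant = {variable: dominant_factor(row) for variable, row in loadings.items()}

for variable, factor in dominant.items():
    print(f"{variable}: {factor} (loading {loadings[variable][factor]})")
```

The absolute value is used because a loading near -1 indicates as strong an effect as one near 1, only in the opposite direction.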