Principal Component Analysis (PCA): PCA is an unsupervised, linear dimensionality reduction and data visualization technique for very high-dimensional data. High-dimensional data is hard to gain insights from and is also computationally expensive to process. The main idea behind this technique is to reduce the dimensionality of highly correlated data by transforming the original set of variables into a new, smaller set of uncorrelated variables known as principal components, ordered by how much of the data's variance they capture.
PCA tries to preserve the global structure of the data, i.e. when converting d-dimensional data to d’-dimensional data (with d’ < d), it tries to map all the clusters as a whole, due to which local structures might get lost. Applications of this technique include noise filtering, feature extraction, stock market prediction, and gene data analysis.
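The transformation described above can be sketched with NumPy alone. This is a minimal illustration, assuming the data fits in memory and using an eigen-decomposition of the covariance matrix; the function name `pca` and the synthetic data are made up for the example and are not taken from any particular library.

```python
import numpy as np

def pca(X, n_components):
    """Project X (n_samples, n_features) onto its top principal components."""
    # Centre the data so that the covariance describes spread around the mean.
    X_centered = X - X.mean(axis=0)
    cov = np.cov(X_centered, rowvar=False)
    # eigh is meant for symmetric matrices; it returns eigenvalues in ascending order.
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    components = eigvecs[:, :n_components]
    # Fraction of the total variance captured by each kept component.
    explained_ratio = eigvals[:n_components] / eigvals.sum()
    return X_centered @ components, components, explained_ratio

# Synthetic, highly correlated 3-D data (an assumption for illustration only).
rng = np.random.default_rng(0)
t = rng.normal(size=200)
X = np.column_stack([
    t,
    2 * t + 0.1 * rng.normal(size=200),
    0.1 * rng.normal(size=200),
])

Z, components, ratio = pca(X, n_components=1)
print(Z.shape, ratio)
```

Because the first two columns are almost perfectly correlated, a single principal component should capture nearly all of the variance here.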
t-distributed Stochastic Neighbor Embedding (t-SNE): t-SNE is also an unsupervised dimensionality reduction and data visualization technique, but a non-linear one. The math behind t-SNE is quite complex, but the idea is simple: it embeds points from a higher-dimensional space into a lower-dimensional one while trying to preserve the neighborhood of each point.
Unlike PCA, it tries to preserve the local structure of the data by minimizing the Kullback–Leibler divergence (KL divergence) between two distributions, one describing pairwise similarities between points in the original space and one describing them in the low-dimensional map, with respect to the locations of the points in the map. This technique finds application in computer security research, music analysis, cancer research, bioinformatics, and biomedical signal processing.
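To make the objective more concrete, here is a simplified sketch of the quantity t-SNE minimizes. It is not a full t-SNE implementation: it assumes a single shared Gaussian bandwidth `sigma` instead of the per-point bandwidths that real t-SNE derives from the perplexity, and it only evaluates the KL divergence for two candidate maps rather than optimizing it. All function names are hypothetical.

```python
import numpy as np

def pairwise_sq_dists(A):
    """Squared Euclidean distances between all rows of A."""
    sq = np.sum(A ** 2, axis=1)
    D = sq[:, None] + sq[None, :] - 2 * A @ A.T
    np.fill_diagonal(D, 0.0)
    return np.maximum(D, 0.0)  # clip tiny negatives caused by rounding

def high_dim_affinities(X, sigma=1.0):
    """Gaussian similarities in the original space (shared sigma: a simplification)."""
    P = np.exp(-pairwise_sq_dists(X) / (2 * sigma ** 2))
    np.fill_diagonal(P, 0.0)
    return P / P.sum()

def low_dim_affinities(Y):
    """Student-t (one degree of freedom) similarities in the map."""
    Q = 1.0 / (1.0 + pairwise_sq_dists(Y))
    np.fill_diagonal(Q, 0.0)
    return Q / Q.sum()

def kl_divergence(P, Q, eps=1e-12):
    """KL(P || Q) summed over all pairs with non-zero P."""
    mask = P > 0
    return float(np.sum(P[mask] * np.log(P[mask] / (Q[mask] + eps))))

# Two well-separated clusters in 5-D (synthetic data for illustration).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 0.1, (10, 5)), rng.normal(5, 0.1, (10, 5))])
P = high_dim_affinities(X)

# A map that keeps neighbours together versus one that interleaves the clusters.
Y_good = X[:, :2]
perm = np.arange(20).reshape(2, 10).T.ravel()  # 0, 10, 1, 11, ...
Y_bad = Y_good[perm]

kl_good = kl_divergence(P, low_dim_affinities(Y_good))
kl_bad = kl_divergence(P, low_dim_affinities(Y_bad))
print(kl_good, kl_bad)
```

The map that keeps each point near its original neighbours yields the lower divergence, which is the behaviour the optimizer in t-SNE searches for by moving the map points.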
Table of Differences between PCA and t-SNE
|S.No.|PCA|t-SNE|
|---|---|---|
|1.|It is a linear dimensionality reduction technique.|It is a non-linear dimensionality reduction technique.|
|2.|It tries to preserve the global structure of the data.|It tries to preserve the local structure (clusters) of the data.|
|3.|It often separates clusters less clearly than t-SNE when the structure in the data is non-linear.|It is one of the most widely used techniques for visualizing high-dimensional data in two or three dimensions.|
|4.|It involves essentially no hyperparameters apart from the number of components to keep.|It involves hyperparameters such as perplexity, learning rate and number of steps.|
|5.|It is highly affected by outliers, since they inflate the variance it tries to preserve.|It can handle outliers better, since it focuses on local neighborhoods.|
|6.|It is a deterministic algorithm.|It is a non-deterministic (randomised) algorithm, so different runs can give different maps.|
|7.|It works by rotating the coordinate axes so that the new axes capture as much variance as possible.|It works by minimising the KL divergence between Gaussian-based similarities in the original space and t-distribution-based similarities in the map.|
|8.|We can decide how much variance to preserve using the eigenvalues.|We cannot choose how much variance to preserve; instead, hyperparameters control how well neighborhood distances are preserved.|
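The last row of the table can be illustrated with a short sketch: the eigenvalues of the covariance matrix give the variance along each principal component, so their cumulative sum tells us how many components are needed to reach a chosen share of the total variance. The helper name `components_for_variance`, the 95% target, and the synthetic data are assumptions made for this example.

```python
import numpy as np

def components_for_variance(X, target=0.95):
    """Smallest number of principal components whose variance share reaches target."""
    X_centered = X - X.mean(axis=0)
    # eigvalsh returns ascending eigenvalues; reverse to get largest first.
    eigvals = np.linalg.eigvalsh(np.cov(X_centered, rowvar=False))[::-1]
    cumulative = np.cumsum(eigvals) / eigvals.sum()
    # First index where the cumulative share is at least the target.
    k = int(np.searchsorted(cumulative, target)) + 1
    return min(k, len(eigvals)), cumulative

# Five independent features with very different spreads (synthetic data).
rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 5)) * np.array([3.0, 2.0, 1.0, 0.1, 0.1])

k, cumulative = components_for_variance(X, target=0.95)
print(k, np.round(cumulative, 3))
```

t-SNE has no equivalent of this calculation, which is why its output is tuned by trying different hyperparameter values rather than by picking a variance threshold.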