Related Posts
Additional Posts in Data & Analytics Consultants
Thought this was interesting. Across 160 teams of researchers, just about all failed to make good life outcome predictions on things like GPA, evictions, layoffs, and others. Data followed 4.5k families across 15 years, with 13k features (varied over time). Haven't looked at it directly yet, but will be turning the docs and data inside out... In the meantime, authors claim this as showing the limits of ML. Oh, and it's published in PNAS, so you know there's some big publication energy there.
https://www.pnas.org/content/117/15/8398



Both, though it doesn't reduce m.
Strictly speaking, all PCA does is identify new principal components (variables/dimensions/axes) and rank them in order of how much variance is captured by each. You don't actually need to use the new variables - or you could use all of them and not reduce dimensionality at all.
Typically, you keep the first k principal components that together capture some chosen % of the total variance in your data. This reduces dimensionality, which is the same as reducing the n dimension of your data since n = number of variables: you go from n original variables to k < n new ones.
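To make the point about shapes concrete, here is a minimal NumPy sketch. The data is random and the sizes (m = 200 rows, n = 5 variables, k = 2 kept components) are made up for illustration. It computes PCA via an SVD of the centered data, which is one common way to do it, not the only one.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 200, 5  # m observations (rows), n variables (columns); illustrative sizes
X = rng.normal(size=(m, n))
# Make two variables strongly correlated so a few components dominate.
X[:, 1] = 2.0 * X[:, 0] + 0.1 * rng.normal(size=m)

# Center each variable, then take the SVD. The rows of Vt are the principal
# axes, already ordered from most to least variance captured.
Xc = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)

# Using all n components only rotates the data: the shape stays (m, n).
scores_all = Xc @ Vt.T

# Keeping the first k components reduces n to k; m is untouched.
k = 2
scores_k = Xc @ Vt[:k].T

print(scores_all.shape, scores_k.shape)
```

Either way the number of rows m stays the same; only the column count changes, and only if you choose to drop components.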
Yes, or a table of explained variance (each eigenvalue divided by the sum of all eigenvalues). Selecting the number of principal components can be done subjectively (a judgment call) or possibly through cross-validation.
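A rough sketch of such an explained variance table, again in plain NumPy on made-up data. The helper names `explained_variance_ratios` and `choose_k` and the 90% threshold are assumptions for illustration, and the threshold rule is just the judgment-call approach, not cross-validation.

```python
import numpy as np

def explained_variance_ratios(X):
    """Eigenvalues of the covariance matrix, largest first, as shares of the total."""
    Xc = X - X.mean(axis=0)
    cov = np.cov(Xc, rowvar=False)           # (n, n) covariance of the variables
    eigvals = np.linalg.eigvalsh(cov)[::-1]  # eigvalsh returns ascending order
    eigvals = np.clip(eigvals, 0.0, None)    # guard against tiny negative round-off
    return eigvals / eigvals.sum()

def choose_k(ratios, threshold):
    """Smallest number of components whose cumulative share reaches the threshold."""
    cumulative = np.cumsum(ratios)
    k = int(np.searchsorted(cumulative, threshold)) + 1
    return min(k, len(ratios))

rng = np.random.default_rng(1)
# Four independent variables with very different scales (illustrative only).
X = rng.normal(size=(300, 4)) * np.array([5.0, 2.0, 1.0, 0.5])

ratios = explained_variance_ratios(X)
for i, (r, c) in enumerate(zip(ratios, np.cumsum(ratios)), start=1):
    print(f"PC{i}: {r:.3f}  cumulative {c:.3f}")

k = choose_k(ratios, 0.90)  # 0.90 is an arbitrary cutoff
print("components kept:", k)
```

The cutoff is the subjective part: a different threshold gives a different k from the same table.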
Dimensionality = shape
Ahhh that clears up so much confusion lol. I've seen the two terms but thought they were different.