PCA in RapidMiner
I would like to determine the themes in a corpus of tweets using PCA. I built a process with the following operators: Read Excel, Nominal to Numerical, and PCA, and connected their ports. The process runs without errors, but I am not sure how to identify the hidden themes from the PCA output, which lists the standard deviation, proportion of variance, and cumulative variance of each component. The proportion of variance ranges from 0 to .001. I set the variance threshold to .95.
Can you please help me? Thank you.
| Component | Std dev | Proportion of variance | Cumulative variance |
| --- | --- | --- | --- |
| PC 1 | 0.157 | 0.025 | 0.025 |
| PC 2 | 0.137 | 0.019 | 0.045 |
| PC 3 | 0.123 | 0.016 | 0.06 |
| PC 4 | 0.118 | 0.014 | 0.075 |
| PC 5 | 0.115 | 0.014 | 0.089 |
| PC 6 | 0.112 | 0.013 | 0.101 |
| PC 7 | 0.104 | 0.011 | 0.113 |
| PC 8 | 0.1 | 0.01 | 0.123 |
| PC 9 | 0.098 | 0.01 | 0.133 |
| PC 10 | 0.097 | 0.01 | 0.143 |
| PC 11 | 0.097 | 0.01 | 0.153 |
| PC 12 | 0.093 | 0.009 | 0.161 |
| PC 13 | 0.093 | 0.009 | 0.17 |
| PC 14 | 0.092 | 0.009 | 0.179 |
| PC 15 | 0.09 | 0.008 | 0.187 |
| PC 16 | 0.089 | 0.008 | 0.196 |
| PC 17 | 0.087 | 0.008 | 0.204 |
| PC 18 | 0.087 | 0.008 | 0.211 |
| PC 19 | 0.086 | 0.008 | 0.219 |
| PC 20 | 0.084 | 0.007 | 0.226 |
| PC 21 | 0.083 | 0.007 | 0.234 |
| PC 22 | 0.082 | 0.007 | 0.241 |
| PC 23 | 0.082 | 0.007 | 0.248 |
| PC 24 | 0.081 | 0.007 | 0.254 |
| PC 25 | 0.08 | 0.007 | 0.261 |
| PC 26 | 0.08 | 0.007 | 0.268 |
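To make the table easier to read, here is a minimal Python sketch that recomputes the cumulative variance from the proportions listed above and counts how many components are needed to reach a given threshold. It assumes the threshold acts as a cumulative-variance cutoff; `components_for_threshold` is a hypothetical helper written for this illustration, not a RapidMiner function.

```python
from itertools import accumulate

# Proportion of variance for PC 1 to PC 26, copied from the table above.
# The values are rounded to three decimals, so recomputed cumulative sums
# can differ from the table's cumulative column in the last digit.
proportions = [
    0.025, 0.019, 0.016, 0.014, 0.014, 0.013, 0.011, 0.010, 0.010, 0.010,
    0.010, 0.009, 0.009, 0.009, 0.008, 0.008, 0.008, 0.008, 0.008, 0.007,
    0.007, 0.007, 0.007, 0.007, 0.007, 0.007,
]


def components_for_threshold(proportions, threshold):
    """Return how many leading components reach the cumulative threshold.

    Returns None when the listed components never reach it.
    """
    for count, cumulative in enumerate(accumulate(proportions), start=1):
        if cumulative >= threshold:
            return count
    return None


print("Components needed for 0.10:", components_for_threshold(proportions, 0.10))
print("Components needed for 0.95:", components_for_threshold(proportions, 0.95))
print("Variance covered by the listed components:", round(sum(proportions), 3))
```

Because no single component explains much variance here, the listed components together stay far below a 0.95 threshold, so a threshold that high would keep many more components than the 26 shown.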
Best Answer
Thomas_Ott
Hi jem810,
Have you checked out this post by our partner Simafore? http://www.simafore.com/blog/bid/62911/How-to-run-Principal-Component-Analysis-with-RapidMiner-Part-2
It goes into depth on how to use PCA with RapidMiner and how to interpret the eigenvectors.
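As a rough illustration of what interpreting the eigenvectors means for text data, the sketch below computes the first principal component of a tiny made-up term-document matrix and lists the terms at the positive and negative ends of that component. Everything in it is an assumption for illustration: the terms, the six toy documents, and the helper names (`covariance_matrix`, `first_eigenvector`, `split_by_sign`) are hypothetical, and it uses plain-Python power iteration instead of RapidMiner's PCA operator so that it runs without extra libraries.

```python
import math

# Hypothetical vocabulary and toy documents (1 = term occurs in the tweet).
terms = ["traffic", "road", "jam", "food", "pizza", "lunch"]
docs = [
    [1, 1, 1, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 0],
    [0, 0, 0, 1, 0, 1],
]


def covariance_matrix(rows):
    """Sample covariance between columns (terms) after mean-centering."""
    n, m = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(m)]
    centered = [[r[j] - means[j] for j in range(m)] for r in rows]
    return [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(m)]
            for i in range(m)]


def first_eigenvector(matrix, iterations=200):
    """Approximate the dominant eigenvector with power iteration."""
    m = len(matrix)
    v = [1.0] + [0.0] * (m - 1)  # fixed start vector keeps the run deterministic
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(m)) for i in range(m)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


def split_by_sign(loadings, names):
    """Group terms by the sign of their loading, strongest first."""
    ranked = sorted(zip(loadings, names), reverse=True)
    positive = [name for weight, name in ranked if weight > 0]
    negative = [name for weight, name in reversed(ranked) if weight < 0]
    return positive, negative


loadings = first_eigenvector(covariance_matrix(docs))
positive, negative = split_by_sign(loadings, terms)
print("PC 1 positive end:", positive)
print("PC 1 negative end:", negative)
```

On this toy matrix the two ends of the first component separate the traffic-related words from the food-related ones, which is the kind of contrast you would read as a theme. With real tweets, the same reading applies to the eigenvector table: look at the attributes with the largest positive and negative weights in each component.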
Answers
Is it possible to display the results graphically?
@sebastian_gonza Yes, the results can be graphed.