This Analytic provides a semantic matching score for words in the dataset, using either Latent Semantic Analysis (LSA) or Term Frequency-Inverse Document Frequency (TFIDF). Results are pruned by a given percentile of matches: a percentile of 0.99 returns only the top 1% of semantically useful terms across the dataset. In practice, this surfaces the terms most closely related to your search. For example, if you searched for "Romney", "Mitt" should appear high on the list, possibly alongside more surprising terms, which you can use to get a sense of the topics these tweets cover.
| Name | User Modifiable? | Position | Kind | Description | Possible Values | Options |
|---|---|---|---|---|---|---|
| analysis_type | Yes | 1 | enum | Choose either Latent Semantic Analysis (lsa) or Term Frequency-Inverse Document Frequency (tfidf). The two methods give broadly similar results, but each has its own nuances. To read more, see the <a href="http://en.wikipedia.org/wiki/Latent_semantic_analysis" target="_blank">LSA</a> and <a href="http://en.wikipedia.org/wiki/Tf*idf" target="_blank">TFIDF</a> pages on Wikipedia. | 'tfidf', 'lsa' | Remove |
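
To make the `tfidf` option and the percentile pruning more concrete, here is a minimal sketch in plain Python. It is not the Analytic's actual implementation: the function names `tfidf_scores` and `prune_by_percentile`, the whitespace tokenization, the choice to keep each term's highest score across documents, and the nearest-rank cutoff are all assumptions made for illustration.

```python
import math
from collections import Counter


def tfidf_scores(documents):
    """Return, for each term, its highest TF-IDF score in any document.

    Hypothetical helper: tokenization and weighting are assumptions,
    not the Analytic's documented behavior.
    """
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: how many documents contain each term.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    best = {}
    for tokens in tokenized:
        counts = Counter(tokens)
        for term, count in counts.items():
            tf = count / len(tokens)                # term frequency
            idf = math.log(n_docs / doc_freq[term])  # inverse document frequency
            best[term] = max(best.get(term, 0.0), tf * idf)
    return best


def prune_by_percentile(scores, percentile):
    """Keep terms scoring at or above the given percentile (0.0 to 1.0).

    Hypothetical helper: uses a simple nearest-rank cutoff, so ties at
    the cutoff score are all kept.
    """
    if not scores:
        return {}
    ranked = sorted(scores.values())
    cutoff_index = min(int(percentile * len(ranked)), len(ranked) - 1)
    cutoff = ranked[cutoff_index]
    return {term: s for term, s in scores.items() if s >= cutoff}


tweets = [
    "mitt romney campaign rally",
    "mitt romney debate",
    "weather report today",
]
scores = tfidf_scores(tweets)
top_terms = prune_by_percentile(scores, 0.5)
print(sorted(top_terms))
```

In this toy dataset, `mitt` and `romney` appear in two of the three documents, so their inverse document frequency (and therefore their score) is lower than that of terms unique to one document. A higher percentile, such as 0.99, keeps a correspondingly smaller slice of the ranked terms.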