Abstract

In this work, we analyze information taken from a social networking platform (SNP) for stock investment and trading. The platform functions as a forum that provides tools and advice both to active stock traders and to those who want to venture into trading. The goal of this report is to analyze the performance of similarly reviewed and recommended stocks, identified by clustering posts from the Top 20 authors on the platform.

A total of 14,953 posts from January to June 2019 were extracted from the SNP and converted into a bag-of-words representation matrix using a term frequency-inverse document frequency (TF-IDF) vectorizer. Dimensions were further reduced using Truncated SVD. The resulting matrix was then clustered using Non-Negative Matrix Factorization into four clusters, each with a distinct theme. Stocks belonging to Cluster 1 (positively worded analysis) performed best, recording a yield* of 12.0% and returns of 8.5%.

Cluster 2 and Cluster 3 (neutral reviews and technical analysis) performed poorly in terms of yield, with aggregate losses of 5.1% and 2.4%, respectively. Stocks in Cluster 2 recorded returns of 63.2%, while stocks in Cluster 3 recorded returns of 19.3%. Interestingly, stocks in Cluster 4 still recorded a yield of 1.53% and returns of 14.54%, despite the negatively worded analysis in that cluster.
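
As an illustration of the vectorization and clustering steps, the following is a minimal sketch in plain Python, not the code used in this report. It builds a TF-IDF matrix and factorizes it with the standard multiplicative-update form of Non-Negative Matrix Factorization, then assigns each post to its dominant component. The four toy posts, the helper names (`tfidf_matrix`, `nmf`), the smoothed IDF formula, and the choice of two components are all assumptions made for the example. The Truncated SVD step is left out here because its output can contain negative values, which this factorization does not accept; how the two steps were combined in the actual analysis is not shown by this sketch.

```python
import math
import random
from collections import Counter


def tfidf_matrix(docs):
    """Return (rows, vocab): smoothed-IDF TF-IDF with L2-normalised rows."""
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    n_docs = len(docs)
    # Document frequency: number of posts in which each term appears.
    doc_freq = Counter(tok for toks in tokenized for tok in set(toks))
    idf = {t: math.log((1 + n_docs) / (1 + doc_freq[t])) + 1.0 for t in vocab}
    rows = []
    for toks in tokenized:
        counts = Counter(toks)
        row = [counts[t] * idf[t] for t in vocab]
        norm = math.sqrt(sum(v * v for v in row)) or 1.0
        rows.append([v / norm for v in row])
    return rows, vocab


def matmul(a, b):
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def nmf(v, k, n_iter=200, seed=0, eps=1e-9):
    """Factor V ~ W H (all entries non-negative) via multiplicative updates."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + eps for _ in range(k)] for _ in range(n)]
    h = [[rng.random() + eps for _ in range(m)] for _ in range(k)]
    for _ in range(n_iter):
        # Update H: H <- H * (W^T V) / (W^T W H)
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(k)]
        # Update W: W <- W * (V H^T) / (W H H^T)
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(k)]
             for i in range(n)]
    return w, h


# Hypothetical toy posts standing in for the extracted SNP posts.
posts = [
    "strong earnings growth buy",
    "buy strong upside growth",
    "support resistance breakout chart",
    "chart breakout volume support",
]

n_clusters = 2  # the report uses four; two keeps the toy example small
tfidf, vocab = tfidf_matrix(posts)
w, h = nmf(tfidf, n_clusters)

# Each post is assigned to the component with the largest weight in W.
labels = [max(range(n_clusters), key=lambda c: row[c]) for row in w]

# The highest-weighted terms in each row of H hint at the cluster's theme.
for c in range(n_clusters):
    top = sorted(range(len(vocab)), key=lambda j: h[c][j], reverse=True)[:3]
    print(c, [vocab[j] for j in top])
print(labels)
```

On a corpus of the size described here, a library implementation such as scikit-learn's `TfidfVectorizer` and `NMF` would be the more practical choice; the hand-written version above is only meant to make the two steps explicit.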