dc.description | This package contains the datasets and source code used in the PhD thesis <a href="https://hdl.handle.net/20.500.12733/5361" target="_blank">Predicting the Brazilian stock market using sentiment analysis, technical indicators and stock prices</a>.
<br>
The following files are included:
<ul>
<li>File <em>Labeled.zip</em> - financial news items, each labeled with one of two classes (<em>Positive</em> or <em>Negative</em>) and organized for training Sentiment Analysis models. Some of these news items were first presented in [1].
In the related PhD thesis, the training dataset combined the news in this file with the labeled news presented in [2].
</li>
<li>File <em>Unlabeled.zip</em> - general unlabeled financial news collected from 2010 to 2020 from the following online sources: G1, Folha de São Paulo and Estadão. This file contains news about the Bovespa index and about the following companies: Banco do Brasil, Itau, Gerdau and Ambev.
</li>
<li>File <em>Stocks.zip</em> - stock prices for the companies Banco do Brasil, Itau, Gerdau and Ambev, and for the Bovespa index. The period covered ranges from 2010 to 2020.
</li>
<li>
File <em>Models.zip</em> - contains the source code of the models used in the PhD thesis (i.e., Multilayer Perceptron, Long Short-Term Memory, Bidirectional Long Short-Term Memory, Convolutional Neural Network, and Support Vector Machines).
</li>
<li>
File <em>Utils.zip</em> - contains the source code of the preprocessing step designed for the methodology of this work (i.e., loading the data and generating the word embeddings), along with the code for stock data manipulation and investment evaluation.
</li>
</ul>
[1] Carosia, A. E. D. O., Januário, B. A., da Silva, A. E. A., & Coelho, G. P. (2021). <strong>Sentiment Analysis Applied to News from the Brazilian Stock Market</strong>. IEEE Latin America Transactions, 100. DOI: <a href="https://doi.org/10.1109/TLA.2022.9667151" target="_blank">10.1109/TLA.2022.9667151</a>
<br>
[2] Martins, R. F., Pereira, A., & Benevenuto, F. (2015). <strong>An approach to sentiment analysis of web applications in Portuguese</strong>. Proceedings of the 21st Brazilian Symposium on Multimedia and the Web, ACM, pp. 105–112. DOI: <a href="https://doi.org/10.1145/2820426.2820446" target="_blank">10.1145/2820426.2820446</a> | |