Comparative Analysis of VADER and TextBlob on Financial News Headlines
Pub. online: 10 July 2025
Type: Data Science In Action
Open Access
Received
14 September 2024
Accepted
27 June 2025
Published
10 July 2025
Abstract
Financial news headlines are a rich source of information on financial activity, and their text can offer insight into human behavior. Sentiment analysis is one key way to analyze this text. Despite extensive research, sentiment analysis still struggles with the internet slang, abbreviations, and emoticons common on websites that publish financial news headlines, including Bloomberg, Yahoo Finance, and Financial Times. This paper compares the performance of two sentiment analyzers, VADER and TextBlob, on financial news headlines from two countries: the USA (an economically developed nation) and Nepal (an economically underdeveloped nation). The collected headlines were manually classified from a financial perspective into three categories: positive, negative, and neutral. They were then cleaned and passed through both sentiment analyzers. Performance was evaluated in terms of accuracy, sensitivity, specificity, and neutral specificity. Experimental results show that VADER outperforms TextBlob on both datasets, and that both models perform better on headlines from the USA than on those from Nepal. These findings are further validated with statistical tests.
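The evaluation step described above can be illustrated with a minimal sketch. It is not the authors' code (their code is in the supplementary file), and it rests on several assumptions: analyzer scores lie in [-1, 1]; a cutoff of ±0.05, the convention suggested in the VADER documentation for its compound score, is used to derive three labels (applying the same cutoff to TextBlob polarity is an assumption made here); and the labels and scores shown are hypothetical. Because the abstract does not define sensitivity, specificity, or neutral specificity, the sketch reports only accuracy and per-class recall as stand-ins.

```python
from collections import Counter

LABELS = ("positive", "negative", "neutral")


def label_from_score(score, threshold=0.05):
    """Map a polarity score in [-1, 1] to one of three labels.

    The 0.05 cutoff is an assumption borrowed from the VADER
    documentation's convention for its compound score.
    """
    if score >= threshold:
        return "positive"
    if score <= -threshold:
        return "negative"
    return "neutral"


def accuracy(y_true, y_pred):
    """Fraction of headlines whose predicted label matches the manual label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def per_class_recall(y_true, y_pred):
    """Recall for each label that occurs at least once in y_true."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: hits[label] / totals[label] for label in LABELS if totals[label]}


# Hypothetical manual labels and analyzer scores, for illustration only.
manual = ["positive", "negative", "neutral", "negative"]
scores = [0.42, -0.30, 0.00, 0.10]

predicted = [label_from_score(s) for s in scores]
print(accuracy(manual, predicted))
print(per_class_recall(manual, predicted))
```

In practice the scores would come from the analyzers themselves, for example the `compound` value returned by VADER's `SentimentIntensityAnalyzer().polarity_scores(text)` or TextBlob's `TextBlob(text).sentiment.polarity`; hard-coded values are used here so the sketch runs without either library installed.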
Supplementary material
The Python code and datasets used in the study are available in a supplementary file.