Comparative Analysis of VADER and TextBlob on Financial News Headlines

Dahal, Keshab Raj; Gupta, Ankrit; Budhathoki, Nirajan

doi:10.6339/25-JDS1195

Journal of Data Science


Pub. online: 10 July 2025
Type: Data Science In Action

Open Access

Received: 14 September 2024
Accepted: 27 June 2025
Published: 10 July 2025

Abstract

Financial news headlines are a rich source of information on financial activity, and the text they contain can offer insight into human behavior. Sentiment analysis is one key way to analyze this text. Despite extensive research, sentiment analysis still faces challenges, particularly in handling the internet slang, abbreviations, and emoticons commonly found on websites that publish financial news headlines, including Bloomberg, Yahoo Finance, and the Financial Times. This paper compares the performance of two sentiment analyzers, VADER and TextBlob, on financial news headlines from two countries: the USA (an economically developed nation) and Nepal (an economically underdeveloped nation). The collected headlines were manually classified from a financial perspective into three categories: positive, negative, and neutral. The headlines were then cleaned and processed through both sentiment analyzers. Performance was evaluated in terms of accuracy, sensitivity, specificity, and neutral specificity. Experimental results show that VADER performs better than TextBlob on both datasets, and that both models perform better on headlines from the USA than on those from Nepal. These findings are further validated through statistical tests.
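
The evaluation described in the abstract can be illustrated with a minimal sketch. It is not the authors' code (that is in the supplementary file), and it rests on several assumptions: analyzer scores are taken to lie in [-1, 1]; the 0.05 cutoff follows VADER's documented convention for its compound score and may differ from the cutoffs used in the paper; sensitivity and specificity are computed one-vs-rest for each class, so "neutral specificity" is read here as the specificity of the neutral class; and the scores and labels below are hypothetical placeholders, not data from the study.

```python
from collections import Counter

LABELS = ("positive", "negative", "neutral")


def label_from_score(score, threshold=0.05):
    """Map a polarity score in [-1, 1] to a three-way label.

    The 0.05 cutoff is an assumption borrowed from VADER's documented
    convention for its compound score; the paper may use other cutoffs.
    """
    if score >= threshold:
        return "positive"
    if score <= -threshold:
        return "negative"
    return "neutral"


def one_vs_rest_metrics(y_true, y_pred, labels=LABELS):
    """Return overall accuracy plus per-class sensitivity and specificity."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal length")
    pairs = Counter(zip(y_true, y_pred))  # counts of (true, predicted) pairs
    total = len(y_true)
    report = {"accuracy": sum(pairs[(c, c)] for c in labels) / total}
    for c in labels:
        tp = pairs[(c, c)]
        fn = sum(pairs[(c, p)] for p in labels if p != c)
        fp = sum(pairs[(t, c)] for t in labels if t != c)
        tn = total - tp - fn - fp
        report[c] = {
            "sensitivity": tp / (tp + fn) if tp + fn else float("nan"),
            "specificity": tn / (tn + fp) if tn + fp else float("nan"),
        }
    return report


# Hypothetical manual labels and analyzer scores (not data from the study).
manual = ["positive", "negative", "neutral", "positive", "negative", "neutral"]
scores = [0.62, -0.40, 0.00, 0.02, -0.51, 0.30]

predicted = [label_from_score(s) for s in scores]
report = one_vs_rest_metrics(manual, predicted)
print(report)
```

In practice the scores would come from the analyzers themselves, for example the `compound` value returned by VADER's `polarity_scores` or the `polarity` attribute of a TextBlob sentiment result, and the same function would be run once per analyzer and per dataset so the resulting metrics can be compared.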

Supplementary material


The Python code and datasets used in the study are available in a supplementary file.


© 2025 The Author(s). Published by the School of Statistics and the Center for Applied Statistics, Renmin University of China.

Open access article under the CC BY license.

Keywords

finance news sentiment analysis text mining

Funding

This research received no external funding.
