Evaluating the Performance of a Fake News Model on A Domain-Specific and Heterogeneous Dataset to Improve Detection
DOI:
https://doi.org/10.51173/jt.v7i2.2640
Keywords:
Domain, Dataset, Detection, Fake-News
Abstract
The rapid evolution of technology in the digital age has increased the spread of fake news, severely undermining the reliability of information. This study aims to improve fake news detection in distinct domains through in-depth dataset analysis using a Convolutional Neural Network (CNN). Models were trained with an optimized CNN on publicly available datasets. The findings show that models trained on domain-specific datasets can accurately identify the nuances of fake news unique to those domains. These models outperformed models trained on broader datasets in accuracy, precision, recall, and F1-score; compared with a baseline CNN model, every metric rose from 68% to 99%. However, while domain-specific models perform exceptionally well in their respective contexts, models trained on a diverse range of datasets generalize better across domains. These findings suggest that dynamic and robust fake news detection systems should integrate both heterogeneous datasets and domain-specific features to enhance effectiveness.
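The four metrics named in the abstract can be illustrated with a minimal sketch. It assumes a binary task in which label 1 means "fake" and label 0 means "real"; the function name `binary_metrics` and the toy labels are hypothetical and are not taken from the study, which does not publish its evaluation code here.

```python
from dataclasses import dataclass


@dataclass
class Metrics:
    accuracy: float
    precision: float
    recall: float
    f1: float


def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, and F1 for binary labels.

    Assumption: `positive` marks the "fake" class. Undefined ratios
    (zero denominator) are reported as 0.0.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return Metrics(accuracy, precision, recall, f1)


# Toy labels for illustration only (not data from the study).
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
result = binary_metrics(y_true, y_pred)
print(result)
```

Reporting all four metrics together matters when the fake and real classes are imbalanced, since accuracy alone can look high while recall on the fake class stays low.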
License
Copyright (c) 2025 Georgina N. Obuandike, Emmy Danny Ajik, Faith Oluwatosin Echobu

This work is licensed under a Creative Commons Attribution 4.0 International License.










