A diagnostic system for detecting COVID-19 patients depending on lexicon semantic and Biterm Topic Model-Based Feature Selection on WhatsApp Messages Classification
COVID-19 has created an urgent need for innovative detection methods. This study presents a novel approach to identifying potential COVID-19 patients by analyzing their WhatsApp messages with natural language processing techniques. Our methodology combines Word2Vec embeddings with lexical-semantic enrichment using ConceptNet, so that the system can detect subtle linguistic patterns associated with COVID-19 symptoms and experiences. The system processes WhatsApp messages in six stages: data collection, Word2Vec embedding, lexicon-semantic enhancement, vector-space model creation, Biterm Topic Model-based feature selection, and Naive Bayes classification. By enriching the language model with synonyms and capturing complex semantic relationships, our approach can identify potential COVID-19 cases from the way people describe their symptoms and experiences in everyday conversations. We tested the system on a sample of diverse WhatsApp messages, achieving promising results in distinguishing between messages from COVID-19 patients and those from healthy individuals. The system successfully identified both explicit statements of COVID-19 status and more subtle descriptions of symptoms, while correctly classifying non-COVID-related messages with high confidence. Although this method shows potential as a non-invasive and scalable screening tool, it should be viewed as complementary to existing diagnostic approaches rather than as a replacement. Further large-scale testing is needed to fully validate the system's reliability and effectiveness in real-world applications.
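To make the pipeline described in the abstract more concrete, the following is a minimal sketch of three of its stages: synonym enrichment, biterm extraction, and Naive Bayes classification. It is not the authors' implementation. The `SYNONYMS` dictionary is a hypothetical hand-written stand-in for ConceptNet lookups, the `TRAIN` messages are invented toy data, and the names `tokenize`, `enrich`, `biterms`, and `NaiveBayes` are assumptions introduced for illustration. Word2Vec embedding and Biterm Topic Model inference are omitted; only the extraction of biterms (unordered pairs of distinct words that co-occur in one short message) is shown.

```python
import math
from collections import Counter, defaultdict
from itertools import combinations

# Hypothetical stand-in for ConceptNet synonym lookups (not real ConceptNet data).
SYNONYMS = {"fever": ["temperature"], "cough": ["coughing"], "tired": ["fatigue"]}

# Invented toy messages; the study's real data set is not reproduced here.
TRAIN = [
    ("I have a fever and a dry cough", "covid"),
    ("lost my sense of smell and feel tired", "covid"),
    ("see you at the match tonight", "healthy"),
    ("lunch was great thanks", "healthy"),
]

def tokenize(message):
    """Lowercase, split on whitespace, and strip basic punctuation."""
    words = (w.strip(".,!?").lower() for w in message.split())
    return [w for w in words if w]

def enrich(tokens):
    """Append the listed synonyms of every token found in the lexicon."""
    out = list(tokens)
    for t in tokens:
        out.extend(SYNONYMS.get(t, []))
    return out

def biterms(tokens):
    """Return all unordered pairs of distinct words in one short message."""
    return list(combinations(sorted(set(tokens)), 2))

class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for tokens, label in zip(docs, labels):
            self.word_counts[label].update(tokens)
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, tokens):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)  # log prior
            for t in tokens:
                if t in self.vocab:  # words never seen in training are skipped
                    score += math.log((counts[t] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label

docs = [enrich(tokenize(text)) for text, _ in TRAIN]
labels = [label for _, label in TRAIN]
model = NaiveBayes().fit(docs, labels)

if __name__ == "__main__":
    query = enrich(tokenize("High temperature and coughing since Monday"))
    print(model.predict(query))
    print(biterms(tokenize("fever cough fever")))
```

In this sketch the enrichment step is what lets a message that mentions "temperature" and "coughing" match training messages that only said "fever" and "cough". In the system the abstract describes, the biterms would feed a Biterm Topic Model used for feature selection before classification, a step this example does not attempt.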
Hatem, R. Majeed and Eliwe, N. H. (2025). A diagnostic system for detecting COVID-19 patients depending on lexicon semantic and Biterm Topic Model-Based Feature Selection on WhatsApp Messages Classification. Computer and Knowledge Engineering, 8(1), 53-64. doi: 10.22067/cke.2025.91490.1145