Identification of a suitable NLP model for the detection of symptoms mentioned in textual conversations of Covid-19 infected persons

Authors

  • Acosta-Guzmán, Ivan Leonel
  • Varela-Tapia, Eleanor Alexandra
  • Sangacha Tapia, Lady Mariuxi
  • Solórzano Monserrate, Mirian Estefanía
  • Acosta-Varela, Christopher Ivan

DOI:

https://doi.org/10.18687/LACCEI2023.1.1.261

Keywords:

Covid-19, NLP for Classification, Random Forest, Dense Neural Network, LSTM, Metrics

Abstract

In March 2020, the World Health Organization (WHO) declared Covid-19 a global pandemic. The resulting need for reliable information led to the emergence of several Virtual Health Assistants that taught the public how to prevent or cope with infection by the Covid-19 Alpha variant. The Beta, Delta, and Omicron variants then emerged progressively, each with different symptomatology, triggering new waves of infections and deaths worldwide. For this reason, the present study promoted the creation of an NLP (Natural Language Processing) model to analyze the experiences of infected people in Guayaquil and to detect the predominant symptoms mentioned in their textual conversations. A quantitative methodology was applied through surveys administered via a Google Colab form, reaching 2873 people in the city of Guayaquil who had contracted and overcome Covid-19. A qualitative methodology was also applied through interviews with NLP experts, which corroborated the classifiers suggested for this type of software. The corpus of textual conversations was thus generated, and 5 different types of NLP models were created in Python, based on the classifier algorithms Random Forest, Dense Neural Network (DNN), and Long Short-Term Memory (LSTM). The results were evaluated using metrics such as Accuracy, Precision, Recall, and AUC, and the DNN-based model obtained the highest metrics.
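The four evaluation metrics named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' evaluation code: the labels and scores below are invented toy values, the function names (`confusion_counts`, `classification_metrics`, `roc_auc`) are hypothetical, and the 0.5 decision threshold is an assumption. The sketch treats a single symptom as a binary label (1 = mentioned, 0 = not mentioned) and computes AUC as the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one.

```python
from typing import Dict, Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, tn, fn) for binary labels encoded as 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Tied scores count as half a correct pair.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def classification_metrics(
    y_true: Sequence[int], scores: Sequence[float], threshold: float = 0.5
) -> Dict[str, float]:
    """Compute Accuracy, Precision, Recall, and AUC from predicted scores."""
    # Assumed decision rule: predict "symptom mentioned" when score >= threshold.
    y_pred = [1 if s >= threshold else 0 for s in scores]
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
        "auc": roc_auc(y_true, scores),
    }


# Toy data, invented for illustration only (not taken from the study's corpus).
toy_labels = [1, 1, 1, 0, 0, 0, 1, 0]
toy_scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.7, 0.2]

metrics = classification_metrics(toy_labels, toy_scores)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Accuracy, Precision, and Recall depend on the chosen threshold, whereas AUC is computed from the raw scores. This is one reason the four metrics are typically reported together when comparing classifiers such as Random Forest, DNN, and LSTM.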

Published

2024-04-16

Section

Articles