Article type: Computer article
Authors
Faculty of Computer Engineering, Shahrood University of Technology, Shahrood, Iran.
Abstract
Keywords
Subjects
Article title [English]
Authors [English]
After the coronavirus disease that emerged in 2019 (COVID-19) was recognized as a pandemic, most people were forced to stay at home. Because social networks were a popular medium during the pandemic, analyzing user-generated content can provide new insights and help track the course of the pandemic over time. This study aimed to provide a model for predicting the incidence rate of COVID-19 during the first wave of the pandemic in Iran through the analysis of Persian Instagram posts. Using a synergetic technique, three features were extracted from Instagram posts: semantic similarity, fear, and hope. Word embedding techniques (Word2Vec, GloVe, FastText) were used to calculate semantic similarity, and a BERT-based classifier was used to identify fear and hope. To improve performance, the SBERT model was also used in place of the classical embedding methods. A support vector regression (SVR) model was then trained on statistical indices derived from these features to predict the daily incidence rate of COVID-19. The results showed that combining the semantic similarity and fear features, computed with SBERT, in the SVR model gave the highest performance, with a coefficient of determination (R²) of 0.52, a significant improvement over the baseline methods. These findings indicate that the automatic combination of semantic and sentiment features can be an effective indicator for monitoring epidemics through social networks.
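The two quantities the abstract relies on, semantic similarity between embeddings and the coefficient of determination, can be sketched in plain Python. This is a minimal illustration, not the authors' implementation: it assumes cosine similarity as the similarity measure, and the function names and the choice of daily indices (mean, standard deviation, maximum) are hypothetical, since the abstract does not specify them. The embedding models and the SVR training step are not shown.

```python
import math
from statistics import mean, pstdev


def cosine_similarity(u, v):
    """Cosine similarity between two embedding vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_u * norm_v)


def daily_indices(scores):
    """Summarize one day's per-post scores (assumed indices, not from the paper)."""
    return {"mean": mean(scores), "std": pstdev(scores), "max": max(scores)}


def r_squared(y_true, y_pred):
    """Coefficient of determination: R^2 = 1 - SS_res / SS_tot."""
    y_bar = mean(y_true)
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    ss_tot = sum((y - y_bar) ** 2 for y in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when y_true is constant")
    return 1 - ss_res / ss_tot
```

Under these assumptions, each day's posts would be reduced to a small vector of indices per feature, and a regressor's predictions of daily incidence would be scored with `r_squared`. A model that always predicts the mean of the observed values scores 0, and a perfect model scores 1.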
Keywords [English]