Improving feature extraction using an ensemble deep learning model for entity detection

Article type: Computer article

Authors

1 Department of Computer Engineering, Faculty of Engineering, Islamic Azad University, Arak Branch

2 Department of Computer Engineering, Arak Branch, Islamic Azad University, Arak, Iran

Abstract

One of the first steps in most natural language processing tasks is extracting named entities from a sentence. Various machine learning techniques have been proposed for this task that achieve higher accuracy without the complexity of manual feature extraction. In this study, we therefore capture the features of the input sentence by combining two deep learning models: a convolutional neural network (CNN) and a long short-term memory (LSTM) network. By extracting local word features with the convolutional network alongside global features, we obtain more information from the sentence for more accurate classification of entities. We evaluate the proposed architecture on two datasets, CoNLL2003 and ACE05, and show that adding a word-level convolutional network extracts useful local information from the words in the sentence, which increases the accuracy of the system. Finally, we compare the performance of the system with competing approaches and report that this architecture outperforms them.

Article title [English]

An ensemble deep learning model to enhance feature representation for entity detection

Authors [English]

  • Elham Parsaeimehr 1
  • Mehdi Fartash 2
  • Javad Akbari Torkestani 2
1 Department of Computer Engineering, Arak Branch, Islamic Azad University, Arak, Iran
2 Department of Computer Engineering, Arak Branch, Islamic Azad University, Arak, Iran
Abstract [English]

Named entity recognition (NER) is one of the main steps in most natural language processing (NLP) pipelines. Machine learning techniques proposed for this task have traditionally relied on manually designed features. In recent years, deep neural network-based models have been proposed that achieve higher accuracy without relying on costly feature engineering. In this article, we therefore employ a combination of two deep learning models to capture the properties of the input sentence: long short-term memory (LSTM) and a convolutional neural network (CNN). In this architecture, extracting local features along with global features yields more information for more accurate classification. We evaluate the architecture on two datasets, CoNLL2003 and ACE05, and demonstrate that adding a word-level CNN extracts useful local properties that improve accuracy. Finally, we compare the performance of our system with competing systems and report that it outperforms them.
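The combination of local and global features described in the abstract can be illustrated with a minimal NumPy sketch. It is only one possible reading, not the authors' implementation: the abstract does not say how the CNN and LSTM outputs are merged, so concatenation followed by a linear tag layer is an assumption here, as are all function names, the window size of 3, the layer sizes, the unidirectional LSTM, and the random (untrained) weights.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def conv_local_features(emb, kernel, bias):
    """Word-level 1D convolution with zero padding ("same" length).

    emb: (T, d) word embeddings; kernel: (k, d, c) with odd k.
    Returns (T, c) local features, one vector per word.
    """
    k = kernel.shape[0]
    pad = k // 2
    padded = np.pad(emb, ((pad, pad), (0, 0)))  # zero-pad along the word axis
    out = np.stack([
        np.tensordot(padded[t:t + k], kernel, axes=([0, 1], [0, 1]))
        for t in range(emb.shape[0])
    ])
    return np.maximum(out + bias, 0.0)  # ReLU

def lstm_global_features(emb, W, U, b):
    """Unidirectional LSTM over the sentence (an assumption of this sketch).

    emb: (T, d); W: (d, 4h); U: (h, 4h); b: (4h,). Returns (T, h).
    """
    h_dim = U.shape[0]
    h = np.zeros(h_dim)
    c = np.zeros(h_dim)
    states = []
    for x in emb:
        z = x @ W + h @ U + b
        i, f, o, g = np.split(z, 4)  # input, forget, output gates, candidate
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
        h = sigmoid(o) * np.tanh(c)
        states.append(h)
    return np.stack(states)

def tag_scores(emb, params):
    """Concatenate local and global features, then score each tag per word."""
    local = conv_local_features(emb, params["kernel"], params["conv_bias"])
    global_ = lstm_global_features(emb, params["W"], params["U"], params["b"])
    features = np.concatenate([local, global_], axis=1)  # (T, c + h)
    return features @ params["W_out"] + params["b_out"]

def init_params(d, c, h, n_tags, k=3, seed=0):
    """Random, untrained weights; sizes are illustrative only."""
    rng = np.random.default_rng(seed)
    return {
        "kernel": rng.normal(scale=0.1, size=(k, d, c)),
        "conv_bias": np.zeros(c),
        "W": rng.normal(scale=0.1, size=(d, 4 * h)),
        "U": rng.normal(scale=0.1, size=(h, 4 * h)),
        "b": np.zeros(4 * h),
        "W_out": rng.normal(scale=0.1, size=(c + h, n_tags)),
        "b_out": np.zeros(n_tags),
    }

# Hypothetical input: a 5-word sentence with 8-dimensional embeddings.
sentence = np.random.default_rng(1).normal(size=(5, 8))
params = init_params(d=8, c=4, h=6, n_tags=9)  # 9 tags is an illustrative choice
scores = tag_scores(sentence, params)           # one row of tag scores per word
predicted = scores.argmax(axis=1)               # index of the best tag per word
```

In this sketch each convolution output depends only on a word and its immediate neighbours, while each LSTM state summarizes everything read so far, which is the local versus global distinction the abstract draws. A trained system would learn these weights from labeled data rather than sampling them.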

Keywords [English]

  • Named Entity Recognition
  • LSTM
  • CNN
  • Word Embedding
  • Natural Language Processing