Anomaly Detection Using an Autoencoder Based on LSTM Blocks

Article Type: Computer Engineering Article

Authors

  • Mahmoud Moallem 1
  • Ali Akbar Pouyan 2

1 Faculty of Computer Engineering and Information Technology, Shahrood University of Technology, Shahrood, Iran
2 Faculty of Computer Engineering and Information Technology, Shahrood University of Technology, Shahrood, Iran

Abstract

Anomaly detection means finding samples that differ from the normal majority of the data. One of the most fundamental challenges facing this task is that labeled samples, especially for the anomalous class, are scarce and at times unavailable. In this paper, we propose a method that uses only normal data for anomaly detection. The method is based on a well-established class of neural networks called autoencoders, which receive considerable attention in deep learning research. An autoencoder reproduces its input at its output and uses the reconstruction error as the anomaly score. To construct the autoencoder, we use LSTM blocks instead of ordinary neurons. These blocks are in fact a kind of recurrent neural network that excels at discovering and extracting temporal and proximity dependencies. The results of applying the LSTM-based autoencoder to point anomaly detection on ten common benchmark datasets show that the method succeeds in extracting the internal model of the normal data and in identifying non-conforming data. In almost all cases, the AUC of this model is higher than the AUC of an ordinary autoencoder and of the well-known one-class support vector machine (OC-SVM).
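
The paper itself does not include code, but the idea described above, an autoencoder built from LSTM blocks, trained only on normal data, and scoring each sample by its reconstruction error, can be sketched roughly as follows. Everything concrete in this sketch (the Keras framework, layer sizes, window length, and the placeholder arrays) is an assumption for illustration, not the authors' actual architecture.

import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, RepeatVector, TimeDistributed, Dense

# Assumed sliding-window shape; the paper does not publish its exact configuration.
timesteps, n_features = 10, 1

model = Sequential([
    LSTM(32, input_shape=(timesteps, n_features)),   # encoder: compress the window
    RepeatVector(timesteps),                         # repeat the code for each time step
    LSTM(32, return_sequences=True),                 # decoder: unfold back into a sequence
    TimeDistributed(Dense(n_features)),              # reconstruct every step of the window
])
model.compile(optimizer="adam", loss="mse")

# Train only on normal windows; x_normal has shape (n_samples, timesteps, n_features).
x_normal = np.random.rand(256, timesteps, n_features).astype("float32")  # placeholder data
model.fit(x_normal, x_normal, epochs=10, batch_size=32, verbose=0)

def anomaly_score(x):
    """Anomaly score = per-window reconstruction error; higher means more anomalous."""
    recon = model.predict(x, verbose=0)
    return np.mean((x - recon) ** 2, axis=(1, 2))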



Article Title [English]

Anomaly Detection using LSTM AutoEncoder

Authors [English]

  • Mahmoud Moallem 1
  • Ali Akbar Pouyan 2
1 Faculty of Computer & IT Engineering, Shahrood University of Technology, Shahrood, Iran
2 Faculty of Computer & IT Engineering, Shahrood University of Technology, Shahrood, Iran
Abstract [English]

Anomaly detection means detecting samples that differ from the normal samples in a dataset. One of the major challenges in this area is that labeled data, especially for the anomalous class, is scarce. In this paper, we propose a method that uses only normal data to detect anomalies. The method is based on well-established neural networks called autoencoders, which have received considerable attention in deep learning studies. An autoencoder reproduces its input at its output and uses the reconstruction error to score anomalies. We construct the encoder from LSTM blocks instead of ordinary neurons. These blocks are a type of recurrent neural network specialized in discovering and extracting temporal and proximity dependencies. The results of employing the LSTM-based autoencoder for point anomaly detection show that this approach is successful in extracting the normal data's internal model and in detecting anomalous data. In almost all cases, the AUC of the model is higher than the AUC of an ordinary autoencoder and of the one-class support vector machine (OC-SVM).
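
The abstract reports AUC comparisons against an ordinary autoencoder and OC-SVM. The short sketch below shows, in the same assumed Python setting, how such an AUC comparison can be computed from anomaly scores; the scikit-learn OneClassSVM hyper-parameters and the placeholder arrays are illustrative and do not reproduce the ten benchmark datasets used in the paper.

import numpy as np
from sklearn.svm import OneClassSVM
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
x_train = rng.normal(size=(256, 10))                      # placeholder normal training data
x_test = np.vstack([rng.normal(size=(90, 10)),            # placeholder normal test points
                    rng.normal(loc=4.0, size=(10, 10))])  # placeholder anomalous test points
y_test = np.array([0] * 90 + [1] * 10)                    # 1 = anomalous, 0 = normal

# OC-SVM baseline: fit on normal data only; negate the decision function so that
# larger values mean "more anomalous", matching the reconstruction-error convention.
ocsvm = OneClassSVM(kernel="rbf", nu=0.1, gamma="scale").fit(x_train)
ocsvm_scores = -ocsvm.decision_function(x_test)

print("OC-SVM AUC:", roc_auc_score(y_test, ocsvm_scores))
# The LSTM autoencoder would be ranked the same way, e.g.
#   roc_auc_score(y_test, anomaly_score(x_test_windows))
# using the anomaly_score function from the previous sketch.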

Keywords [English]

  • Anomaly Detection
  • AutoEncoder
  • LSTM
  • Deep learning
 
