Thermal Image Translation to Visible Light Using Generative Adversarial Networks

Malekpour, Nastaran; Fadaeieslam, Mohammad Javad

doi:10.22075/jme.2024.34522.2685

Article type: Research article

Authors

Faculty of Electrical and Computer Engineering, Semnan University, Semnan, Iran

Abstract

Thanks to distinctive capabilities such as capturing images in varied weather conditions, capturing images at night, and anti-counterfeiting properties, thermal imaging systems have specialized military, security, and judicial applications. However, images captured by thermal cameras are hard for the human eye to interpret, and recognizing faces in thermal images is very difficult for people. Converting thermal images to visible-light images falls within the field of image content transfer, or image-to-image translation. Many deep learning models have been proposed for converting thermal images to visible light, and among them generative adversarial networks have made notable progress. This paper attempts to improve ClawGAN, a network designed specifically for converting thermal images to visible light. Our approach integrates effective techniques such as Unet++, Unet3+, and a self-attention network into the generator of the base architecture. As a result, when transferring content from the thermal domain to the visible-light domain, the network can produce higher-quality images that are recognizable to the human eye and have minimal distortion, blur, and noise. The results show that the proposed generator significantly improves evaluation metrics such as MSE, PSNR, RMSE, UQI, and PSNR-B.

Subjects

Computer Engineering

Article title [English]

Facial Thermal Image Translation to RGB Visible Light using GAN

Authors [English]

Nastaran Malekpour
Mohammad Javad Fadaeieslam

Faculty of Electrical and Computer Engineering, Semnan University, Semnan, Iran

Abstract [English]

Thermal imaging systems have special military, security, and judicial applications because of their unique features, such as the ability to record images in different weather conditions, to record images at night, and to resist counterfeiting. However, images recorded by thermal cameras are hard for the human eye to interpret, and it is very difficult for humans to recognize faces in thermal images. Converting thermal images to visible-light images belongs to the field of image content transfer, or image-to-image translation. So far, many deep learning models have been introduced to convert thermal images into visible light. Among these models, generative adversarial networks have achieved significant progress. This paper attempts to improve the ClawGAN network, which is specifically designed to convert thermal images into visible light. Our method integrates effective techniques such as Unet++, Unet3+, and a self-attention network into the generator of the base model. In this way, when transferring content from the thermal domain to the visible-light domain, the network can produce higher-quality images that are recognizable to the human eye and have minimal distortion, blur, and noise. The results show that the proposed generator significantly improves evaluation criteria such as MSE, PSNR, RMSE, UQI, and PSNR-B.
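As a rough illustration of three of the evaluation criteria named in the abstract, the sketch below computes MSE, RMSE, and PSNR between a reference visible-light image and a translated image. It is not the authors' evaluation code: the function names, the flattened-list image representation, the 8-bit peak value of 255, and the sample pixel values are all assumptions made for the example. UQI and PSNR-B are omitted because they need windowed statistics and block-boundary terms that do not fit a short sketch.

```python
import math
from typing import Sequence


def mse(reference: Sequence[float], generated: Sequence[float]) -> float:
    """Mean squared error between two equally sized, flattened images."""
    if len(reference) != len(generated) or len(reference) == 0:
        raise ValueError("images must be non-empty and have the same size")
    total = sum((r - g) ** 2 for r, g in zip(reference, generated))
    return total / len(reference)


def rmse(reference: Sequence[float], generated: Sequence[float]) -> float:
    """Root mean squared error: the square root of the MSE."""
    return math.sqrt(mse(reference, generated))


def psnr(reference: Sequence[float], generated: Sequence[float],
         max_value: float = 255.0) -> float:
    """Peak signal-to-noise ratio in decibels.

    max_value is the largest possible pixel value (assumed 255 for 8-bit data).
    Identical images have zero error, so PSNR is reported as infinity.
    """
    error = mse(reference, generated)
    if error == 0:
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / error)


# Hypothetical 2x2 grayscale images, flattened row by row.
visible = [52, 55, 61, 59]      # reference visible-light pixels (made up)
translated = [54, 55, 60, 63]   # generator output pixels (made up)

print("MSE :", mse(visible, translated))
print("RMSE:", rmse(visible, translated))
print("PSNR:", psnr(visible, translated))
```

Lower MSE and RMSE and higher PSNR indicate that the translated image is closer to the reference, which is the direction of improvement the abstract reports.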

Keywords [English]

Image-to-image translation
Thermal image
RGB visible light image
Generative adversarial network
Self-attention

Volume 23, Special Issue 81
Celebrating the 50th Anniversary of the Founding of Semnan University
Tir 1404
Pages 189-198

Article history

Received: 01 Tir 1403
Revised: 14 Mehr 1403
Accepted: 22 Aban 1403

Article views: 507
Full-text downloads: 225
