CBTi-YOLOv5: Improved YOLOv5 with CBAM, Transformer, and BiFPN for Real-Time Safety Helmet Detection
Background: Construction workers risk injury when safety helmets are used negligently or not at all. To avoid this, helmet use should be supervised continuously throughout the work process, which can be achieved through the application of computer vision technology. However, the complex background of construction environments makes it challenging to detect small and densely packed safety helmets accurately.
Objective: Construction environments are complex, and wide workspaces allow workers to be far from direct supervision. This makes it difficult for models to detect safety helmet use in complex, wide, and very high object density construction environments. Therefore, this study aims to overcome the problem by modifying the YOLOv5s (You Only Look Once version 5, small variant) architecture.
Methods: Real-time monitoring of safety helmet use can be performed with YOLOv5. This study proposed a modified YOLOv5s model called CBTi-YOLOv5s. The model incorporated the Convolutional Block Attention Module (CBAM), a Transformer block, and a Bi-directional Feature Pyramid Network (BiFPN) to improve feature extraction, multi-scale object representation, and detection accuracy, particularly on small and high-density objects in complex construction environments.
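Of the three components, BiFPN is the one most directly responsible for multi-scale representation: it fuses feature maps from different pyramid levels using fast normalized weighted fusion, in which each learnable fusion weight is clamped to be non-negative and the weighted sum is normalized. The sketch below illustrates this fusion rule in plain Python on toy feature vectors; the function name, weight values, and feature values are illustrative assumptions, not the authors' implementation.

```python
def fast_normalized_fusion(features, weights, eps=1e-4):
    """BiFPN-style fast normalized fusion of same-sized feature vectors.

    Each learnable weight is clamped to be non-negative (ReLU),
    then the weighted sum is divided by the weight total plus a
    small epsilon so the output stays bounded and numerically stable.
    """
    w = [max(0.0, wi) for wi in weights]          # ReLU on fusion weights
    total = sum(w) + eps                          # normalization denominator
    return [
        sum(wi * f[i] for wi, f in zip(w, features)) / total
        for i in range(len(features[0]))
    ]

# Toy example: fuse two 3-element "feature maps" with unequal weights.
f_high = [1.0, 2.0, 3.0]   # e.g. an upsampled deeper-level feature
f_low = [3.0, 2.0, 1.0]    # e.g. a same-resolution lateral feature
fused = fast_normalized_fusion([f_high, f_low], [0.6, 0.4])
```

In a real BiFPN the inputs are whole feature maps and the weights are trained parameters, but the normalization keeps the fused output on the same scale as its inputs regardless of how the weights evolve during training.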
Results: The modified architecture improved mean average precision (mAP) by 3.7 percentage points over the base YOLOv5s model: the base model reached 93.6%, while CBTi-YOLOv5s achieved 97.3%. The proposed model ran at an inference speed of 58 frames per second (FPS), compared with 104 FPS for the base model.
Conclusion: CBTi-YOLOv5s improved accuracy, mAP, and the ability to detect objects of varying scales. However, these gains came at the cost of increased model size and reduced inference speed, owing to the greater architectural complexity.
Keywords: Bi-FPN, CBAM, CBTi-YOLOv5s, Helmet Detection, Transformer, YOLOv5
Copyright (c) 2025 The Authors. Published by Universitas Airlangga.

This work is licensed under a Creative Commons Attribution 4.0 International License.
Authors who publish with this journal agree to the following terms:
All accepted papers will be published under a Creative Commons Attribution 4.0 International (CC BY 4.0) License. Authors retain copyright and grant the journal right of first publication. The CC BY license lets others Share (copy and redistribute the material in any medium or format) and Adapt (remix, transform, and build upon the material for any purpose, even commercially).