TY - JOUR
T1 - A study of YOLO architectures for wildfire and smoke detection in ground and aerial imagery
AU - Ramos, Leo Thomas
AU - Casas, Edmundo
AU - Romero, Cristian
AU - Rivas-Echeverría, Francklin
AU - Bendek, Eduardo
N1 - Publisher Copyright: © 2025 The Authors
PY - 2025/6
Y1 - 2025/6
AB - This study evaluates the performance of four state-of-the-art YOLO architectures (YOLOv8, YOLOv9, YOLOv10, and YOLOv11) for wildfire and smoke detection. Using the Fire and Smoke dataset, we trained all models for 100 epochs with default settings to ensure a fair comparison. Performance was assessed in terms of accuracy, training efficiency, and inference speed, using both numerical metrics and visual evaluations. Our results show that YOLOv8 achieves the best balance between detection accuracy and computational efficiency, reaching a mAP@50:95 of 0.661 in its largest version with a training time of 1.023 hours. YOLOv10x achieves similar performance (0.654), but with higher training time and latency. In contrast, YOLOv9 and YOLOv11 perform worse, particularly in their larger variants, despite having more parameters and longer training times; YOLOv9e, for instance, requires over 1.5 hours to train. Notably, YOLOv10 and YOLOv11 surpassed YOLOv8 in certain cases, particularly in reducing false detections under partial occlusions or in the presence of visual elements resembling smoke. However, all architectures struggled in low-visibility conditions, such as when detecting faint smoke at night.
KW - Computer vision
KW - Deep learning
KW - Object detection
KW - Smoke detection
KW - Wildfire detection
KW - YOLO
UR - https://www.scopus.com/pages/publications/105002710185
U2 - 10.1016/j.rineng.2025.104869
DO - 10.1016/j.rineng.2025.104869
M3 - Article
AN - SCOPUS:105002710185
SN - 2590-1230
VL - 26
JO - Results in Engineering
JF - Results in Engineering
M1 - 104869
ER -