Deep learning-based oriented object detection in remote sensing imagery: YOLOv7-OBB

Santos, Pietro Terra Pizzutti dos

Undergraduate final project (Trabalho de Conclusão de Curso de Graduação)

Date

2023-01-25

Registered in:

SANTOS, P. T. P. dos. Deep Learning-Based Oriented Object Detection in Remote Sensing Imagery: YOLOv7-OBB. 2023. 86 p. Trabalho de Conclusão de Curso (Graduação em Engenharia de Telecomunicações) - Universidade Federal de Santa Maria, Santa Maria, RS, 2023.

http://repositorio.ufsm.br/handle/1/27722

https://repositorioslatinoamericanos.uchile.cl/handle/2250/8628337

Author

Santos, Pietro Terra Pizzutti dos

Institution

Universidade Federal de Santa Maria (Brazil)

Abstract

Remote sensing (RS) is the acquisition and processing of meaningful information about the ground and the objects on it from a distance, usually from a much higher position aboard aircraft and satellites. Because RS imagery covers a large field of view, object detection in these images is especially useful: it gives a broad yet concise picture of the objects present in a given area. Thanks to their ability to learn intricate patterns, Deep Learning (DL) models have achieved state-of-the-art (SOTA) performance in computer vision tasks. In this project, current DL-based object detection models are reviewed extensively, and a suitable model, YOLOv7, is chosen as the baseline to be modified into a high-performance oriented bounding-box (OBB) detector for RS imagery. The final performance of a supervised DL model depends heavily on the quality of its training. To improve it, large datasets covering the specific task were sought, which led to the use of the DOTA dataset. Moreover, transfer learning is employed so that a model pre-trained on a very large dataset with different tasks can be reused. The final model is evaluated with common object detection metrics, such as the confusion matrix and the precision and recall curves. These validate the detector, which identifies 16 object classes with SOTA performance: it is accurate, fast, and outputs oriented bounding boxes. Comparing the confusion matrices of the developed model and YOLOv5-OBB (KAIXUAN, 2022), for instance, the developed model correctly identifies the classes plane, baseball diamond, bridge and ground track field with probabilities of 0.97, 0.89, 0.67 and 0.67, respectively, whereas YOLOv5-OBB obtains 0.96, 0.83, 0.6 and 0.6 for the same classes. Another interesting point is the reduction from 0.73 to 0.69 in the probability of mistaking the background for a small vehicle.
The model can be trained further on custom datasets for detection in agriculture, livestock, military applications, etc., with implications for many areas and activities. The repository containing all the code used and developed in this project is available at (SANTOS, 2022).
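To make the notion of an oriented bounding box concrete, the following minimal Python sketch converts a box given as centre, size and rotation angle into its four corner points. It is an illustration only, not code from the project's repository: the function names `obb_to_corners` and `polygon_area`, the (cx, cy, w, h, theta) parameterisation and the use of radians are all assumptions made for this example.

```python
import math


def obb_to_corners(cx, cy, w, h, theta):
    """Return the four corners of a rotated rectangle.

    Assumed convention (hypothetical, for illustration): (cx, cy) is the
    box centre, w and h are its side lengths, and theta is the rotation
    in radians applied about the centre.
    """
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    half_w, half_h = w / 2.0, h / 2.0
    # Corner offsets in the box's own frame, listed in order around the box.
    offsets = [(-half_w, -half_h), (half_w, -half_h),
               (half_w, half_h), (-half_w, half_h)]
    # Rotate each offset by theta and translate it to the box centre.
    return [(cx + dx * cos_t - dy * sin_t, cy + dx * sin_t + dy * cos_t)
            for dx, dy in offsets]


def polygon_area(points):
    """Area of a simple polygon via the shoelace formula."""
    total = 0.0
    for i, (x1, y1) in enumerate(points):
        x2, y2 = points[(i + 1) % len(points)]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


if __name__ == "__main__":
    corners = obb_to_corners(100.0, 50.0, 40.0, 20.0, math.radians(30))
    print(corners)
    print(polygon_area(corners))  # rotation leaves the area at w * h
```

With an angle of zero the result reduces to an ordinary axis-aligned box, which is why a horizontal detector can be seen as a special case of an oriented one.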

Subjects

Machine learning

Deep learning

Remote sensing imagery

Object detection

Oriented bounding-box

YOLOv7

DOTA dataset
