Undergraduate Thesis
Deep learning-based oriented object detection in remote sensing imagery: YOLOv7-OBB
Date
2023-01-25
Registered in:
SANTOS, P. T. P. dos. Deep Learning-Based Oriented Object Detection in Remote Sensing Imagery: YOLOv7-OBB. 2023. 86 p. Trabalho de Conclusão de Curso (Graduação em Engenharia de Telecomunicações) - Universidade Federal de Santa Maria, Santa Maria, RS, 2023.
Author
Santos, Pietro Terra Pizzutti dos
Institution
Abstract
Remote sensing (RS) is the processing and extraction of meaningful features of
the ground and of objects observed at a distance, usually from a much higher vantage point such as an aircraft or a satellite. Because RS imagery covers a large area, object detection in these
images is particularly useful, giving a broad yet concise picture of the objects present in a given region. Owing to their capacity to learn intricate patterns, Deep Learning (DL)
models have achieved state-of-the-art (SOTA) performance in computer vision tasks. In this
project, an extensive survey of current DL-based object detection models is conducted, and
a suitable model, YOLOv7, is chosen as the baseline and modified to obtain a
high-performance oriented bounding-box (OBB) detector for RS imagery. The final performance of a supervised DL model depends heavily on the quality of its training. To improve it,
large datasets covering the specific task are sought, leading to the use of the DOTA dataset.
Moreover, transfer learning is employed so that a model pre-trained
on a very large dataset covering different tasks can be reused. The final model is evaluated with common object
detection metrics, such as the confusion matrix and the precision and recall curves. These validate the
detector, which identifies 16 object classes with SOTA performance: high accuracy, fast inference,
and the latest oriented bounding-box approach. Comparing the confusion matrices of the developed
model and YOLOv5-OBB (KAIXUAN, 2022) shows, for instance, that the developed model correctly identifies the classes plane, baseball diamond, bridge,
and ground track field with probabilities of 0.97, 0.89, 0.67, and 0.67, respectively, whereas YOLOv5-OBB obtains 0.96, 0.83, 0.6, and 0.6 for the
same classes. Another notable result is the reduction from 0.73 to 0.69 in the
probability of mistaking the background for a small-vehicle. The model can be further trained
on custom datasets for detection in agriculture, livestock, military applications, etc., with implications
for many areas and activities. The repository containing all the code used and developed in
this project is available at (SANTOS, 2022).
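
The per-class probabilities quoted in the abstract are entries of a normalized confusion matrix. The following is a minimal sketch of how such values could be obtained from raw counts. It assumes rows hold predicted classes, columns hold true classes, and each column is normalized to sum to 1. The abstract does not state which convention the thesis uses. The function names and the counts are hypothetical and are not taken from the thesis or its repository.

```python
def normalize_by_true_class(matrix):
    """Divide every entry by its column sum (rows = predicted, columns = true)."""
    n_rows = len(matrix)
    n_cols = len(matrix[0])
    col_sums = [sum(matrix[r][c] for r in range(n_rows)) for c in range(n_cols)]
    return [
        [matrix[r][c] / col_sums[c] if col_sums[c] else 0.0 for c in range(n_cols)]
        for r in range(n_rows)
    ]


def per_class_rates(matrix, class_names):
    """Return the diagonal of the normalized matrix, keyed by class name."""
    normalized = normalize_by_true_class(matrix)
    return {name: normalized[i][i] for i, name in enumerate(class_names)}


# Hypothetical counts chosen for illustration only (not results from the thesis).
classes = ["plane", "bridge", "background"]
counts = [
    [97, 2, 10],  # predicted as plane
    [1, 67, 5],   # predicted as bridge
    [2, 31, 0],   # predicted as background (missed objects)
]

rates = per_class_rates(counts, classes)
print(rates)
```

Under this assumed convention, each diagonal entry is the fraction of true objects of a class that the detector labels correctly, so two detectors can be compared class by class.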