Article
Raspberries-LITRP database: RGB images database for the industrial applications of red raspberries’ automatic quality estimation
Registered in:
10.3390/app122211586
Authors
Quintero-Rincón, Antonio
Mora, Marco
Naranjo Torres, José
Fredes, Claudio
Valenzuela, Andrés
Institution
Abstract
This work presents a new, freely available database built from a real industrial process to recognize,
identify, and classify the quality of red raspberries accurately, automatically, and in real time.
Trays of freshly harvested raspberries enter the industry’s selection and quality control
process, where they are categorized and their purchase price is then determined. This selection is
carried out on a sample taken from a complete batch to evaluate the quality of the raspberries. This database
aims to address one of the industry’s major problems: evaluating as much of the fruit as possible
rather than a single sample. The dataset enables researchers in various disciplines to develop
practical machine-learning (ML) algorithms that improve red raspberry quality in the industry by
identifying different diseases and defects in the fruit, increasing detection accuracy,
and reducing computation time. This database is made up of
two packages and can be downloaded free from the Laboratory of Technological Research in Pattern
Recognition repository at the Catholic University of the Maule. The RGB image package contains 286
raw original images with a resolution of 3948 × 2748 pixels from raspberry trays acquired during a
typical process in the industry. Labeled images are also provided, with annotations
for two diseases (86 albinism labels and 164 fungus rust labels) and two defects (115 over-ripeness
labels and 244 peduncle labels). The MATLAB code package contains three well-known ML
approaches that can be used to classify and detect the quality of red raspberries. Two
are statistical learning methods for feature extraction, each coupled with a conventional artificial
neural network (ANN) as a classifier and detector. The first method uses four predictive features
derived from descriptive statistical measures: variance, standard deviation, mean, and median.
The second method uses three predictive features derived from a statistical model based on the generalized
extreme value distribution parameters: location, scale, and shape. The third ML
approach uses a convolutional neural network based on a pre-trained faster region-based approach (Faster
R-CNN) that extracts its features directly from the images to classify and detect fruit quality.
Classification performance was assessed in terms of true and false positive rates and accuracy.
On average, for all types of raspberries studied, the following accuracies were achieved: Faster
R-CNN 91.2%, descriptive statistics 81%, and generalized extreme value 84.5%. These performance
metrics were compared with manual data annotations made by industry quality control staff and meet
the parameters and standards of agribusiness. This work shows promising results that can shed
new light on fruit quality standards methodologies in the industry.
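
The first statistical method and the evaluation metrics described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' MATLAB code: the function names, the use of a flat list of pixel intensities as input, the choice of population (rather than sample) variance, and the binary 1/0 label encoding are all assumptions made for illustration.

```python
from statistics import mean, median, pstdev, pvariance


def descriptive_features(pixels):
    """Return [variance, standard deviation, mean, median] for a flat
    sequence of pixel intensities.

    Assumption: population statistics are used; the original MATLAB
    package may use sample statistics instead.
    """
    values = [float(p) for p in pixels]
    if not values:
        raise ValueError("pixels must not be empty")
    return [pvariance(values), pstdev(values), mean(values), median(values)]


def classification_metrics(y_true, y_pred):
    """Return (true positive rate, false positive rate, accuracy).

    Assumption: labels are binary, with 1 marking a region that shows a
    disease or defect and 0 marking a healthy region.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    # Guard against empty classes so the ratios stay defined.
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return tpr, fpr, accuracy


if __name__ == "__main__":
    # Hypothetical intensities from one image patch.
    print(descriptive_features([10, 12, 14, 16]))
    # Hypothetical ground-truth and predicted labels.
    print(classification_metrics([1, 1, 0, 0, 1, 0], [1, 0, 0, 1, 1, 0]))
```

In a pipeline of the kind the abstract describes, a four-element feature vector like this would be computed per image region and passed to the ANN classifier. The second method would replace it with the three fitted generalized extreme value parameters.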