
A Feature Learning Based Approach for Automated Fruit Yield Estimation

Calvin Hung, James Underwood, Juan Nieto and Salah Sukkarieh

Abstract This paper demonstrates a generalised multi-scale feature learning approach to multi-class segmentation, applied to the estimation of fruit yield on tree crops. The learning approach makes the algorithm flexible and adaptable to different classification problems, and hence applicable to a wide variety of tree-crop applications. Extensive experiments were performed on a dataset consisting of 8000 colour images collected in an apple orchard. This paper shows that the algorithm was able to segment apples of different sizes and colours in an outdoor environment with natural lighting conditions, with a single model obtained from images captured using a monocular colour camera. The segmentation results are applied to the problem of fruit counting and the results are compared against manual counting. The results show a squared correlation coefficient of R² = 0.81.

1 Introduction

A continuously growing population has generated an unprecedented demand for agricultural production. Several problems with traditional farming will need to be addressed in order to achieve the required increase in production. First of all, there is a shortage of labour in rural areas. Migration and urbanisation have left rural areas with a reduced and aging population and have also reduced the available farmland for production.

Calvin Hung, Australian Centre for Field Robotics, The University of Sydney, e-mail: [email protected]

James Underwood, Australian Centre for Field Robotics, The University of Sydney, e-mail: [email protected]

Juan Nieto, Australian Centre for Field Robotics, The University of Sydney, e-mail: [email protected]

Salah Sukkarieh, Australian Centre for Field Robotics, The University of Sydney, e-mail: [email protected]


Secondly, the impact of current practices on climate change creates a need for innovation in order to make agriculture a sustainable practice. Reducing emissions, increasing efficiency in the use of resources such as water, and reducing the use of chemical products are some of the actions that will need to be taken to reduce the environmental footprint of farming.

Robotics and automation are expected to have a significant impact on farms of the future by increasing efficiency in production and the use of resources, and by reducing labour costs. Information gathering systems are already being used on many farms. Geographic Information Systems (GIS) based on remote sensing data have been shown to provide good methods to solve spatial problems and support decision making [1]. The main limitation of GIS is that they provide coarse spatial and temporal information. Manual sampling suffers from the same two shortfalls, in addition to an increase in labour cost. On the other hand, field data from ground mobile sensor platforms can increase spatial and temporal resolution with improved operational and cost benefits.

The operational benefit of the autonomous system is to provide accurate yield estimation, which enables better decision making, such as targeted fertiliser application across a farm and production forecasting later in the supply chain. In addition, the research into image processing can provide an understanding of the health and nutritional value of individual fruit.

This paper demonstrates a feature learning based approach for image based fruit segmentation. We present results for apple segmentation using a camera mounted on an unmanned ground vehicle, shown in Fig. 1. Outdoor fruit classification using vision systems is a challenging task. An autonomous segmentation approach has to deal with illumination changes, occlusion, a variety of object sizes and, in our case, a variety of colours. The algorithm presented here performs pixel classification using feature learning to automatically select the most important attributes of the target (fruit). We show that the approach can handle different illumination conditions and different fruit sizes and colours.

In particular this paper presents the following contributions:

• an extension to apple tree segmentation of the algorithm presented in [2] for almonds;

• extensive experiments using data collected in an apple orchard with a robotic platform. The algorithm detects both red and green apples with a common model;

• a pipeline for fruit counting and validation against manual fruit counting performed by the farmer.

2 Related Work

There has been a relatively large amount of work presented on fruit segmentation using vision. The literature covers a variety of fruits such as grapes [3] [4], mangoes [5], oranges [6] and apples [7, 8].


Fig. 1 The perception research ground vehicle Shrimp, operating at the apple orchard: (a) sensor configuration; (b) scanning a row.

A common trend in orchard tree classification is the use of manually designed features, such as intensity or colour ranges or shape features [9]. For example, the work presented in [8] uses a simple intensity threshold to segment flowers, while the work presented in [5] uses a colour threshold. Another example is the work proposed in [4], where shape based detection followed by a colour and texture classifier is applied to grape segmentation. The main drawback of these approaches is that they are specifically designed for a particular fruit variety and will need to be re-designed for different fruits or to cope with variations such as seasonal changes.

Apart from the ever-present occlusion and illumination challenges, fruit classification becomes a simpler task when the fruit has a colour that is distinctive from the foliage. Green apples, however, do not belong to that category. A pipeline for apple segmentation is presented in [7]. In order to control the illumination, the authors collected data at night with an artificial light source. The algorithm selects red apples using colour and green apples using specular reflection features.

The fruit classification framework presented here has been designed with the aim of automatically adapting the feature sets to different fruits; the feature extraction and classification rules are all obtained via learning. Our approach does not require domain specific assumptions and can therefore be applied to different types of trees.


3 Fruit Classification

Green apple detection using vision is challenging due to the low contrast of the fruit with the background foliage. In addition, outdoor environments present extra difficulties due to the variations in illumination caused by occlusion and different weather conditions.

This paper applies the fruit segmentation algorithm previously demonstrated on almond trees [2] to apple segmentation and counting. The learning based approach allows the algorithm to be applied to fruit of different appearance. This section provides a brief description of the algorithm; the interested reader is referred to [2] for details.

3.1 Image Modelling using Conditional Random Fields

This paper models the image data using a Conditional Random Field (CRF) framework [10]. The graphical model for an image consists of a two dimensional lattice G = <V, E>, where V is the set of pixels representing the vertices of the graph and E is the set of edges modelling the relationships between neighbouring pixels.

Image segmentation is performed by assigning to every pixel xi ∈ V in the image a meaningful label li ∈ L. For multi-class image segmentation, the label set L may contain up to k classes, L = {1, ..., k}. The optimal labelling l* is obtained via energy minimisation on the graph structure G, with the energy function defined in Eq. 1

E(l) = ∑_{i∈V} ψi(li, xi) + ∑_{(i,j)∈E} ψij(li, lj, xi, xj) − log(Z(x))    (1)

where ψi(li, xi) is the unary potential, which models the likelihood of a pixel taking a certain label, ψij(li, lj, xi, xj) is the pairwise potential, which models the assumption that neighbouring pixels should take the same label, and log(Z(x)) is the partition function.

Conventionally the unary potential is computed using features in the image, for example grey level intensity [11], colour [12], or texture [13]. In our approach the unary potential is generated using the multi-scale features learnt from the image dataset.
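To make the role of the two potentials concrete, the following minimal sketch evaluates the energy of Eq. 1 for a given labelling on a 4-connected pixel lattice. The Potts-style pairwise term and the weight beta are illustrative assumptions for the example, not the exact potentials used in [2]; the partition function log(Z(x)) is omitted because it is constant with respect to the labelling.

```python
import numpy as np

def crf_energy(labels, unary, beta=1.0):
    """Energy of a labelling on a 4-connected image lattice (cf. Eq. 1).

    labels : (H, W) integer label image l
    unary  : (H, W, K) array with unary[i, j, k] = psi_i(l_i = k, x_i)
    beta   : weight of the illustrative Potts pairwise term
    """
    H, W = labels.shape
    # Unary term: sum the per-pixel potential of the chosen label.
    e_unary = unary[np.arange(H)[:, None], np.arange(W)[None, :], labels].sum()
    # Pairwise term: penalise neighbouring pixels that take different labels.
    e_pair = beta * np.sum(labels[:, 1:] != labels[:, :-1])   # horizontal edges
    e_pair += beta * np.sum(labels[1:, :] != labels[:-1, :])  # vertical edges
    # log(Z(x)) is independent of the labelling and is dropped for minimisation.
    return e_unary + e_pair
```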

3.2 Multi-Scale Feature Learning

This section provides an overview of our feature learning approach. We first apply a sparse autoencoder [14] at different scales to obtain the multi-scale feature dictionaries. A logistic regression classifier is then used to learn the association between the labels and the multi-scale responses.


The classifier output is then passed into the CRF as the unary term described in Eq. 1 for multi-class image segmentation.

The first step is to use unsupervised feature learning to capture the features from an unlabelled dataset. In this paper a sparse autoencoder is used to learn the dictionaries from randomly sampled image patches. This process is then repeated with images sub-sampled at different scales.

With the low dimensional dictionary code obtained using unsupervised feature learning, an additional supervised label assignment step can be used to train a classifier. In this paper a softmax regression classifier is used. Softmax regression was chosen because it can be formulated as a single layer perceptron and trained using back-propagation.

Lastly, the sparse autoencoder used for unsupervised feature learning and the logistic regression classifier are joined into a single network to fine-tune the network parameters.
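The sketch below illustrates the shape of this pipeline: multi-scale patch sampling, encoding with a pre-trained sparse autoencoder, and softmax classification of the resulting codes. The patch size, number of patches, scales and the sigmoid encoder are illustrative assumptions; the training of the weights by back-propagation and the fine-tuning stage are omitted.

```python
import numpy as np

def sample_patches(image, patch_size=8, n_patches=1000, scales=(1, 2, 4), rng=None):
    """Randomly sample square patches at several sub-sampling scales (illustrative settings)."""
    rng = np.random.default_rng() if rng is None else rng
    patches = {}
    for s in scales:
        img = image[::s, ::s]                      # naive sub-sampling for the sketch
        h, w = img.shape[:2]
        ys = rng.integers(0, h - patch_size, n_patches)
        xs = rng.integers(0, w - patch_size, n_patches)
        patches[s] = np.stack([img[y:y + patch_size, x:x + patch_size].ravel()
                               for y, x in zip(ys, xs)])
    return patches

def encode(patches, w_enc, b_enc):
    """Hidden-layer code of a pre-trained sparse autoencoder: sigmoid(W x + b)."""
    return 1.0 / (1.0 + np.exp(-(patches @ w_enc.T + b_enc)))

def softmax_probs(codes, w_cls, b_cls):
    """Per-class probabilities from the softmax (multinomial logistic) classifier."""
    z = codes @ w_cls.T + b_cls
    z -= z.max(axis=1, keepdims=True)              # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)
```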

3.3 Fruit Counting

With the apple segmentation output, two approaches are used to estimate the fruit count. The first is to count the total number of fruit pixels per side of a row and use the pixel count to infer the actual fruit count; the second is to perform circle detection using the circular Hough transform [15] to estimate the actual fruit count directly. Prior to detection, image erosion is applied to remove the noise and dilation is applied to join partially occluded fruit.
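A minimal OpenCV sketch of this counting step is shown below. The 95% threshold follows the description in Sect. 5.2, while the kernel size, Hough parameters and radius range are illustrative assumptions rather than the values used in the experiments.

```python
import cv2
import numpy as np

def count_apples(prob_map, thresh=0.95, min_radius=5, max_radius=40):
    """Pixel count and circle-based fruit count from a per-pixel apple probability map."""
    mask = (prob_map > thresh).astype(np.uint8) * 255
    kernel = np.ones((3, 3), np.uint8)
    mask = cv2.erode(mask, kernel, iterations=1)    # remove isolated noise pixels
    mask = cv2.dilate(mask, kernel, iterations=2)   # re-join partially occluded fruit
    circles = cv2.HoughCircles(mask, cv2.HOUGH_GRADIENT, dp=1, minDist=min_radius,
                               param1=50, param2=10,
                               minRadius=min_radius, maxRadius=max_radius)
    pixel_count = int(np.count_nonzero(mask))
    fruit_count = 0 if circles is None else circles.shape[1]
    return pixel_count, fruit_count
```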

4 Experimental Setup

The experiments were conducted using Shrimp, a general purpose perception research ground vehicle. The system is equipped with a variety of localisation, ranging and imaging sensors that are commonly used in machine perception, together with soil specific sensors for agricultural applications; see Fig. 1(a).

Data from all sensors were gathered from a single 0.5 ha block at a commercial apple orchard near Melbourne, Victoria, Australia, shown in Fig. 3. The data collection was performed one week before harvest, with apple diameters ranging from 70 to 76 mm. The orchard employs a modern Guttingen V trellis structure, whereby the crop is arranged in pairs of 2D planar trellises that form a 'V' shape. The structure was designed to maximise sunlight on the leaves and to promote efficient manual harvesting, and would potentially be an appropriate structure for future robotic harvesters. The Shrimp robot acquired data while driving between the rows, as shown in Fig. 1(b). All fifteen rows in Fig. 3 were scanned, with a trajectory length of 3.1 km and a total of 0.5 TB of data. The data collection took approximately 3 hours.


The experiments in this paper were conducted using the single front facing camera of the Ladybug3 panospheric camera, seen at the top of the robot, because this camera has a sufficiently wide field of view to see the entire trellis face while scanning from the centre of a row, as shown in Fig. 1(b). The data was acquired continuously at 5 Hz, with an individual image resolution of 1232 × 1616 pixels. Other sensors are used as part of research into tree segmentation and farm modelling.

4.1 Image Analysis

Two sets of experiments were performed in this study. The experiments were designed to test both the generalisation of the algorithm for pixel classification, and to verify that the resulting fruit count correlates with the ground truth. Multi-class segmentation was used to test the generalisation of the algorithm, and binary apple/non-apple segmentation was used to evaluate the fruit counting performance.

The dataset consists of over 8000 images with a resolution of 1232 × 1616 pixels. To simplify the labelling process, each image was divided into 32 sub-images of 308 × 202 pixels. Overall, 90 sub-images were hand labelled to pixel accuracy with multi-class labels, and 800 sub-images were labelled to pixel accuracy with binary apple/non-apple classes. These labelled sub-images were used for training the segmentation algorithm and for 2-fold cross validation. Examples of the labelled images are shown in Figure 2.
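A 1232 × 1616 frame tiles exactly into 32 sub-images of 308 × 202 pixels (a 4 × 8 grid). The sketch below shows one way to perform this tiling; the grid orientation (rows versus columns) depends on how the frame is stored in memory and is an assumption here.

```python
import numpy as np

def split_into_subimages(image, n_rows=8, n_cols=4):
    """Split a frame into a grid of non-overlapping sub-images.

    For a 1232 x 1616 frame, an 8 x 4 (or 4 x 8, depending on orientation)
    grid yields the 32 sub-images of 308 x 202 pixels used for labelling.
    """
    h, w = image.shape[:2]
    th, tw = h // n_rows, w // n_cols
    return [image[r * th:(r + 1) * th, c * tw:(c + 1) * tw]
            for r in range(n_rows) for c in range(n_cols)]
```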

Fig. 2 Example of image labels (ground truth labels shown next to the corresponding sub-images, for both the multi-class and the binary case): 90 sub-images were hand labelled to pixel accuracy with multi-class labels, and 800 sub-images were labelled to pixel accuracy with binary apple/non-apple classes.


Apart from the individual per-class pixel classification accuracy, three evaluation metrics were applied: the global accuracy, the average accuracy and the F-measure. The global accuracy measures the number of correctly classified pixels over the entire dataset, whereas the average accuracy measures the average performance over all classes. The F-measure was computed by averaging the F-measures of the individual classes.
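These three metrics can be computed directly from a confusion matrix; the sketch below assumes a K × K matrix whose entry (i, j) counts pixels of true class i predicted as class j, which is a convention chosen for the example rather than the layout of Table 1.

```python
import numpy as np

def segmentation_metrics(conf):
    """Global accuracy, average per-class accuracy and mean F-measure."""
    conf = np.asarray(conf, dtype=float)
    tp = np.diag(conf)
    global_acc = tp.sum() / conf.sum()        # fraction of all pixels classified correctly
    recall = tp / conf.sum(axis=1)            # per-class accuracy (recall)
    precision = tp / conf.sum(axis=0)
    f_measure = 2 * precision * recall / (precision + recall)
    return global_acc, recall.mean(), f_measure.mean()
```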

4.1.1 Multi-class Fruit Segmentation

The first experiment performed multi-class segmentation with the same settings as [2] to classify fruit, leaves, branches, sky and ground. The main difference was that the fruit in this paper were apples instead of almonds. The objective of this experiment was to test the generalisation of the multi-class segmentation algorithm on different fruits, as well as on the same fruit (apple) with different colours (red and green).

4.1.2 Apple Counting

The second experiment was designed to evaluate the fruit counting performance. With this objective in mind, the algorithm was re-trained to specialise in apple/non-apple binary classification. This was done because, in general, the classification accuracy improves as the number of classes decreases. In addition to the binary class segmentation evaluation using the hand labelled image data, the apple farmer provided us with the ground truth count and weight over several rows of the apple farm shown in Fig. 3, obtained using an automated post-harvest weighing and counting machine. The apple trees are grown on a V shaped trellis known as Guttingen V. Each row has two sides; the rows are numbered sequentially, and the A-sides are west facing whereas the B-sides are east facing. The ground truth counts were provided as totals for each side of each row, and were used to evaluate the correlation between the algorithm count and the ground truth.

To normalise the pixel counts provided by the algorithm, all the images and pixel class probabilities were first undistorted using the camera calibration data. Secondly, using the navigation solution, the algorithm sampled images every 0.5 metres along each row to minimise overlap. Finally, the total pixel count per side of each row was normalised by dividing by the number of images sampled.
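A sketch of the sampling and normalisation steps might look as follows; the undistortion is omitted, and the per-image distances from the navigation solution are assumed to be monotonically increasing along the row.

```python
import numpy as np

def sample_every(distance_along_row, spacing=0.5):
    """Indices of images sampled every `spacing` metres along a row."""
    keep, next_d = [], distance_along_row[0]
    for i, d in enumerate(distance_along_row):
        if d >= next_d:
            keep.append(i)
            next_d = d + spacing
    return keep

def normalised_pixel_count(apple_pixels_per_image, keep):
    """Total apple pixels over the sampled images, divided by the number of images."""
    counts = np.asarray(apple_pixels_per_image, dtype=float)[keep]
    return counts.sum() / len(keep)
```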


Fig. 3 The map of the farm: The apple trees are grown on a V shaped trellis (Guttingen V). The rows are numbered sequentially; each row has two sides, with the A-sides facing west and the B-sides facing east. The robot surveyed the entire labelled area and the farmer provided the ground truth apple count and weight for each row face for the majority of the surveyed area.

5 Results

5.1 Multi-class Fruit Segmentation

The multi-class segmentation results are shown in Fig. 4 and Table 1. The dataset consists of leaves, apples and tree trunks under different lighting conditions and at different scales. The algorithm was able to segment the various objects and also the background scene (sky and ground). The confusion matrix shows that the majority of the mis-classification occurs between the apple and leaf classes; this is largely due to the colour and texture similarity between the leaves and green apples.

Table 2 also shows the overall performance on apple segmentation compared to the almond segmentation reported in [2], using the same multi-scale feature learning algorithm. The segmentation algorithm performed slightly better on the apple dataset (F-score 87.3) than on the almond dataset (F-score 84.8). The algorithm generalised well on two different fruit applications, and both show that fruit and leaves are the hardest classes to separate.


Fig. 4 Apple orchard multi-class segmentation qualitative results: This dataset was collected during an orchard surveying mission aiming to automate the yield estimation and harvesting process. The class colour code: apples are red, leaves are green, branches are brown, sky is blue and the ground is yellow.

          leaves  apples  trunk  ground   sky
leaves     96.7    25.2   17.4     7.4    0.6
apples      2.0    71.0    7.7     0.1    0.1
trunk       0.5     2.2   71.8     1.9    0.1
ground      0.3     0.6    2.6    86.9    0.0
sky         0.4     1.1    0.5     3.6   99.1

Table 1 Confusion matrix for multi-class pixel classification: The classifier performs well on the leaves and sky classes. The majority of the mis-classification occurs between leaves and green apples.


             Leaves  Fruits  Trunk  Ground   Sky  Global  Average  F-Measure
Almond [2]     94.0    69.8   72.8    89.1  95.8    86.9     84.3       84.8
Apple          96.7    71.0   71.9    86.9  99.1    92.5     85.1       87.3

Table 2 Multi-class segmentation performance comparison between previous work on almonds and this paper on apples. The learning algorithm performed well across different types of fruit with different appearances.


5.2 Apple Fruit Counting

Fig. 5 shows the binary apple/non-apple pixel classification results. The classification algorithm took the image collected by the robotic platform shown in (a) and returned the probability of each pixel belonging to the apple class, shown in (b); the class probability map was then thresholded at 95%, and erosion was applied to clean up the noise and dilation to join partially occluded apples, shown in (c); this was followed by apple detection using the circular Hough transform, shown in (d).

As expected, the classification accuracy improved as the number of classes was reduced from five down to only apple and non-apple. For the apple counting application, the algorithm was re-trained to perform only binary classification between the apple and non-apple classes. The classification accuracy was 93.3% for the apple class and 87.7% for the non-apple class, compared to the multi-class experiment where the apple classification accuracy was 71%.

The normalised pixel count was computed by summing the apple pixels in the images for each side of the rows and dividing by the number of images. Separately, the apple count was generated by detecting circular regions in the class map using the circular Hough transform. Overall the algorithm undercounted the fruit due to shading and occlusion; however, the undercounting was consistent. This is demonstrated in the comparison between the pixel and fruit counts produced by the algorithm and the ground truth fruit count and weight, shown in Fig. 6. Positive correlations can be observed in all plots, with the strongest correlation (R² = 0.810) between the algorithm fruit count and the ground truth fruit count. Overall, a higher correlation was observed between the algorithm and the ground truth when comparing against fruit count rather than weight.

Finally, the algorithm fruit count can be calibrated using the linear equation relating the ground truth to the algorithm output. The calibrated fruit count is shown in Fig. 7. The algorithm and ground truth match well for the first 10 sides, with a poor match for three sides, 8a, 7b and 6a in particular.
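The calibration step amounts to fitting a least-squares line between the algorithm output and the ground truth on the row sides where both are available, and applying it to the algorithm counts. A minimal sketch, with the fit direction chosen here as an assumption, is:

```python
import numpy as np

def calibrate_counts(algorithm_counts, ground_truth_counts):
    """Map raw algorithm counts to calibrated yield estimates via a linear fit."""
    a = np.asarray(algorithm_counts, dtype=float)
    g = np.asarray(ground_truth_counts, dtype=float)
    slope, intercept = np.polyfit(a, g, deg=1)   # ground truth ~ slope * algorithm + intercept
    return slope * a + intercept
```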

6 Discussion

The results showed that the algorithm performed well on the apple fruit segmentation tasks. In this paper the multi-class segmentation achieved 71% accuracy on the apple class, and when re-trained to specialise in apple/non-apple classification the accuracy improved to 93.3%. This paper also showed that the apple count can be estimated from the pixel classification output. Although the algorithm undercounted the apples due to factors such as occlusion and shading, the algorithm output was consistent. The results showed a high correlation between the algorithm fruit count and the ground truth fruit count provided by the apple farmer.

There were several outliers, and it is ongoing work to determine the root cause in these cases and make the algorithm more robust. The farmer also reported logistical problems with the harvest weighing procedure, as it is not standard practice to measure the yield per row, and we are working together to produce the most accurate ground truth for the next season.


Fig. 5 Apple pixel classification: a) Original image. b) Classifier confidence output; higher confidence is shown in red and lower confidence in blue. c) Confidence thresholded at 0.95, with morphological operations applied to clean up the noise. d) Apple detection with the circular Hough transform.


Fig. 6 Algorithm counts versus ground truth counts. Four panels are shown: Algorithm Fruit Count vs Ground Truth Fruit Weight (R² = 0.726), Algorithm Fruit Count vs Ground Truth Fruit Count (R² = 0.810), Normalised Pixel Count per Image vs Ground Truth Fruit Count (R² = 0.806) and Normalised Pixel Count per Image vs Ground Truth Fruit Weight in kg (R² = 0.742). Positive correlation is observed in all plots. The strongest correlation occurred between the algorithm fruit count and the ground truth fruit count, demonstrating the algorithm's ability to infer the actual fruit count. The row naming convention is according to the map shown in Fig. 3.

Fig. 7 Calibrated algorithm count versus ground truth count (calibrated yield estimation and ground truth for green and red apples, plotted as apple count per row side): The algorithm and ground truth fruit count matched well for the first 10 sides. The row naming convention is according to the map shown in Fig. 3.



This work focused on fruit segmentation and counting in outdoor natural lighting conditions. The same method used here could be applied to natural or controlled lighting, as long as training data were supplied for the chosen condition. We expect the performance of this method to be higher in controlled conditions than under natural lighting, due to the increased stability of the appearance of the fruit, but this remains to be verified experimentally.

One limitation of the algorithm is that it currently does not run in real time; instead it runs at 30 seconds per frame. This is acceptable for yield estimation tasks but will need further optimisation for future autonomous harvesting applications. There are two potential solutions. The first is to down-sample the image: the current image resolution of 1232 × 1616 pixels is more than enough to resolve large fruit such as apples. The second is to process a smaller subset of frames with reduced overlap, while still maintaining full coverage.

7 Conclusion

This paper presented an approach to multi-class image segmentation, applied to apple fruit segmentation. The algorithm has previously been applied successfully to almond segmentation, and this paper showed that, given new training data, the algorithm can be generalised to apple segmentation without modification. In addition, the experiments showed that by sampling the classification output with the aid of the navigation data, the algorithm is able to provide reliable apple yield estimation.

Compared to existing work, the algorithm developed in this paper is able to work in natural lighting conditions and to detect both red and green apples using images from a monocular camera.

Acknowledgements We would like to thank Kevin Sanders for his support during the field trials at his apple orchard. This work is supported in part by the Australian Centre for Field Robotics at The University of Sydney and Horticulture Australia Limited through project AH11009 Autonomous Perception Systems for Horticulture Tree Crops.

References

1. Q. Zhang and F. J. Pierce, Agricultural Automation: Fundamentals and Practices. CRC Press, 2013.

2. C. Hung, J. Nieto, Z. Taylor, J. Underwood, and S. Sukkarieh, "Orchard fruit segmentation using multi-spectral feature learning," in IEEE/RSJ International Conference on Intelligent Robots and Systems, 2013.


3. D. Dey, L. Mummert, and R. Sukthankar, "Classification of plant structures from uncalibrated image sequences," in Applications of Computer Vision (WACV), 2012 IEEE Workshop on. IEEE, 2012, pp. 329–336.

4. S. Nuske, S. Achar, T. Bates, S. Narasimhan, and S. Singh, "Yield estimation in vineyards by visual grape detection," in Intelligent Robots and Systems (IROS), 2011 IEEE/RSJ International Conference on. IEEE, 2011, pp. 2352–2358.

5. A. Payne, K. Walsh, P. Subedi, and D. Jarvis, "Estimation of mango crop yield using image analysis – segmentation method," Computers and Electronics in Agriculture, vol. 91, pp. 57–64, 2013.

6. D. Bulanon, T. Burks, and V. Alchanatis, "Image fusion of visible and thermal images for fruit detection," Biosystems Engineering, vol. 103, no. 1, pp. 12–22, 2009.

7. Q. Wang, S. Nuske, M. Bergerman, and S. Singh, "Automated crop yield estimation for apple orchards," in 13th International Symposium on Experimental Robotics, 2012.

8. A. Aggelopoulou, D. Bochtis, S. Fountas, K. C. Swain, T. Gemtos, and G. Nanos, "Yield prediction in apple orchards based on image processing," Precision Agriculture, vol. 12, no. 3, pp. 448–456, 2011.

9. A. Jimenez, R. Ceres, J. Pons et al., "A survey of computer vision methods for locating fruit on trees," Transactions of the ASAE (American Society of Agricultural Engineers), vol. 43, no. 6, pp. 1911–1920, 2000.

10. R. Szeliski, R. Zabih, D. Scharstein, O. Veksler, V. Kolmogorov, A. Agarwala, M. Tappen, and C. Rother, "A comparative study of energy minimization methods for Markov random fields with smoothness-based priors," IEEE Transactions on Pattern Analysis and Machine Intelligence, pp. 1068–1080, 2007.

11. Y. Boykov and M. Jolly, "Interactive graph cuts for optimal boundary & region segmentation of objects in N-D images," in Computer Vision, 2001. ICCV 2001. Proceedings. Eighth IEEE International Conference on, vol. 1. IEEE, 2001, pp. 105–112.

12. C. Rother, V. Kolmogorov, and A. Blake, "GrabCut: Interactive foreground extraction using iterated graph cuts," in ACM Transactions on Graphics (TOG), vol. 23, no. 3. ACM, 2004, pp. 309–314.

13. L. Ladicky, C. Russell, P. Kohli, and P. Torr, "Associative hierarchical CRFs for object class image segmentation," in Computer Vision, 2009 IEEE 12th International Conference on. IEEE, 2009, pp. 739–746.

14. H. Lee, C. Ekanadham, and A. Ng, "Sparse deep belief net model for visual area V2," Advances in Neural Information Processing Systems, vol. 19, 2007.

15. T. J. Atherton and D. J. Kerbyson, "Size invariant circle detection," Image and Vision Computing, vol. 17, no. 11, pp. 795–803, 1999.