Around a year ago, a former embedded systems designer from the Japanese automobile industry named Makoto Koike, started helping out at his parents’ cucumber farm, and was amazed by the amount of work it takes to sort cucumbers by size, shape, colour and other attributes.
In Japan, each farm has its own classification standard and there's no industry standard. At Makoto's farm, they sort them into nine different classes, and his mother sorts them all herself — spending up to eight hours per day at peak harvesting times.
There are some automatic sorters on the market, but they have limitations in terms of performance and cost, and small farms don't tend to use them.
The many uses of deep learning
Makoto first got the idea to explore machine learning for sorting cucumbers from a wildly different source - Google AlphaGo - competing with the world's top professional Go player.
"When I saw the Google's AlphaGo, I realized something really serious is happening here,” said Makoto. “That was the trigger for me to start developing the cucumber sorter with deep learning technology."
Using deep learning for image recognition allows a computer to learn from a training data set what the important "features" of the images are. By using a hierarchy of numerous artificial neurons, deep learning can automatically classify images with a high degree of accuracy. Thus, neural networks can recognise different species of cats, or models of cars or airplanes from images. Sometimes neural networks can exceed the performance of the human eye for certain applications.
TensorFlow democratises the power of deep learning
But can computers really learn mom's art of cucumber sorting? Makoto set out to see whether he could use deep learning technology for sorting using Google's open source machine learning library, TensorFlow.
"Google had just open sourced TensorFlow, so I started trying it out with images of my cucumbers,” Makoto said. “This was the first time I tried out machine learning or deep learning technology, and right away got much higher accuracy than I expected. That gave me the confidence that it could solve my problem."
Pushing the limits of deep learning
One of the current challenges with deep learning is that you need to have a large number of training datasets. To train the model, Makoto spent about three months taking 7,000 pictures of cucumbers sorted by his mother, but it’s probably not enough.
"When I did a validation with the test images, the recognition accuracy exceeded 95%. But if you apply the system with real use cases, the accuracy drops down to about 70%. I suspect the neural network model has the issue of "overfitting" (the phenomenon in neural network where the model is trained to fit only to the small training dataset) because of the insufficient number of training images."
The second challenge of deep learning is that it consumes a lot of computing power. The current sorter uses a typical Windows desktop PC to train the neural network model. Although it converts the cucumber image into 80 x 80 pixel low-resolution images, it still takes two to three days to complete training the model with 7,000 images.
"Even with this low-res image, the system can only classify a cucumber based on its shape, length and level of distortion. It can't recognise colour, texture, scratches and prickles,” Makoto explained. Increasing image resolution by zooming into the cucumber would result in much higher accuracy, but would also increase the training time significantly.
To improve deep learning, some large enterprises have started doing large-scale distributed training, but those servers come at an enormous cost. Google offers Cloud Machine Learning (Cloud ML), a low-cost cloud platform for training and prediction that dedicates hundreds of cloud servers to training a network with TensorFlow. With Cloud ML, Google handles building a large-scale cluster for distributed training, and you just pay for what you use, making it easier for developers to try out deep learning without making a significant capital investment.