The Controlled Environment Agriculture Open Data project aims to advance controlled environment research, machine learning and artificial intelligence through the collection and dissemination of crop production data.
by David Kuack
Private companies and university researchers are generating a considerable amount of data on controlled environment crop production, covering ornamentals, food crops, and cannabis. One question is whether all this data is being used to its full potential to benefit the horticulture industry.
“Data has become a big topic in the horticulture industry with university researchers and private companies,” said Erico Mattos, executive director of the Greenhouse Lighting and Systems Engineering (GLASE) consortium. “People can identify with the challenges and opportunities with the amount of data that is being generated. However, we don’t yet have a centralized repository and a standard methodology for storage to allow us to explore and exploit this data.”
Addressing the data proliferation
In 2018, during the North Central Extension & Research Activity–101 (NCERA-101) meeting, members of this USDA-organized committee discussed what should be done with the extensive amount of data being generated by controlled environment researchers. Ohio State University professor Chieri Kubota proposed forming a sub-committee to address the need for guidelines on sharing data generated by controlled environment agriculture researchers.
“Dr. Kubota initiated the discussion about the need for a centralized platform to store data collected from controlled environment research,” Mattos said. “A task force was formed that included Chieri, Kale Harbick at USDA-ARS, Purdue University professor Yang Yang, Melanie Yelton at Plenty and myself. Since the task force was formed, Ken Tran at Koidra and Timothy Shelford at Cornell University have also become members of the task force.
“We started discussing how we could make use of all this data. Researchers in the United States collect a huge amount of data. All of the environmental data such as temperature, relative humidity, and carbon dioxide and light levels in controlled environment research is collected. There is also a biological set of data which includes plant biomass and fruit yield.”
Mattos said private companies also generate and collect a great deal of research data that is not shared with the horticulture industry.
“With the advancement in sensors and environmental controls, the capability now exists that this data can be collected,” he said. “With the advancements in computing power, this data can be used to start new applications and new tools that haven’t been available before. However, in order to do this, we have to have access to a large amount of data. That’s why the task force thought it would be good to create a repository where researchers and private companies could share the data following a specific format. This data could then be used in the advancement of machine learning and artificial intelligence applications to optimize crop yields in commercial CEA operations.”