Plant Identification Model

Is there any plan to utilize (or fine-tune) an existing plant identification model (e.g., Seek, iNaturalist, Agrobase, etc.), or is the vision to build one from scratch? Adding Pl@ntNet to the list, which can be called via its API.
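To make the Pl@ntNet option concrete, here is a hedged sketch of what calling its public v2 identification API looks like. The endpoint shape and parameter names below are based on my understanding of Pl@ntNet's documented API and should be verified against their current docs; the API key is a placeholder.

```python
# Sketch: querying the Pl@ntNet identification API for a single image.
# Endpoint and field names assumed from Pl@ntNet's public v2 API docs.
import urllib.parse

PLANTNET_BASE = "https://my-api.plantnet.org/v2/identify"

def build_identify_url(project: str, api_key: str) -> str:
    """Build the identify endpoint URL; project 'all' covers every flora."""
    query = urllib.parse.urlencode({"api-key": api_key})
    return f"{PLANTNET_BASE}/{urllib.parse.quote(project)}?{query}"

# The image itself goes up as multipart/form-data under the field name
# 'images', with a matching 'organs' field ('leaf', 'flower', ...).
# With the third-party `requests` library that would look roughly like:
#
#   resp = requests.post(build_identify_url("all", key),
#                        files={"images": open("roi.jpg", "rb")},
#                        data={"organs": "leaf"})
#   best = resp.json()["results"][0]["species"]

url = build_identify_url("all", "YOUR-API-KEY")
print(url)
```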

Is the assumption that the model will be a DNN?


Great question.

I don’t have plans for a specific model right now. The new vision system is installed and I’m working on some networking magic so I can talk to it easily. Then I will start collecting data with it and see how the images look.

I imagine a DNN would be used. There's some very interesting vision work going on with transformers right now, but the networks are often large, and I imagine it will be a few years at least before they can run on edge compute.

I do absolutely love CLIP for the way it ingests training data.

I have done some work to run real time image segmentation on my personal robot Rover.

I don’t have the latest work I’ve done in that thread, but it describes what I have been working on.

However with Rover the Nvidia example network is slow, so I am planning on checking out this network:

I am very interested in self-supervised learning, but I've not gone into what it would take for our application. Ideally I would want a network that can automatically learn to differentiate between different plants without explicit labels. My plan is to start collecting a bit of data from the new cameras and then start talking to researchers.

I just came across this paper today. I’ve only read the abstract but it sounds interesting:

I’ve thought there must be some way to automatically differentiate different plants based on the idea that for a given plant, it will always be near parts of itself, while neighboring plants will often but not always be nearby. However I am a relative novice when it comes to deep learning, so this is an area where open collaboration is a must. Luckily it’s an application I think many will agree is worth working on.
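That proximity intuition can be turned into a contrastive-learning pair sampler: patches cropped near each other probably belong to the same plant and become positive pairs, while distant patches become negatives. This is a hypothetical sketch of the sampling step only, with made-up radii; the actual thresholds would have to come from real field data.

```python
# Hypothetical sketch: spatial proximity as a supervision signal.
# Patches within pos_radius are assumed to be the same plant (positives);
# patches at least neg_radius apart are assumed different (negatives).
import math

def sample_pairs(patch_centers, pos_radius, neg_radius):
    """Return (positive, negative) index pairs based on patch distance."""
    positives, negatives = [], []
    for i, (xi, yi) in enumerate(patch_centers):
        for j, (xj, yj) in enumerate(patch_centers):
            if j <= i:
                continue
            d = math.hypot(xi - xj, yi - yj)
            if d <= pos_radius:
                positives.append((i, j))   # likely same plant
            elif d >= neg_radius:
                negatives.append((i, j))   # likely different plants
    return positives, negatives

centers = [(0, 0), (5, 0), (100, 100)]
pos, neg = sample_pairs(centers, pos_radius=10, neg_radius=50)
print(pos, neg)  # [(0, 1)] [(0, 2), (1, 2)]
```

The pairs would then feed a standard contrastive loss over patch embeddings; the ambiguous middle band (between the two radii) is simply discarded rather than guessed at.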

Also we only have a Jetson Nano for compute right now which is fine for collecting data, but I imagine we would upgrade to an Xavier or better for running inference onboard.


Inference is latency-tolerant computing, so it doesn't necessarily need to be onboard if you can reliably communicate at full bandwidth (even with high latency) to a ground station. Likewise, the question may be whether or not inferencing will be a full-time function. It may be possible to distribute the task among a "swarm" to bolster the total fielded inferencing capacity: more rovers or ground stations mean faster per-instance inferencing until everything is at full duty cycle.
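The swarm idea boils down to a scheduling problem: each captured frame goes to whichever worker will finish soonest. A minimal sketch, assuming per-frame processing times that are purely illustrative:

```python
# Sketch: greedy least-finish-time scheduling of frames across a "swarm".
# Worker names and seconds-per-frame figures are made-up assumptions.
import heapq

def distribute(frames, workers):
    """Assign each frame to the worker that will finish it soonest.

    workers maps name -> seconds per frame; returns name -> list of frames.
    """
    heap = [(0.0, name) for name in sorted(workers)]  # (next-free time, name)
    heapq.heapify(heap)
    assignment = {name: [] for name in workers}
    for f in frames:
        finish, name = heapq.heappop(heap)
        assignment[name].append(f)
        heapq.heappush(heap, (finish + workers[name], name))
    return assignment

fleet = {"ground": 0.1, "rover": 0.5}  # assumed seconds per frame
plan = distribute(range(6), fleet)
print(plan)  # {'ground': [0, 2, 3, 4, 5], 'rover': [1]}
```

The faster ground station naturally absorbs most frames, which matches the point above: extra capacity raises throughput until every unit is at full duty cycle.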

As for gathering the data, my personal ideal is to move toward no-till, biodiverse plantings. Using the concept that the identifying portions of a plant will be near other parts of the same plant is a great one, I think, with my limited knowledge.

If the rovers are initially dealing with rows each planted with a single known crop, then plants that appear across multiple differently planted rows are not the intentional plants, or "weeds." We can also compare images of plants in a controlled situation like a greenhouse to the same variety in the field.
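That cross-row rule is just a set intersection over detections. A toy sketch, where the species labels stand in for whatever the vision system actually produces:

```python
# Sketch of the cross-row weed heuristic: any species detected in two rows
# planted with *different* crops is probably not a crop, i.e. a weed.
def likely_weeds(row_detections, row_crops):
    """row_detections: row -> set of detected species; row_crops: row -> crop."""
    weeds = set()
    for row, detected in row_detections.items():
        for other, other_detected in row_detections.items():
            if row_crops[row] != row_crops[other]:
                weeds |= detected & other_detected  # shared across unlike rows
    return weeds

rows = {"A": {"lettuce", "pigweed"}, "B": {"carrot", "pigweed"}}
crops = {"A": "lettuce", "B": "carrot"}
print(likely_weeds(rows, crops))  # {'pigweed'}
```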

If the rovers know what’s planted where by GPS they can be constantly checking the rows and feeding the development stages back into the dataset.


It's unknown whether it is best to put compute onboard or nearby. I worry about requiring a robust Wi-Fi connection for basic operation, but that approach has its benefits.

I think it comes down to what kind of compute is required. If it can be done on an Xavier or other ~$500 edge computer, it may make sense to put it onboard. If we need a $3000 multi-GPU server and a high-end Wi-Fi network to run it, the upfront cost is higher. I prefer onboard compute, as any latency slows down overall operational speed.

First step is to build a “data engine” (see talk) and figure out what a plant-identifying network looks like:

And yes, we will incorporate GPS data and work to georeference everything in the images. We just received a new GPS system that estimates position at 100 Hz by fusing GPS and an IMU. It has an application processor which I will hook up to the camera's shutter-activation signal, so we can timestamp every single shutter activation with high accuracy. From there we can also use structure-from-motion techniques to compute the depth of the scene and localize objects. This may only be done in post-processing, but it will allow us to track the progress of every plant over time. That's gotta be useful for somethin!
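Once shutter events are timestamped, georeferencing each image is an interpolation into the 100 Hz pose stream. A minimal sketch with linear interpolation over latitude/longitude only; a real pipeline would fuse the full 3D pose from the IMU, and the coordinates below are made up:

```python
# Sketch: interpolate the 100 Hz GPS/IMU pose stream at a shutter timestamp.
from bisect import bisect_left

def interp_position(pose_times, positions, t):
    """Linearly interpolate (lat, lon) at shutter time t (seconds)."""
    i = bisect_left(pose_times, t)
    if i == 0:
        return positions[0]            # shutter before first pose estimate
    if i >= len(pose_times):
        return positions[-1]           # shutter after last pose estimate
    t0, t1 = pose_times[i - 1], pose_times[i]
    w = (t - t0) / (t1 - t0)
    (la0, lo0), (la1, lo1) = positions[i - 1], positions[i]
    return (la0 + w * (la1 - la0), lo0 + w * (lo1 - lo0))

times = [0.00, 0.01, 0.02]             # 100 Hz pose estimates
poses = [(45.0, -122.0), (45.0001, -122.0), (45.0002, -122.0)]
print(interp_position(times, poses, 0.015))  # ≈ (45.00015, -122.0)
```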


There are so many potential software and hardware solutions/combinations that it may be helpful to understand the essential use cases and perhaps create some number of alternatives to compare and contrast. This could cut across the entirety of vision, power management, movement planning, movement execution, robotic-arm planning, robotic-arm movement, etc.

There could potentially be multiple processors involved, perhaps even down to a very simple Arduino that primarily transitions the robot between Operating and Dormant states.

On the subject of self-learning, some mechanism for tagging/labeling is essential, whether directly by a human (at the time of discovery or later, offline), by deducing it from the characteristics in a Plant Traits Database, or simply by calling a plant image identification service such as Pl@ntNet with a cropped Region of Interest (ROI). Otherwise, the system won't know whether an unidentified plant is beneficial (in one form or another, such as a legitimate companion plant), neutral, or detrimental (robbing nutrients, water, or sunlight, and/or simply crowding out crop plants).