Along with the Jetson Xavier suggestions, another option might be the OAK-D, which includes stereo cameras and several spatial AI capabilities, and is OpenVINO compatible.
Worth a mention: the Intel Neural Compute Stick added to an RPi might be a lower-cost alternative to consider, or at least a comparison point.
These integrated cameras are always interesting. So far I tend to be biased towards connecting traditional cameras to general-purpose compute, which gives more flexibility in algorithm development. For example, I have a couple of these in the office that I want to test as navigation cameras:
That comes with sample apps for computing depth, though visual odometry would be nice too.
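For a sense of what those depth sample apps compute under the hood, here is a minimal sketch of the standard stereo disparity-to-depth relation. The focal length, baseline, and disparity values below are placeholders for illustration, not parameters of any particular camera:

```python
# Minimal sketch: converting stereo disparity (pixels) to depth (meters).
# Z = f * B / d, where f is focal length in pixels, B is the baseline
# between the two cameras in meters, and d is the disparity in pixels.

def disparity_to_depth(disparity_px, focal_px, baseline_m):
    """Depth in meters from a stereo disparity measurement."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px

# Example with made-up numbers: 800 px focal length, 7.5 cm baseline,
# 40 px disparity -> 1.5 m depth.
print(disparity_to_depth(40, 800, 0.075))
```

The formula also shows why depth resolution degrades with distance: far objects produce small disparities, so a one-pixel error swings the depth estimate much more than it does up close.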
Field of view matters too. Acorn can drive in any direction since it has four-wheel steering, so 360-degree coverage would be valuable. That could be achieved with four fisheye cameras at the corners. We could perhaps use these:
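As a quick sanity check on the four-corner idea, here is a small sketch that verifies whether a set of camera headings and a per-camera field of view leave any uncovered arcs around the robot. The mounting angles and the 160-degree FOV are assumptions for the example, not specs of any actual lens:

```python
# Rough coverage check: do N cameras with a given horizontal FOV,
# mounted at given headings, cover the full 360 degrees around the robot?

def coverage_gaps(headings_deg, fov_deg):
    """Return uncovered arcs as (start, end) pairs in degrees; [] = full coverage."""
    intervals = []
    for h in headings_deg:
        lo = (h - fov_deg / 2) % 360
        hi = (h + fov_deg / 2) % 360
        if lo <= hi:
            intervals.append((lo, hi))
        else:  # arc wraps past 0 degrees; split it in two
            intervals.append((lo, 360.0))
            intervals.append((0.0, hi))
    intervals.sort()
    gaps, cursor = [], 0.0
    for lo, hi in intervals:
        if lo > cursor:
            gaps.append((cursor, lo))
        cursor = max(cursor, hi)
    if cursor < 360.0:
        gaps.append((cursor, 360.0))
    return gaps

# Four cameras at the corners (45, 135, 225, 315 deg), each with an
# assumed 160-degree fisheye: no gaps, with overlap at the sides.
print(coverage_gaps([45, 135, 225, 315], 160))
```

With narrower lenses (say 80 degrees each) the same check reports blind arcs between the corners, which is the kind of thing worth knowing before ordering hardware.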
More and more I am seeing good results estimating depth from monocular cameras using structure from motion techniques:
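The core geometric step those structure-from-motion pipelines rely on is triangulating a 3D point from matched pixels in two views with known (or estimated) camera poses. Below is a toy sketch of linear (DLT) triangulation; the intrinsics, poses, and test point are all made up for the example, and a real pipeline would first have to estimate the relative pose from feature matches:

```python
import numpy as np

# Toy sketch of two-view triangulation (the DLT method): given two 3x4
# projection matrices and one pixel correspondence, recover the 3D point.

def triangulate(P1, P2, x1, x2):
    """Linear triangulation from two projection matrices and pixel coords."""
    A = np.stack([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]
    return X[:3] / X[3]  # dehomogenize

# Made-up pinhole intrinsics and a 10 cm horizontal camera translation.
K = np.array([[800., 0., 320.], [0., 800., 240.], [0., 0., 1.]])
P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = K @ np.hstack([np.eye(3), np.array([[-0.1], [0.], [0.]])])

# Project a known 3D point into both views, then triangulate it back.
X_true = np.array([0.2, -0.1, 2.0])
x1 = P1 @ np.append(X_true, 1.0); x1 = x1[:2] / x1[2]
x2 = P2 @ np.append(X_true, 1.0); x2 = x2[:2] / x2[2]
print(triangulate(P1, P2, x1, x2))  # recovers X_true in the noise-free case
```

With a monocular camera the translation between frames comes from motion estimation rather than a fixed baseline, so the recovered depths are only up to scale unless something (wheel odometry, GPS, a known object size) pins the scale down.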
But the advantage of integrated solutions is that they work out of the box. I’m not sure how much compute would be needed to run all the algorithms I want to run, nor how much integration work is required. These are tasks we might be able to get help with from some eager researchers, once the camera systems are installed. I can easily collect datasets at least.
And this is all for the navigation cameras. For the crop-facing cameras, I think we need general-purpose compute, and I think it will need to be very powerful. Just running semantic segmentation on 1000x1000-pixel images takes something like a Jetson Xavier at least.
Do you have any data you can collect at this time, or have already collected? Some of the "Papers with Code" entries reference supervised learning to speed this up, which in this case could mean a carefully calibrated crop mockup set that the robot would traverse to 'learn' depth perception.
Also found this paper, which uses an RPi:
I am not working on any navigation camera solution right now, because GPS is pretty great so far. I am more focused on the crop-facing camera system. I guess what I meant was that if we want to develop a nav-cam solution, then collecting data isn’t too hard even if I wasn’t equipped to solve the machine learning part of it. (Maybe I would be, but mostly I take existing research and deploy it).
That said, my Rover robot needs to solve the same outdoor vision problem and it already has a four camera system on it. Rover is undergoing some mechanical maintenance right now, but I am eager to start collecting new datasets for it once I get it going again. If someone was interested in the vision based navigation problem, that’s basically what I designed Rover for and I will be making its datasets public.
As for the crop camera, I have some preliminary images, but the idea of using fisheye lenses did not work out well due to insufficient depth of field. With the two 13 MP cameras underneath Acorn I can see its entire underside, but not all of the field of view is in focus. I will share some images anyway, but I think it needs more work before we can start producing useful datasets.
As a stopgap, we could use some GoPro cameras to start collecting initial data while I work on the optics for the intended vision system. I'll have to throw our GoPro under there and see how workable that data is. I don't think it would be useful for realtime processing (as far as I know), but it would be fine for post-processing and testing a training pipeline.