We’ve long left “chicken-bucket” lidars in the rear-view mirror. Is the spray-painting approach to point cloud capture the next to go? To achieve the $100 cost target for mass-production lidar, providers need to embrace smart sensing, says Ollie Mathews.
The need for smarter sensing
An interesting reason for the recent chip shortage in the automotive industry is that many car manufacturers have built their vehicles on outdated generations of silicon, which are low on chip providers’ priorities. They have done this to minimise hardware costs — the tight cost margins in this industry mean OEMs always need to make the most out of the cheapest hardware possible.
This approach does not marry well with current lidar offerings. In recent years, lidar manufacturers have succeeded in meeting the high demands of autonomous vehicles on Field of View (FoV), resolution and sample rates. However, in doing so they have produced complex devices that tend to cost closer to $1000 than $100.
Smart sensing is a way to embrace the ethos of doing more with less, allowing lidar manufacturers to meet those tough requirements at lower cost.
The core idea behind smart sensing is to maximise the information density of measurements by being more selective about which samples are taken. This reduces the number of measurements required, leading to cheaper sensors that are also more useful to OEMs, whose Electronic Control Units (ECUs) can get the same information without needing to trawl through millions of points per second.
In this post we discuss:
- What we mean by smart sensing in lidar
- How to identify Regions of Interest (ROIs) in a point cloud
- How to leverage that information, especially with foveating lidar
- The commercial implications in terms of hardware cost
Smart sensing in lidar
For a lidar, we can think of smart sensing in terms of “point cloud efficiency”: broadly, the proportion of points in a point cloud that carry useful information about the scene.
In commercial lidar, point cloud efficiencies are often cited as being as low as 10% [1, 2]. The traditional approach is to uniformly sample the entire Field of View (FoV), effectively spray painting the scene with laser points. This inevitably yields a lot of data points on the ground and in the sky, but very little information about the objects that perception algorithms care about.
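As a rough illustration, the metric can be computed from per-point flags. This is only a sketch: it assumes point cloud efficiency means the fraction of sampled points that land on something of interest, and the function name and the example frame are made up for the purpose.

```python
from typing import Sequence


def point_cloud_efficiency(is_useful: Sequence[bool]) -> float:
    """Fraction of points flagged as useful (assumed definition)."""
    if not is_useful:
        return 0.0
    return sum(is_useful) / len(is_useful)


# Hypothetical frame of 1000 points in which 1 point in 10 hits an object
# of interest and the rest land on the ground or the sky.
flags = [i % 10 == 0 for i in range(1000)]
efficiency = point_cloud_efficiency(flags)
```

With a uniformly sampled frame like this one, nine out of every ten measurements are spent on regions the perception stack will largely discard.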
The smart sensing alternative is to be more selective about which parts of the FoV the lidar samples, sampling densely in Regions of Interest (ROIs) and sparsely elsewhere. Doing this requires two things: more computation needs to be pushed into the lidars to identify the ROIs; and lidars need to be able to foveate to focus in on those ROIs.
Identifying Regions of Interest (and less interest) in a point cloud
Identifying ROIs means adding some logic to the device so that it can pick out those regions within its field of view. There are different ways of defining and identifying these regions, each with a different computational complexity.
A computationally light way to identify ROIs is to flag individual points as they come in by means of simple physical heuristics. For example, a lidar can easily flag points that are far away using the range already available from the Time of Flight calculation. Distant objects take up a smaller proportion of the field of view, so they are especially poorly represented by uniform sampling.
Another heuristic is to look for particularly reflective points, which tend to belong to man-made and more interesting objects, like the number plate on a car. Both approaches require little extra logic and allow the lidar to focus on important points.
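The two heuristics above could look something like the sketch below. The `Return` record, the thresholds and the assumption that intensity arrives already normalised and range-compensated are all hypothetical; a real device would tune these to its own optics and detector.

```python
from dataclasses import dataclass

SPEED_OF_LIGHT_M_S = 299_792_458.0


@dataclass
class Return:
    time_of_flight_s: float  # round-trip time of the laser pulse
    intensity: float         # assumed normalised to [0, 1] and range-compensated


def range_m(ret: Return) -> float:
    """Range from time of flight: the pulse travels out and back."""
    return SPEED_OF_LIGHT_M_S * ret.time_of_flight_s / 2.0


def is_interesting(ret: Return,
                   far_threshold_m: float = 80.0,
                   bright_threshold: float = 0.8) -> bool:
    """Flag a point if it is far away or unusually reflective."""
    return range_m(ret) >= far_threshold_m or ret.intensity >= bright_threshold


# Hypothetical returns: a distant object, a nearby number plate, nearby tarmac.
far_object = Return(time_of_flight_s=1e-6, intensity=0.1)
number_plate = Return(time_of_flight_s=1e-7, intensity=0.9)
tarmac = Return(time_of_flight_s=1e-7, intensity=0.1)
flags = [is_interesting(r) for r in (far_object, number_plate, tarmac)]
```

Both checks are a single comparison per point, which is why they suit logic that runs as each return arrives.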
Pushing more computation into the lidar could allow for ROI identification at the object level. Algorithms like RANSAC can be used to identify common planes in point clouds, like buildings and the ground. These regions tend to be less interesting, and by identifying them, a lidar can ensure it focuses on more interesting objects in a scene.
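A minimal RANSAC plane fit can be sketched in plain Python as follows. The inlier threshold, iteration count and toy scene are assumptions chosen for illustration, and an embedded implementation would look quite different.

```python
import random
from typing import List, Optional, Sequence, Tuple

Point = Tuple[float, float, float]
Plane = Tuple[Tuple[float, float, float], float]  # (unit normal, offset d)


def plane_from_points(p: Point, q: Point, r: Point) -> Optional[Plane]:
    """Plane through three points as normal . x + d = 0, or None if collinear."""
    ux, uy, uz = q[0] - p[0], q[1] - p[1], q[2] - p[2]
    vx, vy, vz = r[0] - p[0], r[1] - p[1], r[2] - p[2]
    nx, ny, nz = uy * vz - uz * vy, uz * vx - ux * vz, ux * vy - uy * vx
    norm = (nx * nx + ny * ny + nz * nz) ** 0.5
    if norm < 1e-12:
        return None
    nx, ny, nz = nx / norm, ny / norm, nz / norm
    return (nx, ny, nz), -(nx * p[0] + ny * p[1] + nz * p[2])


def ransac_plane(points: Sequence[Point], threshold: float = 0.05,
                 iterations: int = 200, seed: int = 0) -> List[int]:
    """Indices of the points lying on the best-supported plane found."""
    rng = random.Random(seed)
    best: List[int] = []
    for _ in range(iterations):
        plane = plane_from_points(*rng.sample(points, 3))
        if plane is None:
            continue
        (nx, ny, nz), d = plane
        inliers = [i for i, (x, y, z) in enumerate(points)
                   if abs(nx * x + ny * y + nz * z + d) <= threshold]
        if len(inliers) > len(best):
            best = inliers
    return best


# Toy scene: a flat 10 x 10 patch of ground plus a small raised obstacle.
ground = [(float(x), float(y), 0.0) for x in range(10) for y in range(10)]
obstacle = [(5.0 + 0.1 * i, 5.0, 0.5 + 0.1 * i) for i in range(10)]
cloud = ground + obstacle
plane_idx = set(ransac_plane(cloud))
roi_points = [p for i, p in enumerate(cloud) if i not in plane_idx]
```

Everything that falls on the dominant plane is treated as low interest, and what remains becomes a candidate ROI.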
With still more computing power, a lidar could also identify moving objects in a scene, which is crucial to a vehicle’s ability to respond to unexpected changes. The animation below illustrates this for a parking scene. The video on the left shows the original scene, while we have applied our background removal algorithm to the video on the right. Once the algorithm has learnt to distinguish moving objects, the ROIs shown encompass less than 10% of the points taken, without losing information about dynamic objects.
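For a stationary sensor, one simple way to separate moving objects from the background is to remember the farthest range seen along each beam direction and flag anything that appears clearly in front of it. The sketch below illustrates that idea only; it is not the algorithm used for the animation, and the class name, cell indexing and margin are assumptions.

```python
from typing import Dict, List, Tuple

Cell = Tuple[int, int]  # (azimuth index, elevation index) of a beam direction


class BackgroundModel:
    """Per-beam background ranges for a lidar that does not move."""

    def __init__(self, margin_m: float = 0.3) -> None:
        self.margin_m = margin_m
        self.background_range: Dict[Cell, float] = {}

    def learn(self, frame: Dict[Cell, float]) -> None:
        # Keep the farthest range per beam, so an object passing through
        # during learning does not overwrite the wall or ground behind it.
        for cell, rng in frame.items():
            self.background_range[cell] = max(
                rng, self.background_range.get(cell, 0.0))

    def foreground(self, frame: Dict[Cell, float]) -> List[Cell]:
        # Beams never seen during learning count as foreground (conservative).
        return [cell for cell, rng in frame.items()
                if rng < self.background_range.get(cell, float("inf"))
                - self.margin_m]


# Hypothetical parking scene: 100 beams looking at a wall 20 m away,
# then a car enters at 8 m and covers five of them.
model = BackgroundModel()
empty_car_park = {(az, 0): 20.0 for az in range(100)}
model.learn(empty_car_park)
frame = dict(empty_car_park)
for az in range(40, 45):
    frame[(az, 0)] = 8.0
moving = model.foreground(frame)
roi_fraction = len(moving) / len(frame)
```

In this toy frame the ROI covers 5% of the beams, which is the kind of reduction the parking example describes.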
Removing the background as seen from a moving vehicle is more complex and would require an implementation of Simultaneous Localisation and Mapping (SLAM) to distinguish absolute from relative motion. A lidar could go even further and calculate the movement of objects in a scene to focus ROIs on where objects will be in the next frame.
The simplest of these approaches could easily be implemented on the powerful FPGAs already in lidars. The more complex methods could also be handled by FPGAs, although doing so might increase silicon costs. In the following sections, we look at why these costs are still justified.
Closing the feedback loop
After metrics and algorithms have been applied to identify ROIs, a smart lidar needs to act on that information. It can do this in two ways:
- The lidar can filter the point clouds to reduce the amount of useless data sent to the ECU
- The information can be used to inform the acquisition of future points, making the lidar focus on sampling more points in Regions of Interest, and fewer points overall
The first of these requires no changes to the mechanics of the lidar. It is part of the approach used by Outsight to make Lidar Boxes, which provide an interface with a lidar. This device uses SLAM to process the lidar point clouds in real time, providing background object information and a reduced number of more informative points to a neural network that can classify objects in the data.
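The second option can be pictured as splitting a fixed budget of points per frame across angular cells according to how interesting each cell was in the previous frame. The following is a sketch under assumed names: the integer interest scores, the per-cell floor and the largest-remainder rounding are illustrative choices, not a description of any particular product.

```python
from typing import List, Sequence


def allocate_samples(interest: Sequence[int], budget: int,
                     floor_per_cell: int = 1) -> List[int]:
    """Split `budget` points across cells in proportion to integer scores.

    Every cell keeps at least `floor_per_cell` points so that new objects
    entering a quiet region can still be noticed.
    """
    n = len(interest)
    if budget < n * floor_per_cell:
        raise ValueError("budget too small for the per-cell floor")
    spare = budget - n * floor_per_cell
    weights = list(interest) if sum(interest) > 0 else [1] * n
    total = sum(weights)
    # Integer division keeps the split exact; remainders decide who gets
    # the few points left over after rounding down.
    parts = [divmod(spare * w, total) for w in weights]
    counts = [floor_per_cell + quotient for quotient, _ in parts]
    leftover = budget - sum(counts)
    by_remainder = sorted(range(n), key=lambda i: parts[i][1], reverse=True)
    for i in by_remainder[:leftover]:
        counts[i] += 1
    return counts


# Hypothetical frame with four cells: sky, ground, a pedestrian, a parked car.
next_frame = allocate_samples([0, 0, 8, 2], budget=104)
```

The floor is the design choice worth noting: without it, a region scored as uninteresting would never be sampled again and could not be re-evaluated.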
While this is an effective approach to reducing the bandwidth of the data sent to the ECU, it does not tackle the cost of sampling points which are not needed. The key advantages of smart sensing come from bringing hardware into the feedback loop. This means the lidar needs to foveate, which for a scanning lidar means altering the scanning pattern to sample more densely in Regions of Interest.
Crucially, not having to sample the entire FoV at the required resolution in each frame means the number of points per scan can be reduced dramatically. In theory, gains in point cloud efficiency can allow for proportional drops in sample rate without losing any information. This directly translates into lower hardware cost for lidar manufacturers.
Commercial factors in implementation
The advantages of smart sensing to the OEM lie in the reduction in the amount of data the ECU needs to process, allowing more time for classifying the data and making good decisions. Moreover, the ECU could also be added into the feedback loop, making it possible for the lidar to pay more attention to objects the ECU is struggling to identify.
The commercial benefit to the lidar manufacturer is direct: sampling fewer points means lower hardware costs.
Current commercial lidars can have as many as five TX and RX channels running in parallel, with all the signal processing parts (including expensive fast ADCs) repeated for each. Increasing the point cloud efficiency from 10% to 50% would mean that only one channel needs to run: a single channel at 50% efficiency delivers as many useful points as five channels at 10%, so no information is lost while silicon costs are cut dramatically. Even in systems that are not as heavily multiplexed, sampling fewer points means cheaper, lower-frequency components can be used.
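The arithmetic behind the five-channel example can be checked in a few lines. This sketch assumes every channel samples at the same rate, so useful output scales with channels multiplied by efficiency; the function name is made up.

```python
def channels_needed(baseline_channels: int, baseline_eff_pct: int,
                    new_eff_pct: int) -> int:
    """Channels required to match the baseline's useful points per second.

    Efficiencies are whole percentages so the arithmetic stays exact.
    """
    useful_units = baseline_channels * baseline_eff_pct
    # Ceiling division: a fraction of a channel still needs a whole one.
    return max(1, -(-useful_units // new_eff_pct))


# Five channels at 10% efficiency versus a foveating lidar at 50%.
foveated_channels = channels_needed(5, 10, 50)
```

The same relation shows that even a more modest gain, say from 10% to 25%, would cut five channels to two.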
Moreover, many lidar architectures already have the in-built ability to foveate. In MEMS-based systems, for example, timing must already be tightly controlled to achieve uniform sampling under sinusoidal mirror motion, and there is no added complexity in achieving a non-uniform pattern.
This raises the question of why more lidar companies are not implementing smart sensing in their devices. Perhaps OEMs would hesitate to take up products that provide semantic information about a scene, because they are used to dealing with point clouds themselves and are reluctant to take the unknown risk of relying on a lidar provider’s inferences. To reach the low cost they want, however, OEMs are likely to accept some more deterministic algorithms like SLAM being pushed onto edge devices.
The true reason foveating lidars are not ubiquitous is that they have not been needed yet. However, as providers start to look at actually achieving the much-talked-about $100 cost target, they will need to find ways to cut costs without affecting performance. With autonomous vehicles also moving faster and faster, an OEM might well look favourably at a lidar that does not require them to motor through 1 million points per second to get the information they need.
So perhaps the question is not why foveating lidars with edge processing have not been implemented yet, but how soon they will be embraced as an essential part of a cost-effective system.
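To see why a non-uniform pattern costs nothing extra, consider a single-axis mirror whose angle follows a sine wave. Hitting any chosen angle means firing at the time given by the inverse sine, whether the chosen angles are evenly spaced or not. The sketch below assumes an idealised mirror, and the amplitude, frequency and angle lists are hypothetical.

```python
import math
from typing import List, Sequence


def firing_times(target_angles_deg: Sequence[float], amplitude_deg: float,
                 mirror_freq_hz: float) -> List[float]:
    """Laser firing times for a mirror at amplitude * sin(2*pi*f*t).

    Times are measured from the mirror's zero crossing on the rising
    half-sweep, so they fall within plus or minus a quarter period.
    """
    omega = 2.0 * math.pi * mirror_freq_hz
    times = []
    for angle in target_angles_deg:
        if abs(angle) > amplitude_deg:
            raise ValueError("target angle is outside the mirror's sweep")
        times.append(math.asin(angle / amplitude_deg) / omega)
    return times


# Uniform pattern versus one that samples densely around the centre.
uniform_angles = [-10.0, -5.0, 0.0, 5.0, 10.0]
foveated_angles = [-10.0, -2.0, -1.0, 0.0, 1.0, 2.0, 10.0]
t_uniform = firing_times(uniform_angles, amplitude_deg=12.0, mirror_freq_hz=1000.0)
t_foveated = firing_times(foveated_angles, amplitude_deg=12.0, mirror_freq_hz=1000.0)
```

Evenly spaced angles already require unevenly spaced firing times, so switching to a foveated pattern only changes the list of target angles fed into the same timing calculation.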