As previously mentioned, a limitation of the conventional approach to stereo driving is that it relies on precise metric calibration with respect to an external calibration target in order to convert matches to 3-D points. From a practical standpoint, this is a serious limitation in scenarios in which the sensing hardware cannot be physically accessed, such as in the case of planetary exploration. In particular, this limitation implies that the vision system must remain perfectly calibrated over the course of an entire mission. From a philosophical point of view, navigation should not require the precise knowledge of the 3-D position of points in the scene: What is important is how much a point deviates from the reference ground plane, not its exact position.
Based on these observations, we developed an approach in which a relative measure of height with respect to a ground plane is computed from the matches without requiring knowledge of the full set of camera parameters. This height is relative in the sense that it is a multiple of the true height by an unknown scale factor. We now describe the construction of the relative height in detail. The geometry described below has been used in earlier work in which a point is classified as belonging to one of two halfspaces based on its projections in two uncalibrated images [11].
Let us consider first a flat ground plane observed by two cameras. We assume that the only known information about the geometry of the cameras is the epipolar geometry. Let P be a generic point on the plane and p_l and p_r be its projections in the left and right images, respectively. We represent points in the image plane by 2-D projective coordinates, p = (u, v, 1)^T, where the usual Cartesian image coordinates are u and v. It can be easily shown that for any point P on the plane, the projections p_l and p_r are related by a linear projective transformation, or homography, H: p_l = H p_r. In this relation, the symbol = means that the two sides are equal in the projective sense, i.e., that their coordinates are proportional. Intuitively, H maps a pixel from the right image to its location in the left image assuming that the corresponding 3-D scene point lies on the plane (Figure 2(a)).
The homography H is a 3x3 matrix defined up to a scale factor. H can be easily estimated from real images in the following way. First, features p_l^i are selected in the left image and the corresponding pixels p_r^i in the right image are computed using the algorithm of Section 2.1. Then, the parameters of H are computed by minimizing the least-squares criterion: sum_i || p_l^i - H p_r^i ||^2. The features used in the computation of H may be anywhere in the image. Moreover, computing H does not require any information on the actual 3-D positions of the scene points used as features.
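As a concrete illustration, the least-squares fit of the homography from matched features can be sketched with the standard direct linear transform (DLT); the function name and the use of an SVD are our own choices for the sketch, not the paper's implementation:

```python
import numpy as np

def fit_homography(pts_right, pts_left):
    """Estimate the 3x3 homography H such that p_left ~ H p_right.

    pts_right, pts_left: (N, 2) arrays of matched pixel coordinates.
    Each match gives two linear constraints on the 9 entries of H;
    the least-squares solution is the right singular vector of the
    constraint matrix associated with the smallest singular value.
    """
    pts_right = np.asarray(pts_right, dtype=float)
    pts_left = np.asarray(pts_left, dtype=float)
    rows = []
    for (x, y), (u, v) in zip(pts_right, pts_left):
        # From u = (h1 . p) / (h3 . p) and v = (h2 . p) / (h3 . p)
        rows.append([x, y, 1, 0, 0, 0, -u * x, -u * y, -u])
        rows.append([0, 0, 0, x, y, 1, -v * x, -v * y, -v])
    A = np.asarray(rows)
    _, _, vt = np.linalg.svd(A)
    H = vt[-1].reshape(3, 3)
    return H / H[2, 2]  # remove the free projective scale
```

Note that nothing in this fit refers to 3-D coordinates: only pixel correspondences enter the criterion, which is the point of the construction.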
We now show that H is all we need to compute a relative elevation map. Consider a world point P not necessarily on the ground plane and its projections p_l and p_r. Let us assume that we also have defined once and for all a "reference" point O described by its projections o_l and o_r. O may be anywhere in space so long as it is not on the reference plane. Of course the point O is not known; only its projections in the images are known. Let us consider now the image points q = H^{-1} p_l and q_o = H^{-1} o_l. Point q (resp. q_o) is the point at which P (resp. O) would be projected in the right image if it were on the ground plane. Finally, consider the point s, the intersection in the right image of the two segments p_r o_r and q q_o. Since p_r o_r is the projection of the segment PO and q q_o is the projection of a line segment contained in the plane, the intersection point s must be the projection of the intersection of PO with the reference plane (Figure 2(b)).
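The intersection used in this construction is conveniently computed in homogeneous coordinates: the line through two image points is their cross product, and the meet of two lines is again a cross product. A minimal sketch (function names are ours):

```python
import numpy as np

def hom(p):
    """Lift a 2-D pixel to homogeneous coordinates."""
    return np.array([p[0], p[1], 1.0])

def intersect(p1, p2, p3, p4):
    """Intersection of the line through p1, p2 with the line
    through p3, p4; all arguments are 2-D pixel coordinates.
    """
    line_a = np.cross(hom(p1), hom(p2))  # line through p1 and p2
    line_b = np.cross(hom(p3), hom(p4))  # line through p3 and p4
    s = np.cross(line_a, line_b)         # meet of the two lines
    return s[:2] / s[2]                  # back to Cartesian pixels
```

The homogeneous formulation also handles gracefully the near-parallel case, where the Cartesian intersection recedes toward infinity (the third coordinate of s approaches zero).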
The previous reasoning shows that we now have a way to compute the intersection of the line joining a point and a reference point with a reference plane without computing the actual 3-D position of the point. Now, let O become the point at infinity in a given direction. The intersection s becomes the image of the projection P' of P onto the reference plane in the direction given by O (Figure 2(c)). Because s is now the projection of P', the image distance between p_r and s is directly related to the height of P with respect to the reference plane. In practice, we use another reference point U, with projection u_r in the right image, which we declare to be at height one from the reference plane. If s_u is the image of the projection of U on the reference plane, then the height is defined as: h = |p_r s| / |u_r s_u| (Figure 2(d)).
In affine geometry, this definition of height is exact in the sense that h is
proportional to the distance between P and the reference plane. In the projective case, an additional reference
plane is required in order to define a concept of projective height. However,
the affine approximation is accurate enough for the purpose of navigation
because the heights are computed at relatively long range from the camera and
over a relatively shallow depth of field.
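Under the affine approximation, the height reduces to a ratio of image distances. A minimal sketch (names are ours): given the image of a point, the image of its projection onto the reference plane, and the same pair for the unit-height reference point, the relative height is:

```python
import numpy as np

def relative_height(p_r, s_p, u_r, s_u):
    """Relative height under the affine approximation.

    p_r : image of the world point in the right image
    s_p : image of its projection onto the reference plane
    u_r : image of the unit-height reference point
    s_u : image of its projection onto the reference plane
    Returns the height of the point in units of the (unknown)
    height of the reference point.
    """
    return (np.linalg.norm(np.asarray(p_r) - np.asarray(s_p))
            / np.linalg.norm(np.asarray(u_r) - np.asarray(s_u)))
```

The unknown scale factor mentioned earlier is exactly the true height of the unit reference point; it cancels out of any comparison between heights computed this way.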
The relative height is also used for limiting the search in the stereo
matching. More precisely, we define an interval of heights [h_min, h_max]
which we anticipate in a typical terrain. This interval is converted at each
pixel to a disparity range [d_min, d_max]. This effectively limits the search
to disparities that are physically meaningful at each pixel.
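The conversion from a height interval to a per-pixel disparity range can be sketched under two assumptions of ours that the paper does not state: the images are rectified, so the ground plane induces a disparity that is an affine function of pixel position, and a hypothetical per-pixel gain k converts relative height into additional parallax:

```python
def disparity_range(u, v, ground_plane, k, h_min, h_max):
    """Disparity search interval at pixel (u, v).

    ground_plane: coefficients (a, b, c) such that a ground-plane
    pixel at (u, v) has disparity a*u + b*v + c (affine model for a
    rectified pair -- an assumption of this sketch).
    k: hypothetical gain converting relative height to extra
    disparity at this pixel. h_min, h_max: anticipated terrain heights.
    """
    a, b, c = ground_plane
    d_ground = a * u + b * v + c        # disparity if the pixel lay on the plane
    d_lo = d_ground + k * h_min
    d_hi = d_ground + k * h_max
    return min(d_lo, d_hi), max(d_lo, d_hi)
```

The matcher then scans only this interval at each pixel, instead of a fixed global disparity range.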
In addition to the relative elevation, a measure of slope relative to the ground plane can also be computed under minimal knowledge of camera geometry. We do not describe the slope evaluation algorithm here because it is not integrated in the current system. We refer the reader to [12] and [1] for a detailed presentation.