Figure 3. Conceptual image of block matching
5-1. Calculation algorithms
Distance measurement with a stereo camera is made up of several computational steps, which are briefly introduced below.
1. Preprocessing: distortion correction (calibration), normalization of image luminance values, etc.
2. Rectification: transforming the images so that matching can be performed efficiently
3. Matching: estimating disparity by finding corresponding points
4. Triangulation: converting the disparity map to distance using the geometric arrangement of the cameras
Through these steps, the disparity is measured and the distance is calculated from it.
5-1-1 Distortion correction (calibration)
As a preprocessing step, camera distortion is corrected. Because the camera lens bends light and thereby introduces distortion, radial distortion and tangential (circumferential) distortion of the lens are removed mathematically.
The image before distortion correction (left) and the image after distortion correction (right)
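As a minimal sketch, the polynomial radial distortion model (the same form OpenCV's k1, k2 coefficients use) maps a normalized image point outward or inward depending on its radius; correction inverts that mapping. The coefficient values below are made up for illustration.

```python
# Radial distortion model on normalized image coordinates (x, y):
#   x_d = x * (1 + k1*r^2 + k2*r^4),  with  r^2 = x^2 + y^2
# Correction inverts this mapping; for typical small coefficients a
# simple fixed-point iteration converges quickly.
K1, K2 = -0.20, 0.05  # hypothetical distortion coefficients

def distort(x, y):
    """Apply the radial distortion model to an undistorted point."""
    r2 = x * x + y * y
    s = 1.0 + K1 * r2 + K2 * r2 * r2
    return x * s, y * s

def undistort(xd, yd, iters=20):
    """Recover the undistorted point by fixed-point iteration:
    repeatedly divide the distorted point by the radial factor."""
    x, y = xd, yd
    for _ in range(iters):
        r2 = x * x + y * y
        s = 1.0 + K1 * r2 + K2 * r2 * r2
        x, y = xd / s, yd / s
    return x, y

xd, yd = distort(0.3, 0.4)   # simulate lens distortion
xu, yu = undistort(xd, yd)   # correction recovers (0.3, 0.4)
```

In practice the coefficients are estimated once during calibration (e.g. from a checkerboard pattern) and the correction is applied to every captured frame.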
5-1-2 Rectification
In the rectification step, the two captured images are transformed so that corresponding points have the same row coordinates.
The two image planes are brought onto a common plane so that the rows of the images are exactly aligned.
This step is essential for the processing efficiency of stereo vision (stereo cameras), because it reduces the matching step from a two-dimensional search problem to a one-dimensional search along a row. Rectification of stereo images is therefore commonly used as a preprocessing step for disparity calculation and for creating anaglyph images.
Image before rectification
Image after rectification
5-1-3 Matching
Matching estimates disparity by comparing each part of the two images. Here, disparity is the difference in position of corresponding parts between the two images.
If the disparity of each part of the image can be estimated by stereo matching, the distance can be calculated by the principle of triangulation. Matching performs a stereo corresponding-point search (finding the same physical point in the two different camera images), and various matching algorithms exist for this search.
OpenCV, the computer vision programming library described below, implements a fast and effective block-matching stereo algorithm: a small window is slid over the rectified images, and the match that minimizes the sum of absolute differences (SAD: Sum of Absolute Differences) of pixel values is selected.
The block-matching stereo corresponding-point search algorithm consists of the following three steps.
1. Prefiltering to normalize image brightness and enhance texture.
2. Corresponding-point search along horizontal epipolar lines using a SAD window.
3. Postfiltering to eliminate bad correspondences.
In the prefiltering phase, the input images are normalized to reduce brightness differences and emphasize texture, which makes the matching more reliable.
Next, corresponding points are searched by sliding the SAD window over the disparity search range from each reference pixel. For each feature in the left camera image, the best match is sought along the corresponding row of the right camera image.
Because the images are rectified, each row is an epipolar line, so the matching location in the right camera image can be assumed to lie in the same row (same y coordinate) as in the left camera image. Also, because the stereo cameras are mounted in parallel, a point with zero disparity appears at the same x coordinate (x0) in both images, and the larger the disparity, the further to the left the corresponding point lies in the right image. (See the figure below.)
In the postfiltering step, outliers and bad correspondences are removed, for example by a left-right consistency check that verifies the disparity values computed from the left and right images agree.
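The core of step 2 can be sketched in a few lines. This is an illustrative NumPy implementation for a single pixel, not OpenCV's optimized StereoBM; the window size, disparity range, and synthetic test images are all made up.

```python
import numpy as np

def sad_block_match(left, right, y, x, half=2, max_disp=8):
    """For pixel (y, x) of the rectified left image, slide a SAD
    window along the same row of the right image and return the
    disparity with the smallest sum of absolute differences."""
    win_l = left[y - half:y + half + 1, x - half:x + half + 1]
    best_d, best_sad = 0, float("inf")
    for d in range(max_disp + 1):
        xr = x - d                       # right-image point shifts left
        if xr - half < 0:
            break                        # window would leave the image
        win_r = right[y - half:y + half + 1, xr - half:xr + half + 1]
        sad = np.abs(win_l.astype(int) - win_r.astype(int)).sum()
        if sad < best_sad:
            best_sad, best_d = sad, d
    return best_d

# Synthetic test: the right image is the left image shifted so that
# every point has a uniform disparity of 3 pixels.
rng = np.random.default_rng(0)
left = rng.integers(0, 256, size=(20, 20))
right = np.roll(left, -3, axis=1)
d_est = sad_block_match(left, right, y=10, x=10)
```

A real implementation computes this for every pixel (producing the disparity map) and adds the pre- and postfiltering steps described above.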
Image of the corresponding-point search
5-1-3-1 What is an epipolar line?
The epipolar line that appeared frequently above is a line connecting geometrically related points in epipolar geometry, the geometry of photographing three-dimensional space with two cameras.
Epipolar geometry helps to recover 3D depth information from images taken from two different positions and to find correspondences between the images.
Diagram of epipolar geometry
As premises for explaining epipolar geometry, assume the following:
· A point P in three-dimensional space is projected (by perspective projection) onto the projection planes of the two cameras (left view and right view).
· Ol and Or are the projection centers of the two cameras.
· Points pl and pr are the projections of point P onto each projection plane.
With these definitions, the explanation proceeds as follows.
Since the two cameras are at different positions, each camera's projection center is visible from the other camera and is projected to el and er, respectively. These points are called epipoles (epipolar points).
· el, er and Ol, Or lie on the same straight line in three-dimensional space.
■ Epipolar Lines
An epipolar line is a line drawn on a projection plane, running from a projection point to the corresponding epipole.
The straight line Ol - P appears as the single point pl in the left camera, but in the right camera it projects to a line passing through pr; this line er - pr is the epipolar line of the right camera. (Likewise, the line el - pl is the epipolar line of the left camera.)
This epipolar line is uniquely determined by the position of point P in three-dimensional space, and all epipolar lines pass through the epipole (el and er in the figure).
· Conversely, every straight line passing through the epipole is an epipolar line.
■ Epipolar Plane
· A plane passing through three points P, Ol, Or is called an epipolar plane.
· The line of intersection between the epipolar plane and a projection plane coincides with the epipolar line. (The epipole lies on the epipolar line.)
■ Epipolar Constraints
When the positional relationship between the two cameras is known, the following can be said.
· Given the projection pl of point P in the left camera, the epipolar line er - pr of the right camera is determined, and the projection pr of point P in the right camera must lie somewhere on this epipolar line. This is called the epipolar constraint.
· In other words, if the two cameras capture the same point, its projections must lie on each other's epipolar lines.
· Therefore, to find where a point seen by one camera appears in the other camera, it suffices to search along the epipolar line, which saves a considerable amount of computation.
· If the correspondence is correct and the positions of pl and pr are known, the position of point P in three-dimensional space can be determined.
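As a small numerical sketch (with made-up coordinates), the epipolar constraint can be written pr^T E pl = 0 using the essential matrix E = [t]x R. For a rectified pair the rotation R is the identity and the baseline is purely horizontal, so the constraint reduces to "corresponding points share a row":

```python
import numpy as np

# Essential matrix for a rectified stereo pair: R = I and a purely
# horizontal baseline t = (T, 0, 0);  E = [t]_x @ R.
T = 0.1  # hypothetical baseline length
t_cross = np.array([[0.0, 0.0, 0.0],
                    [0.0, 0.0, -T],
                    [0.0, T,   0.0]])
E = t_cross @ np.eye(3)

def epipolar_residual(pl, pr):
    """pr^T E pl; zero exactly when the epipolar constraint holds.
    pl, pr are homogeneous normalized image points (x, y, 1)."""
    return float(pr @ E @ pl)

pl = np.array([0.30, 0.25, 1.0])            # point in the left image
pr_same_row = np.array([0.22, 0.25, 1.0])   # same y: satisfies the constraint
pr_other_row = np.array([0.22, 0.31, 1.0])  # different y: violates it
```

For this E the residual works out to T * (yl - yr), which is why rectified matching only needs to search within a single row.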
5-1-4 Triangulation
If the geometric arrangement of the cameras is known, the disparity map can be converted to distance by the principle of triangulation. The depth Z can be expressed by a simple formula; the geometry behind the calculation is explained using the figure below.
As shown in the figure, assume a stereo camera, consisting of two cameras, that has undergone distortion correction and rectification. The image planes then lie exactly on the same plane, the optical axes are exactly parallel (the optical axis, the ray from the projection center O through the principal point c, is also called the principal ray), and the focal lengths f are identical.
Assume further that the principal points Cxleft and Cxright have been calibrated to the same pixel coordinates in the left and right images, and that every row of pixels in one camera is exactly aligned with the corresponding row of the other camera. Let a point P in the real world appear in both the left and right image views, with horizontal coordinates xl and xr, respectively.
The disparity is then defined as d = xl - xr, and by similar triangles the depth is Z = fT / d, where T is the baseline, i.e. the distance between the two projection centers.
Since depth is inversely proportional to disparity, when the disparity is close to 0 (a distant object) a small change in disparity produces a large change in depth, whereas when the disparity is large (a nearby object) a slight difference in disparity has little influence on depth. For this reason, a stereo camera system achieves high resolution especially for objects relatively close to the camera.
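A minimal numeric sketch of the relation Z = fT / d; the focal length (in pixels) and baseline values below are made up for illustration:

```python
F_PX = 700.0   # hypothetical focal length in pixels
T_M = 0.12     # hypothetical baseline in metres

def depth_from_disparity(d_px):
    """Depth Z = f*T/d for a rectified stereo pair (d in pixels)."""
    if d_px <= 0:
        raise ValueError("disparity must be positive")
    return F_PX * T_M / d_px

near = depth_from_disparity(84.0)  # large disparity -> close object (1.0 m)
far = depth_from_disparity(4.0)    # small disparity -> far object (21.0 m)
```

Note how a 1-pixel disparity error barely changes the near result but shifts the far result by several metres, which is the resolution behaviour described above.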
5-2. Calculation processing unit
PC (personal computer) based image processing makes use of GPUs (Graphics Processing Units) such as those from NVIDIA, well known for graphics computing, and CPUs (Central Processing Units) such as the Intel Core i7 and i5.
Because the GPU is specialized for image processing, it can process images more efficiently than a CPU. In addition, an FPGA (Field Programmable Gate Array) is sometimes used for image processing when the purpose is limited. Altera, acquired by Intel, is well known for FPGAs.
An FPGA is an LSI whose circuit configuration is designed by programming. Because the circuit is constructed programmatically, the configuration inside the chip can be changed afterwards, and depending on cost, development period, and other factors, there are cases where it is the preferred choice.
5-3. Image processing programming (OpenCV)
There are various options for image processing programming, among which OpenCV (officially, the Open Source Computer Vision Library) is an open-source computer vision library. The library is written in C and C++ and implements the many functions required for processing images and videos on a computer. Because it is distributed under the BSD license, it can be used not only for academic purposes but also commercially.
In addition, because it is multi-platform, it is used in a wide range of situations.
OpenCV provides C/C++, Java, Python, and MATLAB interfaces usable in various environments, with functions for image processing, image analysis, and machine learning. Supported platforms include POSIX-compliant Unix-like OSes such as macOS and FreeBSD, as well as Linux, Windows, Android, and iOS.
Using such a general-purpose image processing programming environment makes it easy to build stereo vision systems and to develop embedded implementations of stereo vision algorithms, which shortens development time and reduces the price of the developed product.