WO2017138245A1 - Image processing apparatus, object recognition apparatus, device control system, image processing method, and program - Google Patents
- Publication number
- WO2017138245A1 (application PCT/JP2016/087158)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- template
- unit
- matching
- image
- thinning
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/20—Analysis of motion
- G06T7/246—Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/70—Determining position or orientation of objects or cameras
- G06T7/73—Determining position or orientation of objects or cameras using feature-based methods
- G06T7/74—Determining position or orientation of objects or cameras using feature-based methods involving reference images or patches
-
- B—PERFORMING OPERATIONS; TRANSPORTING
- B60—VEHICLES IN GENERAL
- B60R—VEHICLES, VEHICLE FITTINGS, OR VEHICLE PARTS, NOT OTHERWISE PROVIDED FOR
- B60R21/00—Arrangements or fittings on vehicles for protecting or preventing injuries to occupants or pedestrians in case of accidents or other traffic risks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/20—Analysis of motion
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/50—Depth or shape recovery
- G06T7/55—Depth or shape recovery from multiple images
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/50—Context or environment of the image
- G06V20/56—Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
- G06V20/58—Recognition of moving objects or obstacles, e.g. vehicles or pedestrians; Recognition of traffic objects, e.g. traffic signs, traffic lights or roads
-
- G—PHYSICS
- G08—SIGNALLING
- G08G—TRAFFIC CONTROL SYSTEMS
- G08G1/00—Traffic control systems for road vehicles
- G08G1/16—Anti-collision systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10016—Video; Image sequence
- G06T2207/10021—Stereoscopic video; Stereoscopic image sequence
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30248—Vehicle exterior or interior
- G06T2207/30252—Vehicle exterior; Vicinity of vehicle
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30248—Vehicle exterior or interior
- G06T2207/30252—Vehicle exterior; Vicinity of vehicle
- G06T2207/30261—Obstacle
Definitions
- the present invention relates to an image processing apparatus, an object recognition apparatus, a device control system, an image processing method, and a program.
- the clustering process is a process of newly detecting an object by using, in particular, a luminance image captured in real time and a parallax image derived from a stereo camera.
- the tracking process is a process of following an object detected by the clustering process using information of a plurality of frames. In the tracking process, a region similar to the object detected in the previous frame is detected from the current frame by template matching basically on the basis of a pattern of disparity values or luminance values on a two-dimensional image.
- A technique for recognizing a pedestrian by template matching has been proposed (for example, Patent Document 1).
- However, the technique described in Patent Document 1 does not apply a separately suited algorithm to objects whose appearance changes greatly (objects at a short distance) and objects whose appearance changes little (objects at a long distance). Consequently, depending on the distance of the object, accurate detection may not be possible.
- the present invention has been made in view of the above, and it is an object of the present invention to provide an image processing apparatus, an object recognition apparatus, a device control system, an image processing method, and a program that improve the accuracy of object detection.
- To solve the above problem and achieve the object, the present invention comprises: prediction means for predicting the position of an object in the current frame from the position of the object in the frame preceding the current frame, and specifying a prediction area; determination means for determining, based on the distance of the object in the previous frame, whether the object exists in a first distance range or in a second distance range farther than the first distance range; first matching processing means for detecting the object by performing template matching in the prediction area of the current frame using a first template for the object of the previous frame, when the determination means determines that the object exists in the first distance range; and second matching processing means for detecting the object by performing template matching in the prediction area of the current frame using a second template, different from the first template, for the object of the previous frame, when the determination means determines that the object exists in the second distance range.
- the accuracy of detection of an object can be improved.
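The distance-dependent switching between the two matching processes can be sketched as follows. This is only an illustrative sketch, not the claimed implementation: the 20 m boundary between the first and second distance ranges, the function names `match_template_sad` and `track_object`, and the plain sum-of-absolute-differences search are all assumptions. The text states only that a different template is used in each distance range.

```python
import numpy as np


def match_template_sad(search: np.ndarray, template: np.ndarray) -> tuple[int, int]:
    """Return the (row, col) in `search` where `template` has the lowest SAD.

    Exhaustive search; SAD is an assumed matching criterion for illustration.
    """
    th, tw = template.shape
    sh, sw = search.shape
    tmpl = template.astype(np.int64)
    best_cost, best_pos = None, (0, 0)
    for r in range(sh - th + 1):
        for c in range(sw - tw + 1):
            window = search[r:r + th, c:c + tw].astype(np.int64)
            cost = np.abs(window - tmpl).sum()
            if best_cost is None or cost < best_cost:
                best_cost, best_pos = cost, (r, c)
    return best_pos


def track_object(prediction_area, prev_distance_m, first_template,
                 second_template, boundary_m=20.0):
    """Pick the template according to the object's distance in the previous frame.

    `boundary_m` is a hypothetical boundary between the first (nearer) and
    second (farther) distance ranges.
    """
    if prev_distance_m < boundary_m:
        # Object was in the first distance range: use the first template.
        return "first", match_template_sad(prediction_area, first_template)
    # Object was in the second distance range: use the second template.
    return "second", match_template_sad(prediction_area, second_template)
```

Calling `track_object` with the prediction area of the current frame and the distance measured in the previous frame returns which template was used and the matched position inside the prediction area.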
- FIG. 1 is a diagram showing an example in which a device control system according to the embodiment is mounted on a vehicle.
- FIG. 2 is a view showing an example of the appearance of the object recognition apparatus according to the embodiment.
- FIG. 3 is a diagram showing an example of the hardware configuration of the object recognition apparatus according to the embodiment.
- FIG. 4 is a diagram showing an example of a functional block configuration of the object recognition device according to the embodiment.
- FIG. 5 is a diagram showing an example of a functional block configuration of a disparity value calculation processing unit of the object recognition device according to the embodiment.
- FIG. 6 is a diagram for explaining the principle of deriving the distance from the imaging unit to the object.
- FIG. 7 is an explanatory diagram in the case of finding a corresponding pixel in a comparison image corresponding to a reference pixel in a reference image.
- FIG. 8 is a diagram showing an example of a graph of the result of the block matching process.
- FIG. 9 is a diagram showing an example of a functional block configuration of a recognition processing unit of the object recognition device according to the embodiment.
- FIG. 10 is a diagram illustrating an example of a V map generated from a parallax image.
- FIG. 11 is a diagram illustrating an example of a U map generated from a parallax image.
- FIG. 12 is a diagram showing an example of a real U map generated from the U map.
- FIG. 13 is a diagram for explaining the process of creating a detection frame.
- FIG. 14 is a diagram showing an example of a functional block configuration of a tracking processing unit of the recognition processing unit of the object recognition device according to the embodiment.
- FIG. 15 is a flowchart illustrating an example of the block matching process performed by the disparity value deriving unit according to the embodiment.
- FIG. 16 is a flowchart illustrating an example of tracking processing performed by the tracking processing unit of the recognition processing unit according to the embodiment.
- FIG. 17 is a diagram for explaining the movement prediction operation.
- FIG. 18 is a flowchart illustrating an example of the matching processing operation of the tracking processing of the tracking processing unit according to the embodiment.
- FIG. 19 is a flowchart illustrating an example of operation of feature update processing in the case of performing rough matching in the tracking processing of the tracking processing unit of the embodiment.
- FIG. 20 is a diagram for explaining thinning-out processing on an image of a detection area in feature update processing in the case of performing rough matching in the tracking processing unit of the embodiment.
- FIG. 21 is a flowchart illustrating an example of rough matching processing operation in the tracking processing of the tracking processing unit according to the embodiment.
- FIG. 22 is a diagram for explaining thinning processing on an image of a prediction area in rough matching processing in tracking processing of the tracking processing unit according to the embodiment.
- FIG. 23 is a diagram for explaining frame correction processing in the rough matching processing in the tracking processing of the tracking processing unit according to the embodiment.
- FIG. 24 is a flowchart showing an example of operation of feature update processing in the case of performing part matching in the tracking processing of the tracking processing unit of the embodiment.
- FIG. 25 is a flowchart illustrating an example of an operation of a process of selecting a part template in the feature update process in the case of performing the part matching of the tracking processing unit of the embodiment.
- FIG. 26 is a diagram for explaining part template selection processing.
- FIG. 27 is a flowchart illustrating an example of the part matching process in the tracking process of the tracking processing unit according to the embodiment.
- FIG. 28 is a diagram for explaining part matching processing.
- FIG. 1 is a diagram showing an example in which a device control system according to the embodiment is mounted on a vehicle. The case where the device control system 60 of the present embodiment is mounted on a vehicle 70 will be described as an example with reference to FIG. 1.
- FIG. 1A is a side view of the vehicle 70 equipped with the device control system 60, and FIG. 1B is a front view of the vehicle 70.
- As shown in FIG. 1, the vehicle 70, which is an automobile, is equipped with the device control system 60.
- The device control system 60 includes the object recognition device 1 installed in the cabin, which is the passenger space of the vehicle 70, a vehicle control device 6 (control device), a steering wheel 7, and a brake pedal 8.
- the object recognition device 1 has an imaging function of imaging the traveling direction of the vehicle 70 and is installed, for example, in the vicinity of a rearview mirror inside a front window of the vehicle 70.
- The object recognition device 1 includes a main body 2 and imaging units 10a and 10b fixed to the main body 2; the details of its configuration and operation are described later.
- The imaging units 10a and 10b are fixed to the main body 2 so that they can capture an object in the traveling direction of the vehicle 70.
- The vehicle control device 6 is an ECU (Electronic Control Unit) that executes various kinds of vehicle control based on the recognition information received from the object recognition device 1.
- As examples of vehicle control, the vehicle control device 6 performs, based on the recognition information received from the object recognition device 1, steering control that controls the steering system (control target) including the steering wheel 7 to avoid an obstacle, or brake control that controls the brake pedal 8 (control target) to decelerate and stop the vehicle 70.
- Performing vehicle control such as steering control or brake control in this way can improve the driving safety of the vehicle 70.
- Although the object recognition device 1 images the area in front of the vehicle 70 as described above, it is not limited to this. That is, the object recognition device 1 may be installed so as to image the rear or side of the vehicle 70. In this case, the object recognition device 1 can detect the positions of following vehicles and people behind the vehicle 70, or of other vehicles and people to its side. The vehicle control device 6 can then detect danger when the vehicle 70 changes lanes or merges, and execute the vehicle control described above. Further, when the vehicle 70 is reversing, for example while parking, the vehicle control device 6 can execute the vehicle control described above if it determines, based on the recognition information on obstacles behind the vehicle 70 output by the object recognition device 1, that there is a danger of collision.
- FIG. 2 is a view showing an example of the appearance of the object recognition apparatus according to the embodiment.
- As described above, the object recognition device 1 includes the main body 2 and the imaging units 10a and 10b fixed to the main body 2.
- The imaging units 10a and 10b are a pair of cylindrical cameras arranged in parallel with respect to the main body 2. For convenience of description, the imaging unit 10a illustrated in FIG. 2 may be referred to as the right camera, and the imaging unit 10b as the left camera.
- FIG. 3 is a diagram showing an example of the hardware configuration of the object recognition apparatus according to the embodiment. The hardware configuration of the object recognition device 1 will be described with reference to FIG. 3.
- The object recognition device 1 includes a parallax value deriving unit 3 and a recognition processing unit 5 in the main body 2.
- The parallax value deriving unit 3 is a device that derives a parallax value dp indicating the parallax of an object from a plurality of images obtained by imaging the object, and outputs a parallax image (an example of parallax information) indicating the parallax value dp at each pixel.
- The recognition processing unit 5 is a device that performs object recognition processing and the like on objects such as people and cars appearing in a captured image, based on the parallax image output from the parallax value deriving unit 3, and outputs recognition information, which is information indicating the result of the object recognition processing, to the vehicle control device 6.
- The parallax value deriving unit 3 includes an imaging unit 10a, an imaging unit 10b, a signal conversion unit 20a, a signal conversion unit 20b, and an image processing unit 30.
- the imaging unit 10a is a processing unit that images an object in front and generates an analog image signal.
- The imaging unit 10a includes an imaging lens 11a, a diaphragm 12a, and an image sensor 13a.
- the imaging lens 11a is an optical element for refracting incident light to form an image of an object on the image sensor 13a.
- the diaphragm 12a is a member that adjusts the amount of light input to the image sensor 13a by blocking part of the light that has passed through the imaging lens 11a.
- The image sensor 13a is a semiconductor element that converts light that has entered the imaging lens 11a and passed through the diaphragm 12a into an electrical analog image signal.
- the image sensor 13a is realized by, for example, a solid-state imaging device such as a charge coupled device (CCD) or a complementary metal oxide semiconductor (CMOS).
- The imaging unit 10b is a processing unit that captures an image of a subject in front and generates an analog image signal.
- The imaging unit 10b includes an imaging lens 11b, a diaphragm 12b, and an image sensor 13b.
- The functions of the imaging lens 11b, the diaphragm 12b, and the image sensor 13b are the same as those of the imaging lens 11a, the diaphragm 12a, and the image sensor 13a described above. The imaging lens 11a and the imaging lens 11b are installed so that their lens surfaces lie in the same plane, so that the left and right cameras capture images under the same conditions.
- the signal conversion unit 20a is a processing unit that converts an analog image signal generated by the imaging unit 10a into digital image data.
- the signal conversion unit 20a includes a CDS (Correlated Double Sampling) 21a, an AGC (Auto Gain Control) 22a, an ADC (Analog Digital Converter) 23a, and a frame memory 24a.
- the CDS 21a removes noise from the analog image signal generated by the image sensor 13a by correlated double sampling, a horizontal differential filter, or a vertical smoothing filter.
- the AGC 22a performs gain control to control the intensity of an analog image signal from which noise has been removed by the CDS 21a.
- the ADC 23a converts an analog image signal whose gain is controlled by the AGC 22a into digital image data.
- the frame memory 24a stores the image data converted by the ADC 23a.
- The signal conversion unit 20b is a processing unit that converts an analog image signal generated by the imaging unit 10b into digital image data.
- The signal conversion unit 20b includes a CDS 21b, an AGC 22b, an ADC 23b, and a frame memory 24b.
- the functions of the CDS 21b, the AGC 22b, the ADC 23b, and the frame memory 24b are the same as the functions of the CDS 21a, the AGC 22a, the ADC 23a, and the frame memory 24a described above.
- the image processing unit 30 is a device that performs image processing on image data converted by the signal conversion unit 20a and the signal conversion unit 20b.
- The image processing unit 30 includes an FPGA (Field Programmable Gate Array) 31, a CPU (Central Processing Unit) 32, a ROM (Read Only Memory) 33, a RAM (Random Access Memory) 34, an I/F (interface) 35, and a bus line 39.
- the FPGA 31 is an integrated circuit, and here performs processing of deriving a parallax value dp in an image based on image data.
- The CPU 32 controls each function of the parallax value deriving unit 3.
- The ROM 33 stores an image processing program that the CPU 32 executes to control each function of the parallax value deriving unit 3.
- the RAM 34 is used as a work area of the CPU 32.
- The I/F 35 is an interface for communicating with the I/F 55 of the recognition processing unit 5 via the communication line 4.
- The bus line 39 is, as shown in FIG. 3, an address bus, a data bus, and the like that connect the FPGA 31, the CPU 32, the ROM 33, the RAM 34, and the I/F 35 so that they can communicate with each other.
- Although the image processing unit 30 includes the FPGA 31 as the integrated circuit that derives the parallax value dp, it is not limited to this, and another integrated circuit such as an ASIC (Application Specific Integrated Circuit) may be used.
- The recognition processing unit 5 includes an FPGA 51, a CPU 52, a ROM 53, a RAM 54, an I/F 55, a CAN (Controller Area Network) I/F 58, and a bus line 59.
- the FPGA 51 is an integrated circuit, and here performs object recognition processing on an object based on the parallax image received from the image processing unit 30.
- the CPU 52 controls each function of the recognition processing unit 5.
- The ROM 53 stores an object recognition program that the CPU 52 executes to perform the object recognition processing of the recognition processing unit 5.
- the RAM 54 is used as a work area of the CPU 52.
- The I/F 55 is an interface for performing data communication with the I/F 35 of the image processing unit 30 via the communication line 4.
- The CAN I/F 58 is an interface for communicating with an external controller (for example, the vehicle control device 6 shown in FIG. 3).
- The bus line 59 is an address bus, a data bus, and the like that connect the FPGA 51, the CPU 52, the ROM 53, the RAM 54, the I/F 55, and the CAN I/F 58 so that they can communicate with each other.
- In accordance with an instruction from the CPU 52 in the recognition processing unit 5, the FPGA 51 executes object recognition processing on objects such as people and cars appearing in the captured image, based on the parallax image.
- Each of the above programs may be distributed in a computer readable recording medium as an installable or executable file.
- This recording medium is, for example, a compact disc read only memory (CD-ROM) or a secure digital (SD) memory card.
- Although the image processing unit 30 of the parallax value deriving unit 3 and the recognition processing unit 5 are separate devices, this is not a limitation. For example, the image processing unit 30 and the recognition processing unit 5 may be a single device that performs both the generation of the parallax image and the object recognition processing.
- FIG. 4 is a diagram showing an example of a functional block configuration of the object recognition device according to the embodiment. First, the configuration and operation of the functional blocks of the main part of the object recognition device 1 will be described with reference to FIG. 4.
- The object recognition device 1 includes the parallax value deriving unit 3 and the recognition processing unit 5.
- The parallax value deriving unit 3 includes an image acquisition unit 100a (first imaging unit), an image acquisition unit 100b (second imaging unit), conversion units 200a and 200b, and a parallax value calculation processing unit 300 (generation unit).
- The image acquisition unit 100a is a functional unit that captures an object in front with the right camera, generates an analog image signal, and obtains a luminance image, which is an image based on the image signal.
- The image acquisition unit 100a is realized by the imaging unit 10a illustrated in FIG. 3.
- The image acquisition unit 100b is a functional unit that captures an object in front with the left camera, generates an analog image signal, and obtains a luminance image, which is an image based on the image signal.
- The image acquisition unit 100b is realized by the imaging unit 10b illustrated in FIG. 3.
- The conversion unit 200a is a functional unit that removes noise from the image data of the luminance image obtained by the image acquisition unit 100a, converts it into digital image data, and outputs it.
- The conversion unit 200a is realized by the signal conversion unit 20a illustrated in FIG. 3.
- The conversion unit 200b is a functional unit that removes noise from the image data of the luminance image obtained by the image acquisition unit 100b, converts it into digital image data, and outputs it.
- The conversion unit 200b is realized by the signal conversion unit 20b illustrated in FIG. 3.
- Of the image data of the two luminance images output by the conversion units 200a and 200b, the luminance image captured by the image acquisition unit 100a, which is the right camera (imaging unit 10a), is used as the image data of the reference image Ia (hereinafter simply referred to as the reference image Ia) (first captured image), and the luminance image captured by the image acquisition unit 100b, which is the left camera (imaging unit 10b), is used as the image data of the comparison image Ib (hereinafter simply referred to as the comparison image Ib) (second captured image). That is, the conversion units 200a and 200b output the reference image Ia and the comparison image Ib, respectively, based on the two luminance images output from the image acquisition units 100a and 100b.
- The parallax value calculation processing unit 300 is a functional unit that derives a parallax value for each pixel of the reference image Ia based on the reference image Ia and the comparison image Ib received from the conversion units 200a and 200b, and generates a parallax image in which the parallax value is associated with each pixel of the reference image Ia.
- The parallax value calculation processing unit 300 outputs the generated parallax image to the recognition processing unit 5.
- The recognition processing unit 5 is a functional unit that recognizes (detects) an object based on the reference image Ia and the parallax image received from the parallax value deriving unit 3, and tracks the recognized object (tracking).
- FIG. 5 is a diagram showing an example of a functional block configuration of a disparity value calculation processing unit of the object recognition device according to the embodiment.
- FIG. 6 is a diagram for explaining the principle of deriving the distance from the imaging unit to the object.
- FIG. 7 is an explanatory diagram in the case of finding a corresponding pixel in a comparison image corresponding to a reference pixel in a reference image.
- FIG. 8 is a diagram showing an example of a graph of the result of the block matching process.
- The imaging system shown in FIG. 6 is assumed to have the imaging unit 10a and the imaging unit 10b arranged in parallel at equal positions.
- the imaging units 10a and 10b respectively include imaging lenses 11a and 11b that refract incident light to form an image of an object on an image sensor that is a solid-state imaging device.
- the images captured by the imaging unit 10a and the imaging unit 10b are respectively referred to as a reference image Ia and a comparison image Ib.
- a point S on the object E in a three-dimensional space is mapped at a position on a straight line parallel to a straight line connecting the imaging lens 11a and the imaging lens 11b in each of the reference image Ia and the comparison image Ib.
- a point S mapped to each image is taken as a point Sa (x, y) in the reference image Ia, and taken as a point Sb (X, y) in the comparison image Ib.
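- The parallax value dp is expressed, using the point Sa(x, y) in the reference image Ia and the point Sb(X, y) in the comparison image Ib, by the following (formula 1): $dp = X - x$.
- If the distance between the point Sa(x, y) in the reference image Ia and the intersection of the perpendicular drawn from the imaging lens 11a onto the imaging surface is Δa, and the corresponding distance between the point Sb(X, y) in the comparison image Ib and the intersection of the perpendicular drawn from the imaging lens 11b onto the imaging surface is Δb, the parallax value can also be written as $dp = \Delta a + \Delta b$.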
- the distance Z between the imaging units 10a and 10b and the object E is derived by using the parallax value dp.
- the distance Z is a distance from a straight line connecting the focal position of the imaging lens 11a and the focal position of the imaging lens 11b to the point S on the object E.
- The distance Z can be calculated from the parallax value dp by (formula 2).
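For reference, the relation usually used for a parallel stereo pair is given below. It is stated here as an assumption, because the formula itself is missing from the text above: the baseline length $B$ (the distance between the imaging lenses 11a and 11b) and the focal length $f$ of the lenses are symbols introduced for this sketch, and the numbers in the worked example are purely illustrative.

```latex
% Distance from parallax by similar triangles (assumed standard form of formula 2)
Z = \frac{B \cdot f}{dp}

% Illustrative values only: B = 0.2 m, f = 1000 px, dp = 10 px
Z = \frac{0.2 \times 1000}{10} = 20\ \mathrm{m}
```

Because Z is inversely proportional to dp, a one-pixel error in the parallax value changes the estimated distance far more for distant objects (small dp) than for near ones.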
- A method of calculating the cost value C(p, d) will be described with reference to FIGS. 7 and 8.
- In the following, C(p, d) denotes C(x, y, d).
- FIG. 7A is a conceptual diagram showing the reference pixel p and the reference area pb in the reference image Ia.
- FIG. 7B is a conceptual diagram of calculating the cost value C while sequentially shifting the candidate for the corresponding pixel in the comparison image Ib that corresponds to the reference pixel p shown in FIG. 7A.
- The corresponding pixel is the pixel in the comparison image Ib that is most similar to the reference pixel p in the reference image Ia.
- The cost value C is an evaluation value (degree of matching) indicating the similarity or dissimilarity of each pixel in the comparison image Ib with respect to the reference pixel p in the reference image Ia.
- The cost value C described below is an evaluation value representing dissimilarity: the smaller the value, the more similar the pixel in the comparison image Ib is to the reference pixel p.
- The cost value C(p, d) of each candidate pixel q(x + d, y), a candidate for the corresponding pixel, is calculated from the luminance value of the reference pixel p(x, y) in the reference image Ia and the luminance value of each candidate pixel q(x + d, y) on the epipolar line EL in the comparison image Ib.
- Here, d is the shift amount between the reference pixel p and the candidate pixel q, and it is varied in units of one pixel.
- That is, while the candidate pixel q(x + d, y) is shifted one pixel at a time within a predetermined range (for example, 0 < d < 25), the cost value C(p, d), which is the dissimilarity between the luminance values of the candidate pixel q(x + d, y) and the reference pixel p(x, y), is calculated.
- In the present embodiment, block matching processing is performed as the stereo matching processing for obtaining the pixel corresponding to the reference pixel p.
- In the block matching processing, the dissimilarity is obtained between a reference area pb, which is a predetermined area centered on the reference pixel p of the reference image Ia, and a candidate area qb (of the same size as the reference area pb) centered on the candidate pixel q of the comparison image Ib.
- As the cost value C indicating the dissimilarity between the reference area pb and the candidate area qb, SAD (Sum of Absolute Differences), SSD (Sum of Squared Differences), or ZSSD (Zero-mean Sum of Squared Differences), which is an SSD computed after subtracting the mean value of each block, is used.
- Since the imaging units 10a and 10b are arranged in parallel at equal positions, the reference image Ia and the comparison image Ib are also in a parallel, equal-position relationship. Therefore, the pixel in the comparison image Ib corresponding to the reference pixel p in the reference image Ia lies on the epipolar line EL, shown as a horizontal line in FIG. 7, and to find it, only the pixels on the epipolar line EL of the comparison image Ib need to be searched.
- the cost value C (p, d) calculated by such block matching processing is represented by, for example, a graph shown in FIG. 8 in relation to the shift amount d.
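The three dissimilarity measures and the cost curve C(p, d) along the epipolar line can be sketched as follows. This is a minimal sketch under stated assumptions, not the embodiment's implementation: the function names, the 5 x 5 block size (`half=2`), and the reading of the example shift range as d = 1 to 24 are all choices made for illustration. The caller is assumed to pick a reference pixel far enough from the image border for the block to fit.

```python
import numpy as np


def sad(a: np.ndarray, b: np.ndarray) -> float:
    """Sum of Absolute Differences between two equally sized blocks."""
    return float(np.abs(a - b).sum())


def ssd(a: np.ndarray, b: np.ndarray) -> float:
    """Sum of Squared Differences between two equally sized blocks."""
    return float(((a - b) ** 2).sum())


def zssd(a: np.ndarray, b: np.ndarray) -> float:
    """Zero-mean SSD: each block's mean is removed before the SSD is taken."""
    return float((((a - a.mean()) - (b - b.mean())) ** 2).sum())


def cost_curve(ref_img, cmp_img, x, y, half=2, d_range=range(1, 25), cost=sad):
    """Return {d: C(p, d)} for the reference pixel p = (x, y).

    The candidate pixel q = (x + d, y) lies on the same row of the comparison
    image, i.e. on the horizontal epipolar line of a parallel stereo pair.
    """
    ref = ref_img.astype(np.float64)
    cmp_ = cmp_img.astype(np.float64)
    pb = ref[y - half:y + half + 1, x - half:x + half + 1]  # reference area pb
    costs = {}
    for d in d_range:
        xq = x + d
        if xq + half >= cmp_.shape[1]:
            break  # candidate area qb would leave the image
        qb = cmp_[y - half:y + half + 1, xq - half:xq + half + 1]
        costs[d] = cost(pb, qb)
    return costs
```

The shift amount with the smallest cost, `min(costs, key=costs.get)`, is the candidate for the parallax value at that pixel. Passing `cost=zssd` makes the comparison insensitive to a uniform brightness offset between the two cameras, which plain SAD and SSD are not.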
- The parallax value calculation processing unit 300 includes a cost calculation unit 301, a determination unit 302, and a first generation unit 303.
- The cost calculation unit 301 is a functional unit that calculates the cost value C(p, d) of each candidate pixel q(x + d, y). The calculation uses the luminance value of the reference pixel p(x, y) in the reference image Ia and the luminance value of each candidate pixel q(x + d, y), a candidate for the corresponding pixel, which is identified by shifting by the shift amount d from the pixel corresponding to the position of the reference pixel p(x, y) on the epipolar line EL in the comparison image Ib.
- Specifically, the cost calculation unit 301 performs block matching between a reference area pb, a predetermined area centered on the reference pixel p of the reference image Ia, and a candidate area qb (the same size as the reference area pb) centered on the candidate pixel q of the comparison image Ib, and calculates their degree of dissimilarity as the cost value C.
- The determination unit 302 is a functional unit that determines the shift amount d corresponding to the minimum of the cost values C calculated by the cost calculation unit 301 as the parallax value dp for the pixel of the reference image Ia for which the cost value C was calculated.
- The first generation unit 303 is a functional unit that, based on the parallax value dp determined by the determination unit 302, generates a parallax image, that is, an image in which the pixel value of each pixel of the reference image Ia is replaced with the parallax value dp corresponding to that pixel.
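- The cost calculation and minimum-cost selection described above can be sketched as follows. This is a minimal illustration, not the implementation of the embodiment: the sum of absolute differences (SAD) is assumed as the degree of dissimilarity, images are plain lists of rows of luminance values, and the names `sad_cost`, `disparity_at`, `half`, and `max_shift` are hypothetical.

```python
def sad_cost(ref, cmp_img, x, y, d, half=1):
    """Cost C(p, d): SAD between the reference area pb around p(x, y)
    and the candidate area qb around q(x + d, y).

    The caller must keep both windows inside the images.
    """
    cost = 0
    for dy in range(-half, half + 1):
        for dx in range(-half, half + 1):
            cost += abs(ref[y + dy][x + dx] - cmp_img[y + dy][x + d + dx])
    return cost


def disparity_at(ref, cmp_img, x, y, max_shift, half=1):
    """Return the shift amount d with the minimum cost along the epipolar line."""
    width = len(cmp_img[0])
    best_d, best_cost = 0, None
    for d in range(max_shift + 1):
        if x + d + half >= width:  # candidate area would leave the image
            break
        cost = sad_cost(ref, cmp_img, x, y, d, half)
        if best_cost is None or cost < best_cost:
            best_d, best_cost = d, cost
    return best_d
```

- Calling `disparity_at` for every pixel of the reference image and storing the result in place of the luminance value would give an image of the kind the first generation unit 303 produces.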
- The cost calculation unit 301, the determination unit 302, and the first generation unit 303 shown in FIG. 5 are each realized by the FPGA 31 shown in FIG. Note that some or all of the cost calculation unit 301, the determination unit 302, and the first generation unit 303 may instead be realized by the CPU 32 executing a program stored in the ROM 33, rather than by the FPGA 31, which is a hardware circuit.
- The cost calculation unit 301, the determination unit 302, and the first generation unit 303 of the parallax value calculation processing unit 300 shown in FIG. 5 are shown conceptually by function, and the configuration is not limited to this.
- For example, a plurality of functional units illustrated as independent in the parallax value calculation processing unit 300 shown in FIG. 5 may be configured as a single functional unit.
- Conversely, the function of a single functional unit in the parallax value calculation processing unit 300 shown in FIG. 5 may be divided among a plurality of functional units.
- FIG. 9 is a diagram showing an example of a functional block configuration of a recognition processing unit of the object recognition device according to the embodiment.
- FIG. 10 is a diagram illustrating an example of a V map generated from a parallax image.
- FIG. 11 is a diagram illustrating an example of a U map generated from a parallax image.
- FIG. 12 is a diagram showing an example of a real U map generated from the U map.
- FIG. 13 is a diagram for explaining the process of creating a detection frame. The configuration and operation of functional blocks of the recognition processing unit 5 will be described with reference to FIGS.
- the recognition processing unit 5 includes a second generation unit 500, a clustering processing unit 510 (detection means), and a tracking processing unit 520.
- The second generation unit 500 is a functional unit that receives the parallax image from the parallax value calculation processing unit 300 and the reference image Ia from the parallax value derivation unit 3, and generates the V-Disparity map, the U-Disparity map, the Real U-Disparity map, and the like. Specifically, the second generation unit 500 generates a V map VM, the V-Disparity map shown in FIG. 10B, in order to detect the road surface from the parallax image input from the parallax value calculation processing unit 300.
- the V-Disparity map is a two-dimensional histogram showing the frequency distribution of the parallax value dp, with the vertical axis as the y-axis of the reference image Ia and the horizontal axis as the parallax value dp (or distance) of the parallax image.
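- As a rough sketch of the two-dimensional histogram just described, the V map can be built by counting, for each image row y, how often each parallax value dp occurs. The function name `build_v_map`, the integer-valued parallax image, and the convention that 0 means "no valid parallax" are assumptions made for illustration.

```python
def build_v_map(parallax_image, max_dp):
    """V map: one row per image y, one column per parallax value dp.

    Each cell holds the number of pixels in row y with that parallax value.
    """
    v_map = [[0] * (max_dp + 1) for _ in parallax_image]
    for y, row in enumerate(parallax_image):
        for dp in row:
            if 0 < dp <= max_dp:  # 0 is treated here as "no valid parallax"
                v_map[y][dp] += 1
    return v_map
```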
- A road surface 700, a utility pole 701, and a car 702 appear in the reference image Ia shown in FIG. 10A.
- The road surface 700 of the reference image Ia corresponds to the road surface portion 700a in the V map VM, the utility pole 701 corresponds to the utility pole portion 701a, and the car 702 corresponds to the car portion 702a.
- the second generation unit 500 linearly approximates the position estimated to be a road surface from the generated V map VM.
- If the road surface is flat, it can be approximated by a single straight line; if the slope of the road surface changes, the V map VM must be divided into sections and each section approximated linearly to maintain accuracy.
- For the linear approximation, known techniques such as the Hough transform or the least squares method can be used.
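- Of the two techniques named above, the least squares method is easy to sketch. The example below fits a single line y = a * dp + b to points taken from the V map; how the road-surface candidate points are selected is not covered here, and the name `fit_road_line` and the `(dp, y)` point format are hypothetical.

```python
def fit_road_line(points):
    """Ordinary least-squares fit of y = a * dp + b to (dp, y) points."""
    n = len(points)
    mean_d = sum(d for d, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    var_d = sum((d - mean_d) ** 2 for d, _ in points)
    if var_d == 0:
        raise ValueError("need at least two distinct parallax values")
    cov = sum((d - mean_d) * (y - mean_y) for d, y in points)
    a = cov / var_d
    return a, mean_y - a * mean_d
```

- For a road surface whose slope changes, the same fit could be applied separately to each section of the V map.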
- In the V map VM, the utility pole portion 701a and the car portion 702a, which are blocks located above the detected road surface portion 700a, correspond to the utility pole 701 and the car 702, which are objects on the road surface.
- The second generation unit 500 also generates a U map UM, the U-Disparity map shown in FIG. 11B, in order to recognize objects using only the information located above the road surface detected in the V map VM. That information is the part of the parallax image corresponding to the left guard rail 711, the right guard rail 712, the car 713, and the car 714 in the reference image Ia shown in FIG.
- the U map UM is a two-dimensional histogram showing the frequency distribution of the parallax values dp, with the horizontal axis as the x axis of the reference image Ia and the vertical axis as the parallax values dp (or distance) of parallax images.
- The left guard rail 711 of the reference image Ia shown in FIG. 11A corresponds to the left guard rail portion 711a in the U map UM, the right guard rail 712 corresponds to the right guard rail portion 712a, the car 713 corresponds to the car portion 713a, and the car 714 corresponds to the car portion 714a.
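- The U map described above can be sketched in the same way as a histogram over image columns, counting only pixels that lie above the detected road surface. In this illustration `road_y_for_dp` is a hypothetical callable returning the road-surface row for a given parallax value (for example, from a fitted line), and integer parallax values with 0 meaning "invalid" are assumed.

```python
def build_u_map(parallax_image, max_dp, road_y_for_dp):
    """U map: one row per parallax value dp, one column per image x.

    Only pixels above the road surface are counted.
    """
    width = len(parallax_image[0])
    u_map = [[0] * width for _ in range(max_dp + 1)]
    for y, row in enumerate(parallax_image):
        for x, dp in enumerate(row):
            # Image y grows downward, so "above the road" means a smaller y.
            if 0 < dp <= max_dp and y < road_y_for_dp(dp):
                u_map[dp][x] += 1
    return u_map
```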
- The second generation unit 500 further generates a real U map RM, the Real U-Disparity map shown in FIG. 12(b), by converting the horizontal axis of the generated U map UM shown in FIG. 12(a) into actual distance.
- The real U map RM is a two-dimensional histogram whose horizontal axis is the actual distance in the direction from the imaging unit 10b (left camera) to the imaging unit 10a (right camera), and whose vertical axis is the parallax value dp of the parallax image (or the distance in the depth direction converted from the parallax value dp).
- When generating the real U map RM, the second generation unit 500 does not thin out the pixels for distant objects (small parallax value dp), because such objects appear small and provide little parallax information and low distance resolution. For nearby objects, which appear large and provide much parallax information and high distance resolution, it thins out the pixels heavily.
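- One way to picture the conversion of the horizontal axis into actual distance is the sketch below. It assumes the standard pinhole stereo relation, in which a pixel at column x with parallax dp lies at lateral distance (x - cx) * B / dp, where B is the baseline and cx the image center column; this relation and the parameters `baseline_mm`, `cx`, `bin_mm`, and `half_width_mm` are assumptions for illustration and are not taken from the embodiment. The distance-dependent thinning described above is omitted.

```python
def build_real_u_map(u_map, baseline_mm, cx, bin_mm, half_width_mm):
    """Re-bin a U map so that columns represent lateral distance in mm."""
    n_bins = int(2 * half_width_mm // bin_mm)
    real = [[0] * n_bins for _ in u_map]
    for dp, row in enumerate(u_map):
        if dp == 0:  # no valid parallax, so no distance can be derived
            continue
        for x, count in enumerate(row):
            if count == 0:
                continue
            lateral_mm = (x - cx) * baseline_mm / dp
            col = int((lateral_mm + half_width_mm) // bin_mm)
            if 0 <= col < n_bins:  # drop points outside the covered width
                real[dp][col] += count
    return real
```

- With this mapping, the same image column lands in different lateral bins depending on the parallax value, which is why an object keeps a roughly constant width on the real U map regardless of its distance.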
- the clustering processing unit 510 can extract a block (object) of pixel values from the real U map RM to detect an object.
- The width of the rectangle surrounding the block corresponds to the width of the extracted object, and its height corresponds to the depth of the extracted object.
- the second generation unit 500 is not limited to generating the real U map RM from the U map UM, and can also generate the real U map RM directly from parallax images.
- the image input from the parallax value derivation unit 3 to the second generation unit 500 is not limited to the reference image Ia, and the comparison image Ib may be used as a target.
- In this embodiment, a parallax image is described as an example of a distance image because the parallax value can be treated as equivalent to the distance value, but the present invention is not limited to this.
- For example, distance information from a laser radar or millimeter wave radar may be integrated with the above-described parallax values and associated with image coordinates to form a distance image, and clustering processing may be performed using this distance image.
- the clustering processing unit 510 is a functional unit that detects an object appearing in a parallax image based on each map input from the second generation unit 500.
- The clustering processing unit 510 can identify the position and width (xmin, xmax) of the object in the x-axis direction in the parallax image and the reference image Ia from the generated U map UM or real U map RM.
- the clustering processing unit 510 can specify the actual depth of the object from the information (dmin, dmax) of the height of the object in the generated U map UM or real U map RM.
- The clustering processing unit 510 can also identify the actual size of the object in the x-axis and y-axis directions from the width (xmin, xmax) in the x-axis direction and the height (ymin, ymax) in the y-axis direction of the object identified in the parallax image, together with the corresponding parallax values dp.
- In this way, the clustering processing unit 510 can use the V map VM, the U map UM, and the real U map RM to identify the position of the object in the reference image Ia as well as its actual width, height, and depth. Because the position of the object in the reference image Ia is identified, its position in the parallax image is also determined, and the distance to the object can be identified.
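- The conversion from image extents and parallax values to actual sizes is not spelled out in this passage. A minimal sketch, assuming the usual pinhole stereo relations (distance Z = B * f / dp, and one pixel spanning Z / f = B / dp millimeters at that distance), could look like this; the function names and the parameters `baseline_mm` and `focal_px` are hypothetical.

```python
def distance_mm(dp, baseline_mm, focal_px):
    """Distance to a point with parallax value dp (assumed pinhole model)."""
    return baseline_mm * focal_px / dp


def actual_size_mm(xmin, xmax, ymin, ymax, dmin, dmax, dp,
                   baseline_mm, focal_px):
    """Actual width, height, and depth of an object.

    (xmin, xmax) and (ymin, ymax) are pixel extents in the parallax image,
    (dmin, dmax) is the parallax extent on the U map, and dp is the
    parallax value representative of the object.
    """
    mm_per_px = baseline_mm / dp  # equals Z / f at the object's distance
    width = (xmax - xmin) * mm_per_px
    height = (ymax - ymin) * mm_per_px
    # A smaller parallax means a larger distance, so dmin gives the far side.
    depth = (distance_mm(dmin, baseline_mm, focal_px)
             - distance_mm(dmax, baseline_mm, focal_px))
    return width, height, depth
```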
- Finally, the clustering processing unit 510 creates detection frames 721a to 724a, corresponding to the detection regions 721 to 724 of the objects identified (detected) on the real U map RM, on the reference image Ia or the parallax image Ip, as shown in FIG.
- the clustering processing unit 510 can identify what the object is from the actual sizes (width, height, depth) specified for the object, using (Table 1) below. For example, if the width of the object is 1300 mm, the height is 1800 mm, and the depth is 2000 mm, the object can be identified as a “normal car”. It should be noted that the information that associates the width, height and depth as shown in Table 1 with the type of the object (object type) may be stored as a table in the RAM 54 or the like.
- The second generation unit 500 and the clustering processing unit 510 of the recognition processing unit 5 shown in FIG. 9 are each realized by the FPGA 51 shown in FIG. Note that some or all of the second generation unit 500 and the clustering processing unit 510 may instead be realized by the CPU 52 executing a program stored in the ROM 53, rather than by the FPGA 51, which is a hardware circuit.
- The tracking processing unit 520 is a functional unit that, based on recognition area information, which is information on the objects detected (recognized) by the clustering processing unit 510, either rejects each object or performs tracking processing on it.
- the specific configuration of the tracking processing unit 520 will be described later with reference to FIG.
- Here, rejection refers to excluding the object from the targets of subsequent processing (tracking processing and the like).
- The recognition area information is information on an object detected by the clustering processing unit 510. It includes, for example, the position and size of the detected object in the reference image Ia, the parallax image Ip, the V-Disparity map, the U-Disparity map, the Real U-Disparity map, and the like, the type of the detected object, and the above-mentioned rejection flag.
- the “image processing apparatus” may be the tracking processing unit 520 or the recognition processing unit 5 including the tracking processing unit 520.
- FIG. 14 is a diagram showing an example of a functional block configuration of a tracking processing unit of the recognition processing unit of the object recognition device according to the embodiment.
- the configuration and operation of functional blocks of the tracking processing unit 520 of the recognition processing unit 5 will be described with reference to FIG.
- the tracking processing unit 520 includes a movement prediction unit 600 (prediction unit), a matching unit 610, a check unit 620, a feature update unit 630, and a state transition unit 640.
- The movement prediction unit 600 is a functional unit that, using the history of movement and operating state of the objects newly detected by the clustering processing unit 510 together with the vehicle information, predicts, for each object tracked so far, a prediction region in which the object is highly likely to be present in the current luminance image (hereinafter sometimes simply called the "frame").
- The movement prediction unit 600 predicts the motion of the object in the xz plane (x: lateral position in the frame, z: distance) using the movement information (for example, the relative position history of the center of gravity and the relative speed history) up to the immediately preceding frame (hereinafter sometimes simply called the “previous frame”).
- The movement prediction unit 600 may also expand the prediction region beyond the one predicted previously in order to handle objects that move more than predicted.
- the above-described movement information may be included in the recognition area information for each detected object. In the following description, the recognition area information is described as including the above-described movement information.
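- The prediction model itself is not specified in this passage. As one simple illustration, the sketch below extrapolates the center of gravity in (x, z) at constant velocity from the last two history entries and enlarges a region by a fixed margin. The constant-velocity assumption, the function names, and the `(left, top, right, bottom)` region format are all hypothetical.

```python
def predict_position(history):
    """Constant-velocity extrapolation of the (x, z) center of gravity.

    `history` lists (x, z) positions, oldest first, and needs at least
    two entries.
    """
    (x0, z0), (x1, z1) = history[-2], history[-1]
    return x1 + (x1 - x0), z1 + (z1 - z0)


def expand_region(region, margin):
    """Enlarge a (left, top, right, bottom) region by `margin` on every side."""
    left, top, right, bottom = region
    return left - margin, top - margin, right + margin, bottom + margin
```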
- The matching unit 610 is a functional unit that performs template matching, within the prediction region predicted by the movement prediction unit 600, based on the degree of similarity with the feature amount (template) obtained in the previous frame, and obtains the position of the object in the frame being processed (hereinafter simply referred to as the "current frame").
- The matching unit 610 includes a determination unit 611 (determination unit), a first thinning processing unit 612, a first template matching unit 613, a correction processing unit 614, a third thinning processing unit 615, a second template matching unit 616, and a third template matching unit 617.
- the determination unit 611 is a functional unit that estimates the distance of the object corresponding to the recognition area information based on the recognition area information up to the previous frame, and determines whether the estimated distance is equal to or greater than a predetermined distance.
- Hereinafter, a distance equal to or greater than the predetermined distance may be referred to as a “long distance” (second distance range), and a distance less than the predetermined distance as a “short distance” (first distance range).
- the first thinning processing unit 612 is a functional unit that performs thinning processing on the image of the prediction area in the current frame predicted by the movement prediction unit 600 based on a predetermined thinning amount (first thinning amount).
- the first template matching unit 613 is a functional unit that performs template matching based on the template obtained in the previous frame, in the prediction area in which the first thinning processing unit 612 has performed thinning processing in the current frame.
- the correction processing unit 614 is a functional unit that performs correction processing on a frame (detection frame) of a detection area (second detection area) detected by template matching by the first template matching unit 613.
- The image inside the detection frame, after the correction processing unit 614 has corrected the detection frame of a specific object, becomes the detection area of that object in the current frame.
- the first thinning processing unit 612, the first template matching unit 613, and the correction processing unit 614 correspond to "first matching processing means" in the present invention.
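- The combination of thinning and template matching described above can be sketched as follows. This is an illustration under stated assumptions: thinning is modeled as keeping every `step`-th pixel, the best match is the position with the smallest sum of absolute differences (a dissimilarity measure standing in for the "degree of similarity"), and the function names are hypothetical. The correction processing of the correction processing unit 614 is not modeled.

```python
def thin(image, step):
    """Keep every `step`-th pixel in both directions."""
    return [row[::step] for row in image[::step]]


def match_template(area, template):
    """Slide `template` over `area`; return (score, left, top) of the best fit.

    The score is the sum of absolute differences, so lower is better.
    """
    th, tw = len(template), len(template[0])
    best = None
    for top in range(len(area) - th + 1):
        for left in range(len(area[0]) - tw + 1):
            score = sum(abs(area[top + r][left + c] - template[r][c])
                        for r in range(th) for c in range(tw))
            if best is None or score < best[0]:
                best = (score, left, top)
    return best
```

- Because both the prediction area and the template are thinned before the search, the number of positions and pixels compared shrinks roughly with the square of the thinning step.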
- The third thinning processing unit 615 is a functional unit that performs thinning processing on the image of the prediction area in the current frame predicted by the movement prediction unit 600, based on a predetermined thinning amount (second thinning amount; for example, an amount smaller than the thinning amount of the first thinning processing unit 612).
- the second template matching unit 616 is a functional unit that performs template matching based on the template obtained in the previous frame within the prediction area in which the third thinning processing unit 615 performed the thinning process on the current frame.
- The third template matching unit 617 is a functional unit that performs template matching based on a part template, described later, within the detection region (fourth detection region) of the object detected in the current frame by the template matching of the second template matching unit 616. The detection frame of the object in the current frame is corrected based on the position of the area found by the third template matching unit 617 to be similar to the part template.
- the third thinning processing unit 615, the second template matching unit 616, and the third template matching unit 617 correspond to the "second matching processing means" in the present invention.
- The check unit 620 is a functional unit that checks, based on the size of the detection area of the object detected by the matching unit 610, whether that size corresponds to the size of an object targeted for tracking (for example, a vehicle).
- The feature update unit 630 is a functional unit that creates and updates, from the image of the detection region of the object detected in the current frame, the feature amount (template) to be used in template matching by the first template matching unit 613 or the second template matching unit 616 in the next frame.
- The feature update unit 630 includes a second thinning processing unit 631 (first thinning unit), a first update unit 632, a fourth thinning processing unit 633 (second thinning unit), a second update unit 634, a part template selection unit 635 (selection unit), and a third update unit 636.
- The second thinning processing unit 631 is a functional unit that thins the image of the detection area (first detection area) of the object finally determined by the correction processing unit 614 in the current frame, based on a predetermined thinning amount (first thinning amount), to create a template (thinning template) (first template) to be used in the next frame.
- the first update unit 632 is a functional unit that updates (for example, stores in the RAM 54) the thinning template created by the second thinning processing unit 631 instead of the thinning template used last time.
- The fourth thinning processing unit 633 is a functional unit that performs thinning processing on the image of the detection region (third detection region) of the object finally determined by the third template matching unit 617 in the current frame, based on a predetermined thinning amount, to create a template (thinning template) (second template) to be used in the next frame.
- the second updating unit 634 is a functional unit that updates (for example, stores in the RAM 54) the thinning template created by the fourth thinning processing unit 633 instead of the thinning template used last time.
- the part template selection unit 635 is a functional unit that selects a partial image (part template) that satisfies a predetermined condition from the image of the detection area of the object finally determined by the third template matching unit 617 in the current frame.
- the third update unit 636 is a functional unit that updates (for example, stores in the RAM 54) the part template selected by the part template selection unit 635 instead of the part template used last time.
- the state transition unit 640 is a functional unit that changes the state of an object according to the state of the detection region of the object finally determined by the correction processing unit 614 or the third template matching unit 617.
- The state transition unit 640 outputs, to the vehicle control device 6, recognition area information reflecting the state of the object after the transition. For example, when the check by the check unit 620 finds that the size of the detection area does not correspond to the size of an object targeted for tracking, the state transition unit 640 may include in the recognition area information a flag indicating exclusion from the tracking targets (rejection), thereby moving the object to a non-tracked state.
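- The interplay between the size check and the rejection flag can be illustrated with a small sketch. The dictionary layout of the recognition area information, the key names, and the size limits are hypothetical and chosen only for the example; the embodiment does not state concrete limits here.

```python
def apply_size_check(info, min_wh, max_wh):
    """Return a copy of the recognition area information with a rejection flag.

    `info` holds the detection area size under "width" and "height";
    `min_wh` and `max_wh` are (width, height) limits for trackable objects.
    """
    width, height = info["width"], info["height"]
    in_range = (min_wh[0] <= width <= max_wh[0]
                and min_wh[1] <= height <= max_wh[1])
    updated = dict(info)  # leave the caller's record untouched
    updated["rejected"] = not in_range
    return updated
```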
- the updating unit 636 and the state transition unit 640 are each realized by the FPGA 51 shown in FIG. Note that some or all of these functional units may be realized by the CPU 52 executing a program stored in the ROM 53 instead of the FPGA 51 which is a hardware circuit.
- Each functional unit of tracking processing unit 520 shown in FIG. 14 conceptually shows a function, and is not limited to such a configuration.
- a plurality of functional units illustrated as independent functional units in the tracking processing unit 520 illustrated in FIG. 14 may be configured as one functional unit.
- Conversely, the function of a single functional unit in the tracking processing unit 520 shown in FIG. 14 may be divided among a plurality of functional units.
- FIG. 15 is a flowchart illustrating an example of the block matching process performed by the disparity value deriving unit according to the embodiment. The flow of the block matching process of the disparity value deriving unit 3 of the object recognition device 1 will be described with reference to FIG.
- <Step S1-1> The image acquisition unit 100b of the parallax value derivation unit 3 captures an object ahead with the left camera (imaging unit 10b), generates an analog image signal, and obtains a luminance image based on that image signal. This yields the image signal to be processed in the subsequent stage. Then, the process proceeds to step S2-1.
- <Step S1-2> The image acquisition unit 100a of the parallax value derivation unit 3 captures an object ahead with the right camera (imaging unit 10a), generates an analog image signal, and obtains a luminance image based on that image signal. This yields the image signal to be processed in the subsequent stage. Then, the process proceeds to step S2-2.
- <Step S2-1> The conversion unit 200b of the parallax value derivation unit 3 removes noise from the analog image signal obtained by the imaging unit 10b and converts it into digital image data. Converting to digital image data makes it possible to process the image pixel by pixel. Then, the process proceeds to step S3-1.
- <Step S2-2> The conversion unit 200a of the parallax value derivation unit 3 removes noise from the analog image signal obtained by the imaging unit 10a and converts it into digital image data. Converting to digital image data makes it possible to process the image pixel by pixel. Then, the process proceeds to step S3-2.
- <Step S3-1> The conversion unit 200b outputs an image based on the digital image data converted in step S2-1 as the comparison image Ib for the block matching process. This yields the image to be compared when obtaining the parallax value in the block matching process. Then, the process proceeds to step S4.
- <Step S3-2> The conversion unit 200a outputs an image based on the digital image data converted in step S2-2 as the reference image Ia for the block matching process. This yields the image that serves as the reference when obtaining the parallax value in the block matching process. Then, the process proceeds to step S4.
- <Step S4> The cost calculation unit 301 of the parallax value calculation processing unit 300 of the parallax value derivation unit 3 calculates the cost value C(p, d) of each candidate pixel q(x + d, y). The calculation uses the luminance value of the reference pixel p(x, y) in the reference image Ia and the luminance value of each candidate pixel q(x + d, y), a candidate for the corresponding pixel, which is identified by shifting by the shift amount d from the pixel corresponding to the position of the reference pixel p(x, y) on the epipolar line EL in the comparison image Ib.
- Specifically, the cost calculation unit 301 performs block matching between a reference area pb, a predetermined area centered on the reference pixel p of the reference image Ia, and a candidate area qb (the same size as the reference area pb) centered on the candidate pixel q of the comparison image Ib, and calculates their degree of dissimilarity as the cost value C. Then, the process proceeds to step S5.
- <Step S5> The determination unit 302 of the parallax value calculation processing unit 300 of the parallax value derivation unit 3 determines the shift amount d corresponding to the minimum of the cost values C calculated by the cost calculation unit 301 as the parallax value dp for the pixel of the reference image Ia for which the cost value C was calculated.
- The first generation unit 303 of the parallax value calculation processing unit 300 of the parallax value derivation unit 3 then generates, based on the parallax value dp determined by the determination unit 302, a parallax image, that is, an image in which the luminance value of each pixel of the reference image Ia is represented by the parallax value dp corresponding to that pixel.
- The first generation unit 303 outputs the generated parallax image to the recognition processing unit 5.
- Although the block matching process has been described above as an example, the present invention is not limited to this, and processing using the SGM (Semi-Global Matching) method may be used instead.
- FIG. 16 is a flowchart illustrating an example of tracking processing performed by the tracking processing unit of the recognition processing unit according to the embodiment.
- FIG. 17 is a diagram for explaining the movement prediction operation. The flow of tracking processing of the tracking processing unit 520 of the recognition processing unit 5 will be described with reference to FIGS.
- <Step S11> The movement prediction unit 600 of the tracking processing unit 520 uses recognition area information, which includes the history of movement and operating state of the objects newly detected by the clustering processing unit 510 in the preceding stage and the vehicle information, to identify, for each object tracked so far, a prediction region 800 in which the object is highly likely to be present in the current frame (reference image Ia), as shown in FIG. 17. Then, the process proceeds to step S12.
- <Step S12> The matching unit 610 of the tracking processing unit 520 performs template matching within the prediction region 800 based on the degree of similarity with the feature amount (template) obtained in the previous frame, and detects the object in the current frame. The details of the matching process by the matching unit 610 will be described later with reference to FIGS. Then, the process proceeds to step S13.
- <Step S13> The check unit 620 of the tracking processing unit 520 checks, based on the size of the detection area of the object detected by the matching unit 610, whether that size corresponds to the size of an object targeted for tracking (for example, a vehicle). Then, the process proceeds to step S14.
- <Step S14> The feature update unit 630 of the tracking processing unit 520 creates and updates, from the image of the detection region of the object detected in the current frame, the feature amounts (templates) used in template matching by the first template matching unit 613, or by the second template matching unit 616 and the third template matching unit 617, in the next frame. Details of the feature update processing by the feature update unit 630 will be described later with reference to FIGS. Then, the process proceeds to step S15.
- <Step S15> The state transition unit 640 of the tracking processing unit 520 changes the state of the object according to the state of the detection region of the object finally determined by the correction processing unit 614 or the third template matching unit 617.
- The state transition unit 640 then outputs, to the vehicle control device 6, recognition area information reflecting the state of the object after the transition.
- the tracking process by the tracking processing unit 520 is performed by the processes in steps S11 to S15 described above.
- the processing in steps S11 to S15 is performed for each detection area of the object detected by the clustering processing unit 510.
- FIG. 18 is a flowchart illustrating an example of the matching processing operation of the tracking processing of the tracking processing unit according to the embodiment. The flow of the matching process of the matching unit 610 of the tracking processing unit 520 will be described with reference to FIG.
- <Step S121> The determination unit 611 of the matching unit 610 estimates the distance of the object corresponding to the recognition area information based on the recognition area information up to the previous frame, and determines whether the estimated distance is equal to or greater than a predetermined distance. If the estimated distance is a short distance, less than the predetermined distance (step S121: short distance), the process proceeds to step S122. If it is a long distance, equal to or greater than the predetermined distance (step S121: long distance), the process proceeds to step S123.
- <Step S122> The first thinning processing unit 612, the first template matching unit 613, and the correction processing unit 614 of the matching unit 610 perform rough matching processing using a template based on the detection area detected in the previous frame. The details of the rough matching process will be described later with reference to FIGS. Then, the matching process ends.
- <Step S123> The third thinning processing unit 615, the second template matching unit 616, and the third template matching unit 617 of the matching unit 610 perform part matching processing using a template based on the detection area detected in the previous frame. Details of the part matching process will be described later with reference to FIGS. Then, the matching process ends.
- The matching process by the matching unit 610 of the tracking processing unit 520 (the matching process within the tracking process) is performed by steps S121 to S123 above. Because the tracking process including this matching process is executed repeatedly, the matching method applied to an object detected by either the rough matching process or the part matching process may be switched according to the distance estimated next. For example, when an object whose estimated distance was short comes to have a long estimated distance as time passes, the matching method is switched to the part matching process.
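- The branch in step S121 amounts to a threshold test that is re-evaluated every frame. A minimal sketch follows; the function name, the string labels, and the use of meters are hypothetical.

```python
def select_matching(estimated_distance_m, threshold_m):
    """Choose the matching method from the distance estimated in step S121.

    A distance at or beyond the threshold counts as "long distance" and
    selects part matching; anything closer selects rough matching.
    """
    return "part" if estimated_distance_m >= threshold_m else "rough"
```

- Applied frame by frame to a receding object, for example `[select_matching(d, 30) for d in (12, 25, 31)]`, the selected method changes from rough matching to part matching once the estimate reaches the threshold.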
- FIG. 19 is a flowchart illustrating an example of operation of feature update processing in the case of performing rough matching in the tracking processing of the tracking processing unit of the embodiment.
- FIG. 20 is a diagram for explaining thinning-out processing on an image of a detection area in feature update processing in the case of performing rough matching in the tracking processing unit of the embodiment.
- the flow of the operation of the feature update process of the feature update unit 630 when the rough matching process is performed in the matching unit 610 will be described with reference to FIGS. 19 and 20.
- the feature update process shown in FIG. 19 is a feature update process executed in step S14 when the rough matching process is performed in step S12 in FIG.
- Step S141: The second thinning processing unit 631 of the feature update unit 630 determines the thinning amount for creating a thinning template from the detection area of the object detected in the current frame by the rough matching process of the matching unit 610.
- For example, assume that the detection area 810 shown in FIG. 20(a) is the detection area of an object (vehicle) detected by the rough matching process, with a width of Wd [pixels] in the horizontal direction and a height of Hd [pixels] in the vertical direction. Assume further that the thinning template 811 shown in FIG. 20(b) is the image after thinning by the second thinning processing unit 631, with a width of Wd_s [pixels] and a height of Hd_s [pixels].
- In this case, the second thinning processing unit 631 thins the detection area 810 so that the height Hd_s of the thinning template 811 becomes a fixed value c [pixels] (c < Hd) and the width-to-height ratio of the thinning template 811 matches that of the detection area 810. That is, the thinning amount, namely the height Hd_s and the width Wd_s of the thinning template 811, is calculated by the following (Equation 3).
- FH in (Equation 3) is the ratio of the height Hd_s to the height Hd, and FW is the ratio of the width Wd_s to the width Wd.
- Fixing the template height in this way reduces the dependence of the processing speed of the matching process in the next frame on the size of the detection area 810 detected in the current frame.
- Although the thinning amount is determined here so that the height Hd_s of the thinning template 811 becomes a fixed value, this is not a limitation; the thinning amount may instead be determined so that the width Wd_s becomes a fixed value.
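The determination of the thinning amount in step S141 can be sketched as follows. (Equation 3) itself is not reproduced in this text, so the sketch assumes the form implied by the definitions above: Hd_s = c, Wd_s = Wd × c / Hd, FH = Hd_s / Hd, and FW = Wd_s / Wd. The function name `thinning_amount` and the rounding of the width are illustrative assumptions, not part of the embodiment.

```python
def thinning_amount(wd, hd, c):
    """Return (wd_s, hd_s, fh, fw) for a detection area of wd x hd pixels.

    Assumed form of (Equation 3): the template height is fixed to c and the
    width-to-height ratio of the detection area is preserved.
    """
    if not 0 < c < hd:
        raise ValueError("c must satisfy 0 < c < hd")
    hd_s = c
    # Keep the aspect ratio; rounding to whole pixels is an assumption.
    wd_s = max(1, round(wd * hd_s / hd))
    fh = hd_s / hd  # ratio of thinned height to original height
    fw = wd_s / wd  # ratio of thinned width to original width
    return wd_s, hd_s, fh, fw
```

For a 120 × 80 pixel detection area and c = 20, this gives a 30 × 20 pixel template with FH = FW = 0.25. Because the template height no longer depends on the detection area, the cost of matching in the next frame stays roughly constant.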
- Step S142: The second thinning processing unit 631 thins the detection area 810 by the thinning amount determined (calculated) by (Equation 3) above and creates the thinning template 811. The process then proceeds to step S143.
- Step S143: The first update unit 632 of the feature update unit 630 replaces the thinning template used in the previous rough matching process with the thinning template 811 created by the second thinning processing unit 631 (for example, by storing it in the RAM 54). The thinning template 811 created here is used in the rough matching process for the next frame.
- The first update unit 632 also stores the ratios FH and FW of (Equation 3), calculated in step S141, in the RAM 54 or the like. These ratios are used to thin the image of the prediction area in the next frame (described later). The feature update process then ends.
- The feature update process of the feature update unit 630 for the case where the matching unit 610 performs the rough matching process is carried out by steps S141 to S143 above.
- FIG. 21 is a flowchart illustrating an example of rough matching processing operation in the tracking processing of the tracking processing unit according to the embodiment.
- FIG. 22 is a diagram for explaining thinning processing on an image of a prediction area in rough matching processing in tracking processing of the tracking processing unit according to the embodiment.
- FIG. 23 is a diagram for explaining frame correction processing in the rough matching processing in the tracking processing of the tracking processing unit according to the embodiment.
- The rough matching process shown in FIG. 21 is the rough matching process executed in step S122 in FIG. 18.
- Step S1221: The first thinning processing unit 612 of the matching unit 610 thins the image of the prediction area 800 (see FIG. 22(a)) in the current frame, predicted by the movement prediction unit 600, by a predetermined thinning amount to obtain the thinning prediction area 801 shown in FIG. 22(b). Specifically, using the ratios FH and FW of (Equation 3) stored in the RAM 54 or the like by the first update unit 632 for the previous frame, the first thinning processing unit 612 thins the prediction area 800 of height Hp and width Wp into a thinning prediction area 801 of height Hp_s and width Wp_s calculated by the following (Equation 4).
- Thus, the thinning ratio applied to the prediction area 800 by the first thinning processing unit 612 in the current frame is the same as the thinning ratio applied to the detection area 810 by the second thinning processing unit 631 in the previous frame. If the prediction area 800 were thinned at a different ratio, an object whose actual size has not changed would appear at a different size in the thinning prediction area than in the template, and template matching could not detect it accurately. Because the thinning prediction area 801 is created at the same thinning ratio, the object can be detected accurately. The process then proceeds to step S1222.
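The thinning of the prediction area with the stored ratios can be sketched as follows. (Equation 4) is not reproduced in this text, so the sketch assumes Hp_s = Hp × FH and Wp_s = Wp × FW. Nearest-neighbour sampling is used here as one possible way to thin pixels; the embodiment does not state which sampling is used, and the function name `thin_image` is hypothetical.

```python
def thin_image(image, fh, fw):
    """Thin a 2-D image (list of rows) to about fh of its rows and fw of its columns.

    Assumed form of (Equation 4): Hp_s = Hp * fh, Wp_s = Wp * fw.
    """
    h, w = len(image), len(image[0])
    h_s = max(1, round(h * fh))
    w_s = max(1, round(w * fw))
    # Nearest-neighbour sampling: pick the source row/column for each output pixel.
    rows = [min(h - 1, int(y / fh)) for y in range(h_s)]
    cols = [min(w - 1, int(x / fw)) for x in range(w_s)]
    return [[image[r][c] for c in cols] for r in rows]
```

Applying the same `fh` and `fw` to both the detection area of the previous frame and the prediction area of the current frame keeps the object at the same scale in the template and in the search image.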
- Step S1222: The first template matching unit 613 of the matching unit 610 performs template matching, within the thinning prediction area 801 created by the first thinning processing unit 612 in the current frame, using the thinning template 811 updated by the first update unit 632 for the previous frame.
- The sum of absolute differences (SAD) or the like can be used as the evaluation value indicating the similarity to the thinning template 811 within the thinning prediction area 801.
- The first template matching unit 613 raster-scans the thinning prediction area 801 while calculating the SAD against the thinning template 811 and obtains the position of the image with the smallest SAD.
- As the position of the image with the smallest SAD in the thinning prediction area 801, it obtains, for example, the position (px_thin, py_thin) of the upper-left corner of that image within the thinning prediction area 801.
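The raster scan with SAD described above can be sketched as follows. This is a minimal, unoptimized illustration on plain lists of pixel values; the function name `sad_match` and the return format are assumptions made for this example.

```python
def sad_match(region, template):
    """Raster-scan `template` over `region` and return (px, py, sad) of the best match.

    Both arguments are 2-D lists of pixel values. (px, py) is the upper-left
    corner of the window with the smallest sum of absolute differences.
    Returns None if the template does not fit inside the region.
    """
    rh, rw = len(region), len(region[0])
    th, tw = len(template), len(template[0])
    best = None
    for py in range(rh - th + 1):          # top to bottom
        for px in range(rw - tw + 1):      # left to right
            sad = sum(
                abs(region[py + j][px + i] - template[j][i])
                for j in range(th)
                for i in range(tw)
            )
            if best is None or sad < best[2]:
                best = (px, py, sad)
    return best
```

The cost of the scan grows with the product of the region size and the template size, which is why thinning both images beforehand shortens the matching time.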
- The first template matching unit 613 then uses (Equation 5) below to calculate, from the position (px_thin, py_thin) of the image detected in the thinning prediction area 801, the position (px, py) in the prediction area 800 of the frame of the detected image restored to its size before thinning.
- In this way, the first template matching unit 613 obtains the position of the frame (detection frame 820 shown in FIG. 23) of the image (detection area) of the object detected in the prediction area 800 in the current frame, and hence the position of the detection area in the current frame.
- Because the first template matching unit 613 matches the thinning template 811 against the thinning prediction area 801, both thinned by the same amount, image processing is performed on images with fewer pixels than the original frame, which improves processing speed.
- Note that the position of the detection area in the prediction area 800 obtained here is the position obtained by restoring to its original size an image detected by template matching in the thinning prediction area 801, so it contains a quantization error. The process then proceeds to step S1223.
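The restoration to the size before thinning can be sketched as follows. (Equation 5) is not reproduced in this text, so the sketch assumes px = px_thin / FW and py = py_thin / FH, with the frame size scaled back in the same way. The function name `restore_frame` and the rounding are illustrative assumptions.

```python
def restore_frame(px_thin, py_thin, tw_s, th_s, fh, fw):
    """Map a match found in the thinned prediction area back to the prediction area.

    Assumed form of (Equation 5): px = px_thin / fw, py = py_thin / fh.
    tw_s and th_s are the width and height of the thinning template.
    Returns (px, py, width, height) in pixels of the unthinned prediction area.
    """
    px = round(px_thin / fw)
    py = round(py_thin / fh)
    width = round(tw_s / fw)
    height = round(th_s / fh)
    return px, py, width, height
```

With FW = FH = 0.25, a one-pixel step in the thinned image corresponds to four pixels in the original, so the restored position can be off by a few pixels. This is the quantization error that the subsequent correction is intended to mitigate.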
- Step S1223: The correction processing unit 614 of the matching unit 610 corrects the frame (detection frame 820 shown in FIG. 23) of the image (detection area) of the object detected in the current frame by the template matching of the first template matching unit 613.
- Specifically, for the image on the parallax image (the parallax image corresponding to the current frame) that corresponds to the detection frame 820 detected by the first template matching unit 613, the correction processing unit 614 creates a histogram 900 indicating the frequency of pixels having parallax values along the X direction and a histogram 901 indicating the frequency of pixels having parallax values along the Y direction.
- The correction processing unit 614 then takes the X-direction positions at which the histogram 900 exceeds the threshold Th as the left and right ends of the corrected detection frame 821, and the Y-direction positions at which the histogram 901 exceeds the threshold Th as its upper and lower ends.
- The threshold Th may be, for example, 10 to 20 [%] of the maximum value of the histogram. Although FIG. 23 uses the same threshold Th for the X direction and the Y direction, the two thresholds need not be the same.
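The frame correction based on the two histograms can be sketched as follows. The sketch assumes that pixels without a parallax value are encoded as 0 and that the threshold is 15 % of each histogram's maximum; both are illustrative choices within the range mentioned above, and the function name `correct_frame` is hypothetical.

```python
def correct_frame(parallax, invalid=0, th_ratio=0.15):
    """Shrink a detection frame to the span where parallax pixels are frequent.

    `parallax` is the 2-D parallax image cut out by the detection frame.
    Returns (left, top, right, bottom), inclusive and relative to that frame,
    or None if the frame contains no parallax values at all.
    """
    height, width = len(parallax), len(parallax[0])
    # Histogram along X: number of pixels with a parallax value in each column.
    hist_x = [sum(1 for y in range(height) if parallax[y][x] != invalid)
              for x in range(width)]
    # Histogram along Y: number of pixels with a parallax value in each row.
    hist_y = [sum(1 for v in row if v != invalid) for row in parallax]

    def span(hist):
        peak = max(hist)
        if peak == 0:
            return None
        th = th_ratio * peak
        hits = [i for i, n in enumerate(hist) if n > th]
        return hits[0], hits[-1]

    xs, ys = span(hist_x), span(hist_y)
    if xs is None or ys is None:
        return None
    return xs[0], ys[0], xs[1], ys[1]
```

Columns and rows that contain only a few stray parallax pixels fall below the threshold and are excluded, so the corrected frame follows the object more tightly than the restored frame.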
- The image within the detection frame 821 corrected by the correction processing unit 614 becomes the detection area finally detected by the rough matching process of the matching unit 610.
- The correction processing unit 614 includes the information on the detection area of the detected object (its position relative to the frame, its size, and the like) in the recognition area information of the object.
- Correcting the detection frame 820, which is the frame of the detection area, in this way mitigates the quantization error described above.
- In addition, because the template matching by the first template matching unit 613 is performed on the prediction area 800 rather than on the entire frame, processing speed improves. The rough matching process then ends.
- The rough matching process of the matching unit 610 is carried out by steps S1221 to S1223 above.
- FIG. 24 is a flowchart showing an example of operation of feature update processing in the case of performing part matching in the tracking processing of the tracking processing unit of the embodiment.
- FIG. 25 is a flowchart illustrating an example of an operation of a process of selecting a part template in the feature update process in the case of performing the part matching of the tracking processing unit of the embodiment.
- FIG. 26 is a diagram for explaining part template selection processing. The flow of the feature update process of the feature update unit 630 when the matching unit 610 performs the part matching process will be described with reference to FIGS. 24 to 26.
- The feature update process shown in FIG. 24 is the feature update process executed in step S14 when the part matching process is performed in step S12 in FIG. 16.
- Step S145: The fourth thinning processing unit 633 of the feature update unit 630 determines the thinning amount for creating a thinning template from the detection area of the object detected in the current frame by the part matching process of the matching unit 610.
- The method by which the fourth thinning processing unit 633 determines the thinning amount is the same as that used by the second thinning processing unit 631 in step S141 of FIG. 19.
- However, the feature update process shown in FIG. 24 is executed when the determination unit 611 has judged the object to be at a long distance, where the object appears smaller in the frame than at a short distance, so the thinning amount is smaller than that of the second thinning processing unit 631.
- For example, the height Hd_s of the thinning template is set to a fixed value smaller than the fixed value c. The process then proceeds to step S146.
- Step S146: The fourth thinning processing unit 633 thins the detection area detected in the current frame by the thinning amount determined (calculated) in step S145 and creates a thinning template. The process then proceeds to step S147.
- Step S147: The second update unit 634 of the feature update unit 630 replaces the thinning template used in the previous part matching process with the thinning template created by the fourth thinning processing unit 633 (for example, by storing it in the RAM 54).
- The thinning template created here is used in the part matching process for the next frame.
- The second update unit 634 also stores in the RAM 54 or the like the ratios calculated in step S145, which correspond to the ratios FH and FW of (Equation 3) in step S141 of FIG. 19. These ratios are used to thin the image of the prediction area in the next frame (described later). The process then proceeds to step S148.
- Step S148: The part template selection unit 635 of the feature update unit 630 selects partial images (part templates) that satisfy a predetermined condition from the image of the detection area of the object finally determined by the third template matching unit 617 in the current frame.
- The selection process of the part template selection unit 635 is described in detail through steps S1481 to S1488 of FIG. 25 and with reference to FIG. 26.
- Step S1481: The part template selection unit 635 creates temporary frames 840 and 841 at the upper-left and lower-right corners, respectively, of the detection area 830 shown in FIG. 26(a), which was detected by the part matching process in the current frame. The process then proceeds to step S1482.
- Step S1482: In the image on the parallax image (the parallax image corresponding to the current frame) that corresponds to the temporary frame 840 at the upper-left corner of the detection area 830, the part template selection unit 635 counts the number of pixels having valid parallax values (hereinafter sometimes referred to as the "parallax score").
- Here, a pixel having a valid parallax value means, for example, a pixel other than one that has no parallax value or one that is very far away, that is, one with a very small parallax value. The process then proceeds to step S1483.
- Step S1483: The part template selection unit 635 determines whether the ratio of the area occupied by the counted pixels (the parallax score) to the area of the temporary frame 840 is equal to or greater than a predetermined threshold. If the ratio is equal to or greater than the threshold (step S1483: YES), the process proceeds to step S1485. If the ratio is less than the threshold (step S1483: NO), the process proceeds to step S1484.
- For example, the temporary frame image 850, which is the image within the temporary frame 840 in the state shown in FIG. 26(a), contains few valid parallax values; that is, little of the object (vehicle) is contained in the temporary frame image 850, so effective template matching cannot be performed in the part matching process described later. The process of step S1484, described next, is therefore required.
- Step S1484: The part template selection unit 635 shifts the temporary frame 840 from its current position toward the inside of the detection area 830 by a predetermined amount.
- For example, the part template selection unit 635 may shift the temporary frame 840 N pixels to the right and N pixels downward from its current position (N is a predetermined value), or may shift it by a predetermined amount toward the center of the detection area 830. The process then returns to step S1482.
- Step S1485: In the image on the parallax image (the parallax image corresponding to the current frame) that corresponds to the temporary frame 841 at the lower-right corner of the detection area 830, the part template selection unit 635 counts the number of pixels having valid parallax values (the parallax score). The process then proceeds to step S1486.
- Step S1486: The part template selection unit 635 determines whether the ratio of the area occupied by the counted pixels (the parallax score) to the area of the temporary frame 841 is equal to or greater than a predetermined threshold. If the ratio is equal to or greater than the threshold (step S1486: YES), the process proceeds to step S1488. If the ratio is less than the threshold (step S1486: NO), the process proceeds to step S1487. For example, the temporary frame image 851, which is the image within the temporary frame 841 in the state shown in FIG. 26(a), contains many valid parallax values; that is, much of the object (vehicle) is contained in the temporary frame image 851, so effective template matching can be performed in the part matching process described later.
- Step S1487: The part template selection unit 635 shifts the temporary frame 841 from its current position toward the inside of the detection area 830 by a predetermined amount.
- For example, the part template selection unit 635 may shift the temporary frame 841 N pixels to the left and N pixels upward from its current position (N is a predetermined value), or may shift it by a predetermined amount toward the center of the detection area 830. The process then returns to step S1485.
- Step S1488: As a result of steps S1481 to S1487 above, the part template selection unit 635 selects the images of the two temporary frames at their current positions in the detection area 830 as the part templates.
- For the temporary frame 840, whose ratio of valid parallax pixels was initially below the predetermined threshold, the image 850a within the temporary frame 840a, which was shifted toward the inside of the detection area 830 by step S1484 until the ratio reached the threshold, is selected as a part template, as shown in FIG. 26(b).
- For the temporary frame 841, whose ratio was equal to or greater than the threshold from the start, the temporary frame image 851 within the temporary frame 841 is selected as a part template as is.
- The selection process of the part template selection unit 635 is carried out by steps S1481 to S1488 above. The process then returns to the flow of FIG. 24 and proceeds to step S149.
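The loop of counting the parallax score and shifting a temporary frame inward (steps S1482 to S1484, and likewise S1485 to S1487) can be sketched as follows. The sketch assumes a square temporary frame, a precomputed mask in which 1 marks a pixel with a valid parallax value, and a threshold ratio of 0.5; these values and the function name `select_part_frame` are hypothetical.

```python
def select_part_frame(valid, x, y, size, dx, dy, min_ratio=0.5):
    """Shift a size x size temporary frame until it holds enough valid parallax pixels.

    `valid` is a 2-D mask of the detection area (1 = valid parallax, 0 = not).
    (x, y) is the upper-left corner of the initial temporary frame and
    (dx, dy) the shift applied when the ratio is too low, for example
    (+N, +N) for the upper-left frame and (-N, -N) for the lower-right one.
    Returns the upper-left corner of the selected frame, or None if the
    frame leaves the detection area before the condition is met.
    """
    h, w = len(valid), len(valid[0])
    while 0 <= x <= w - size and 0 <= y <= h - size:
        # Parallax score: number of valid parallax pixels inside the frame.
        score = sum(valid[y + j][x + i] for j in range(size) for i in range(size))
        if score / (size * size) >= min_ratio:
            return x, y
        x, y = x + dx, y + dy  # shift toward the inside of the detection area
    return None
```

Calling the function once from the upper-left corner with a positive shift and once from the lower-right corner with a negative shift yields the two part templates.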
- Step S149: The third update unit 636 of the feature update unit 630 replaces the two part templates used in the previous part matching process with the two part templates selected by the part template selection unit 635 (for example, by storing them in the RAM 54).
- The part templates selected here are used in the part matching process for the next frame. The feature update process then ends.
- FIG. 27 is a flowchart illustrating an example of the part matching process in the tracking process of the tracking processing unit according to the embodiment.
- FIG. 28 is a diagram for explaining part matching processing. The flow of the part matching process of the matching unit 610 in the tracking process will be described with reference to FIGS. 27 and 28.
- The part matching process shown in FIG. 27 is the part matching process executed in step S123 in FIG. 18.
- Step S1231: The third thinning processing unit 615 of the matching unit 610 thins the image of the prediction area 800 (see FIG. 28(b)) in the current frame (the reference image Ia shown in FIG. 28(a)), predicted by the movement prediction unit 600, by a predetermined thinning amount (for example, an amount smaller than that of the first thinning processing unit 612) to obtain a thinning prediction area.
- Specifically, using the ratios stored in the RAM 54 or the like by the second update unit 634 for the previous frame (see step S147 in FIG. 24), the third thinning processing unit 615 thins the prediction area in the same way as in step S1221 of FIG. 21.
- Thus, the thinning ratio applied to the prediction area 800 by the third thinning processing unit 615 in the current frame is the same as the thinning ratio applied to the detection area by the fourth thinning processing unit 633 in the previous frame.
- If the prediction area 800 were thinned at a different ratio, an object whose actual size has not changed would appear at a different size in the thinning prediction area than in the template, and template matching could not detect it accurately. Because the thinning prediction area is created at the same thinning ratio, the object can be detected accurately. The process then proceeds to step S1232.
- Step S1232: The second template matching unit 616 of the matching unit 610 performs template matching, within the thinning prediction area created by the third thinning processing unit 615 in the current frame, using the thinning template updated by the second update unit 634 for the previous frame.
- SAD or the like can be used as the evaluation value indicating the similarity to the thinning template within the thinning prediction area.
- The second template matching unit 616 raster-scans the thinning prediction area while calculating the SAD against the thinning template and obtains the position of the image with the smallest SAD. As that position, it obtains, for example, the position of the upper-left corner of the image within the thinning prediction area.
- The second template matching unit 616 then uses (Equation 5) above to calculate, from the position of the image detected in the thinning prediction area, the position in the prediction area 800 of the frame of the detected image restored to its size before thinning.
- In this way, the second template matching unit 616 obtains the position of the frame of the image of the object detected in the prediction area 800 (the detection area 860 shown in FIG. 28(c)) in the current frame.
- Because the second template matching unit 616 matches a thinning template against a thinning prediction area thinned by the same amount, image processing is performed on images with fewer pixels than the original frame, which improves processing speed.
- Note that the position of the detection area in the prediction area 800 obtained here is the position obtained by restoring to its original size an image detected by template matching in the thinning prediction area, so it contains a quantization error. The process then proceeds to step S1233.
- Step S1233: The third template matching unit 617 of the matching unit 610 performs template matching, within the detection area 860 detected in the current frame by the template matching of the second template matching unit 616, based on the two part templates updated by the third update unit 636 for the previous frame (part templates 870 and 871 shown in FIG. 28(d)). That is, the third template matching unit 617 detects, in the detection area 860, the images that match, or can be regarded as matching, the part templates 870 and 871 (hereinafter simply "matching").
- The part templates 870 and 871 correspond to the temporary frame images 850a and 851 shown in FIG. 26(b).
- The third template matching unit 617 raster-scans the detection area 860 while calculating the SAD against each of the part templates 870 and 871 and obtains the position of the image with the smallest SAD in each case.
- The third template matching unit 617 then corrects the detection area 860 based on the positions of the images in the detection area 860 that match the part templates 870 and 871, respectively (see FIG. 28).
- Specifically, the third template matching unit 617 takes the X coordinate of the upper-left corner of the image matching the part template 870 in the detection area 860 as the left end of the corrected detection area, and the Y coordinate as its upper end.
- Likewise, the third template matching unit 617 takes the X coordinate of the lower-right corner of the image matching the part template 871 in the detection area 860 as the right end of the corrected detection area, and the Y coordinate as its lower end.
- The third template matching unit 617 sets the area defined by the left, upper, right, and lower ends obtained from the part templates 870 and 871 as the corrected detection area. The detection area thus corrected by the third template matching unit 617 becomes the detection area finally detected by the part matching process of the matching unit 610. The third template matching unit 617 then includes the information on the detection area of the detected object (its position relative to the frame, its size, and the like) in the recognition area information of the object.
- This correction of the detection area 860 by the third template matching unit 617 determines the four corners of the final detection area accurately, which mitigates the quantization error described above. As a result, an object at a long distance can be tracked stably.
- In addition, because the template matching by the second template matching unit 616 is performed on the prediction area 800 rather than on the entire frame, processing speed improves. The part matching process then ends.
- The part matching process of the matching unit 610 is carried out by steps S1231 to S1233 above.
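The correction of the detection area from the two part template matches in step S1233 can be sketched as follows. The sketch assumes that each match is reported by the upper-left corner of the matching image and that both part templates have the same size; the function name `corrected_area` and the inclusive coordinate convention are illustrative assumptions.

```python
def corrected_area(match_upper_left, match_lower_right, part_w, part_h):
    """Build the corrected detection area from the two part template matches.

    match_upper_left:  (x, y) upper-left corner of the image matching the
                       part template taken from the upper-left of the object.
    match_lower_right: (x, y) upper-left corner of the image matching the
                       part template taken from the lower-right of the object.
    part_w, part_h:    width and height of the part templates in pixels.
    Returns (left, top, right, bottom), inclusive.
    """
    left, top = match_upper_left
    # The lower-right corner of the second match gives the right and lower ends.
    right = match_lower_right[0] + part_w - 1
    bottom = match_lower_right[1] + part_h - 1
    return left, top, right, bottom
```

Because the part templates are matched at full resolution inside the detection area, the resulting corners are not limited by the step size of the thinned image.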
- As described above, the method of the matching process is switched according to the distance of the object. When the object is at a short distance, the detection area contains many pixels, so the rough matching process is performed, in which template matching uses a thinning prediction area and a thinning template whose pixels have been thinned heavily. When the object is at a long distance, template matching using a thinning prediction area and a thinning template is first performed, as in the rough matching process, to identify the approximate position of the object, and the part matching process then performs template matching using the part templates.
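The switching between the two matching methods can be sketched as follows. The threshold of 30 m is a purely hypothetical value chosen for illustration; the text only speaks of a predetermined distance, and the function name `choose_matching` is likewise an assumption.

```python
def choose_matching(estimated_distance_m, threshold_m=30.0):
    """Select the matching method from the estimated object distance.

    Objects closer than the (hypothetical) threshold use rough matching;
    objects at or beyond it use part matching.
    """
    return "part" if estimated_distance_m >= threshold_m else "rough"
```

Because the distance is re-estimated for every frame, the same object can move from one method to the other as it approaches or recedes.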
- At a short distance, the attitude of the object changes more than at a long distance, so a part template might be created at a location that is not a part of the object that should be tracked.
- Therefore, at a short distance, the rough matching process, which improves the detection accuracy of the detection area through frame correction, is adopted instead of the part matching process.
- At a long distance, the parallax values of the detection area are often less stable than at a short distance, so the boundary of the detection area may also be unstable.
- Therefore, at a long distance, the part matching process, which improves the detection accuracy of the detection area through correction by template matching using the part templates, is adopted instead of the rough matching process.
- Selecting the matching method according to the distance in this way improves the accuracy of object detection.
- In particular, at a short distance, template matching is performed after thinning the pixels heavily, so the processing speed of the matching process can be improved. At a long distance, template matching using the part templates is performed after the approximate position has been identified, so the detection area can be determined accurately.
- In both the rough matching process and the part matching process, template matching is performed on thinned images, so processing speed can be improved. Furthermore, after a detection area has been detected by template matching on the thinned images, it is refined either by the frame correction of the correction processing unit 614 or by the correction through template matching with the part templates by the third template matching unit 617. Because these corrections determine the detection area accurately, tracking accuracy can be improved.
- Although the above embodiment describes tracking of a vehicle, the present invention is not limited to this; the tracking process is also effective for other objects whose shape changes little, as is the case with a vehicle.
- In the above embodiment, the cost value C is an evaluation value representing dissimilarity, but it may instead be an evaluation value representing similarity.
- In that case, the shift amount d at which the cost value C, which is then the similarity, is maximum (an extreme value) is the parallax value dp.
- In the above embodiment, the object recognition device 1 is mounted on the vehicle 70, but it is not limited to this. For example, it may be mounted on another type of vehicle such as a motorcycle, a bicycle, a wheelchair, or an agricultural cultivator.
- Furthermore, it may be mounted not only on a vehicle, as an example of a moving body, but also on a moving body such as a robot.
- In the above embodiment, when at least one of the functional units of the parallax value deriving unit 3 and the recognition processing unit 5 of the object recognition device 1 is realized by executing a program, the program is provided pre-installed in a ROM or the like. The program executed by the object recognition device 1 according to the above embodiment may also be provided as a file in an installable or executable format recorded on a computer-readable recording medium such as a CD-ROM, a flexible disk (FD), a CD-R (Compact Disc Recordable), or a DVD (Digital Versatile Disc).
- The program executed by the object recognition device 1 according to the above embodiment may be stored on a computer connected to a network such as the Internet and provided by being downloaded via the network. The program may also be provided or distributed via a network such as the Internet.
- The program executed by the object recognition device 1 according to the above embodiment has a module configuration including at least one of the functional units described above. As actual hardware, the CPU 52 (CPU 32) reads the program from the ROM 53 (ROM 33) and executes it, whereby the functional units are loaded into and generated on the main storage device (the RAM 54 (RAM 34) or the like).
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Multimedia (AREA)
- Mechanical Engineering (AREA)
- Image Analysis (AREA)
- Traffic Control Systems (AREA)
- Measurement Of Optical Distance (AREA)
Abstract
Description
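FIG. 1 is a diagram illustrating an example in which the device control system according to the embodiment is mounted on a vehicle. With reference to FIG. 1, the case where the device control system 60 of the present embodiment is mounted on a vehicle 70 is described as an example.
FIG. 2 is a diagram illustrating an example of the external appearance of the object recognition device according to the embodiment. As shown in FIG. 2, the object recognition device 1 includes, as described above, a main body 2 and imaging units 10a and 10b fixed to the main body 2. The imaging units 10a and 10b are a pair of cylindrical cameras arranged parallel to each other and at the same level with respect to the main body 2. For convenience of description, the imaging unit 10a shown in FIG. 2 may be referred to as the right camera and the imaging unit 10b as the left camera.
FIG. 3 is a diagram illustrating an example of the hardware configuration of the object recognition device according to the embodiment. The hardware configuration of the object recognition device 1 is described with reference to FIG. 3.
FIG. 4 is a diagram illustrating an example of the functional block configuration of the object recognition device according to the embodiment. First, the configuration and operation of the main functional blocks of the object recognition device 1 are described with reference to FIG. 4.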
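FIG. 5 is a diagram illustrating an example of the functional block configuration of the parallax value calculation processing unit of the object recognition device according to the embodiment. FIG. 6 is a diagram explaining the principle of deriving the distance from the imaging units to an object. FIG. 7 is an explanatory diagram for finding the corresponding pixel in the comparison image that corresponds to a reference pixel in the reference image. FIG. 8 is a diagram illustrating an example of a graph of the result of the block matching process.
With reference to FIG. 6, the principle of deriving the parallax of an object from a stereo camera by the stereo matching process and measuring the distance from the stereo camera to the object using the parallax value representing that parallax is described.
Next, a distance measurement method based on the block matching process is described with reference to FIGS. 7 and 8.
With reference to FIG. 5, the specific configuration and operation of the functional blocks of the parallax value calculation processing unit 300 are described.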
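FIG. 9 is a diagram illustrating an example of the functional block configuration of the recognition processing unit of the object recognition device according to the embodiment. FIG. 10 is a diagram illustrating an example of a V map generated from a parallax image. FIG. 11 is a diagram illustrating an example of a U map generated from a parallax image. FIG. 12 is a diagram illustrating an example of a real U map generated from a U map. FIG. 13 is a diagram explaining the process of creating a detection frame. The configuration and operation of the functional blocks of the recognition processing unit 5 are described with reference to FIGS. 9 to 13.
FIG. 14 is a diagram illustrating an example of the functional block configuration of the tracking processing unit of the recognition processing unit of the object recognition device according to the embodiment. The configuration and operation of the functional blocks of the tracking processing unit 520 of the recognition processing unit 5 are described with reference to FIG. 14.
Next, the specific operation of the object recognition device 1 is described with reference to FIGS. 15 to 28.
FIG. 15 is a flowchart illustrating an example of the operation of the block matching process of the parallax value deriving unit according to the embodiment. The flow of the block matching process of the parallax value deriving unit 3 of the object recognition device 1 is described with reference to FIG. 15.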
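The image acquisition unit 100b of the parallax value deriving unit 3 images the subject ahead with the left camera (imaging unit 10b), generates an analog image signal, and obtains a luminance image, which is an image based on that image signal. This yields the image signal to be processed by the subsequent image processing. The process then proceeds to step S2-1.
The image acquisition unit 100a of the parallax value deriving unit 3 images the subject ahead with the right camera (imaging unit 10a), generates an analog image signal, and obtains a luminance image, which is an image based on that image signal. This yields the image signal to be processed by the subsequent image processing. The process then proceeds to step S2-2.
The conversion unit 200b of the parallax value deriving unit 3 removes noise from the analog image signal obtained by imaging with the imaging unit 10b and converts it into digital image data. Converting to digital image data in this way enables pixel-by-pixel image processing of the image based on that data. The process then proceeds to step S3-1.
The conversion unit 200a of the parallax value deriving unit 3 removes noise from the analog image signal obtained by imaging with the imaging unit 10a and converts it into digital image data. Converting to digital image data in this way enables pixel-by-pixel image processing of the image based on that data. The process then proceeds to step S3-2.
The conversion unit 200b outputs the image based on the digital image data converted in step S2-1 as the comparison image Ib for the block matching process. This yields the image to be compared against when obtaining parallax values in the block matching process. The process then proceeds to step S4.
The conversion unit 200a outputs the image based on the digital image data converted in step S2-2 as the reference image Ia for the block matching process. This yields the image that serves as the reference when obtaining parallax values in the block matching process. The process then proceeds to step S4.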
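The cost calculation unit 301 of the parallax value calculation processing unit 300 of the parallax value deriving unit 3 calculates, and thereby acquires, the cost value C(p, d) of each candidate pixel q(x+d, y) for the corresponding pixel. The calculation is based on the luminance value of the reference pixel p(x, y) in the reference image Ia and on the luminance value of each candidate pixel q(x+d, y), which is identified by shifting by the shift amount d from the pixel corresponding to the position of the reference pixel p(x, y) along the epipolar line EL in the comparison image Ib determined by the reference pixel p(x, y). Specifically, by the block matching process, the cost calculation unit 301 calculates as the cost value C the dissimilarity between the reference region pb, a predetermined region centered on the reference pixel p of the reference image Ia, and the candidate region qb (of the same size as the reference region pb) centered on the candidate pixel q of the comparison image Ib. The process then proceeds to step S5.
The determination unit 302 of the parallax value calculation processing unit 300 of the parallax value deriving unit 3 determines the shift amount d corresponding to the minimum of the cost values C calculated by the cost calculation unit 301 as the parallax value dp for the pixel of the reference image Ia for which the cost values C were calculated. The first generation unit 303 of the parallax value calculation processing unit 300 of the parallax value deriving unit 3 then generates, based on the parallax values dp determined by the determination unit 302, a parallax image, which is an image in which the luminance value of each pixel of the reference image Ia is replaced by the parallax value dp corresponding to that pixel. The first generation unit 303 outputs the generated parallax image to the recognition processing unit 5.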
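The cost calculation and the selection of the parallax value for a single reference pixel can be sketched as follows. The sketch uses SAD as the dissimilarity, which is one common choice and an assumption here, since the text only says that a dissimilarity is used. The function name `disparity_at` and its parameters are hypothetical, and the caller is expected to keep the block inside both images.

```python
def disparity_at(ref, cmp_img, x, y, half, max_d):
    """Return the shift d in 0..max_d that minimizes the block cost at (x, y).

    ref, cmp_img: reference image Ia and comparison image Ib as 2-D lists.
    half:         half the block size, so blocks are (2*half+1) pixels square.
    The candidate pixel for shift d is q(x + d, y) in the comparison image.
    """
    def block(img, cx):
        return [img[y + j][cx + i]
                for j in range(-half, half + 1)
                for i in range(-half, half + 1)]

    pb = block(ref, x)  # reference region around p(x, y)
    best_d, best_cost = None, None
    for d in range(max_d + 1):
        if x + d + half >= len(cmp_img[0]):
            break  # candidate region would leave the comparison image
        qb = block(cmp_img, x + d)  # candidate region around q(x + d, y)
        cost = sum(abs(a - b) for a, b in zip(pb, qb))  # SAD as cost value C
        if best_cost is None or cost < best_cost:
            best_d, best_cost = d, cost
    return best_d
```

Repeating this for every pixel of the reference image and storing the selected shift in place of the luminance value yields the parallax image.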
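FIG. 16 is a flowchart illustrating an example of the tracking processing performed by the tracking processing unit of the recognition processing unit according to the embodiment. FIG. 17 is a diagram explaining the movement prediction operation. The flow of the tracking processing of the tracking processing unit 520 of the recognition processing unit 5 is described with reference to FIGS. 16 and 17.
The movement predicting unit 600 of the tracking processing unit 520 uses recognition region information, which includes the history of the movement and operating state so far of objects newly detected by the upstream clustering unit 510 as well as vehicle information, to identify, for each object tracked so far, a prediction region 800 in which the object is likely to be present in the current frame (reference image Ia), as shown in FIG. 17. The process then proceeds to step S12.
The matching unit 610 of the tracking processing unit 520 performs template matching within the prediction region 800, based on the similarity to the feature (template) obtained in the previous frame, and detects the object in the current frame. Details of the matching processing by the matching unit 610 are described later with reference to FIGS. 18, 21, and 27. The process then proceeds to step S13.
The checking unit 620 of the tracking processing unit 520 checks, on the basis of the size of the detection region of the object detected by the matching unit 610, whether that size corresponds to the size of the object targeted for tracking (for example, a vehicle). The process then proceeds to step S14.
The feature updating unit 630 of the tracking processing unit 520 creates, from the image of the detection region of the object detected in the current frame, the feature (template) to be used in the next frame for template matching by the first template matching unit 613, or by the second template matching unit 616 and the third template matching unit 617, and updates the stored feature. Details of the feature update processing by the feature updating unit 630 are described later with reference to FIGS. 19 and 24. The process then proceeds to step S15.
The state transition unit 640 of the tracking processing unit 520 changes the state of the object according to the state of the detection region of the object finally determined by the correction processing unit 614 or the third template matching unit 617. The state transition unit 640 outputs recognition region information reflecting the changed state of the object to the vehicle control device 6.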
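FIG. 18 is a flowchart illustrating an example of the matching processing within the tracking processing of the tracking processing unit according to the embodiment. The flow of the matching processing of the matching unit 610 of the tracking processing unit 520 is described with reference to FIG. 18.
The determining unit 611 of the matching unit 610 estimates the distance of the object corresponding to the recognition region information, on the basis of the recognition region information up to the previous frame, and determines whether the estimated distance is equal to or greater than a predetermined distance. If the estimated distance is a short distance below the predetermined distance (step S121: short distance), the process proceeds to step S122; if it is a long distance equal to or greater than the predetermined distance (step S121: long distance), the process proceeds to step S123.
The first thinning unit 612, the first template matching unit 613, and the correction processing unit 614 of the matching unit 610 perform rough matching processing using a template based on the detection region detected in the previous frame. Details of the rough matching processing are described later with reference to FIGS. 21 to 23. The matching processing then ends.
The third thinning unit 615, the second template matching unit 616, and the third template matching unit 617 of the matching unit 610 perform parts matching processing using a template based on the detection region detected in the previous frame. Details of the parts matching processing are described later with reference to FIGS. 27 and 28. The matching processing then ends.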
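The branch at step S121 amounts to a simple threshold test on the estimated distance. The sketch below only illustrates that branch; the function name `select_matching`, the metre unit, and any concrete threshold value are assumptions, since the text speaks only of a "predetermined distance".

```python
def select_matching(estimated_distance_m: float, threshold_m: float) -> str:
    """Mirror step S121: objects below the threshold use rough matching (S122),
    objects at or beyond it use parts matching (S123)."""
    return "parts" if estimated_distance_m >= threshold_m else "rough"
```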
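FIG. 19 is a flowchart illustrating an example of the feature update processing performed when rough matching is carried out in the tracking processing of the tracking processing unit according to the embodiment. FIG. 20 is a diagram explaining the thinning applied to the image of the detection region in that feature update processing. The flow of the feature update processing of the feature updating unit 630 when the matching unit 610 performs rough matching processing is described with reference to FIGS. 19 and 20. The feature update processing shown in FIG. 19 is the processing executed in step S14 when rough matching processing is performed in step S12 of FIG. 16.
The second thinning unit 631 of the feature updating unit 630 determines the thinning amount for creating a thinned template from the detection region of the object detected in the current frame by the rough matching processing of the matching unit 610. For example, suppose that the detection region 810 shown in FIG. 20(a) is the detection region of an object (vehicle) detected by the rough matching processing and measures Wd [pixels] wide and Hd [pixels] high. Suppose also that the thinned template 811 shown in FIG. 20(b) is the image after thinning by the second thinning unit 631 and measures Wd_s [pixels] wide and Hd_s [pixels] high. In this case, the second thinning unit 631 thins the detection region 810 so that the height Hd_s of the thinned template 811 equals a fixed value c [pixels] (< Hd) and the width-to-height ratio of the thinned template 811 matches that of the detection region 810. In other words, the thinning amount, namely the height Hd_s and width Wd_s of the thinned template 811, is calculated by (Equation 3) below.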
Hd_s = c
Wd_s = (Wd / Hd) × Hd_s
FH = Hd_s / Hd
FW = Wd_s / Wd (Equation 3)
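The second thinning unit 631 thins the detection region 810 according to the thinning amount determined (calculated) by (Equation 3) above and creates the thinned template 811. The process then proceeds to step S143.
The first updating unit 632 of the feature updating unit 630 replaces the thinned template used in the previous rough matching processing with the thinned template 811 created by the second thinning unit 631 (for example, by storing it in the RAM 54). The created thinned template 811 is used in the rough matching processing of the next frame. The first updating unit 632 also stores the ratios FH and FW of (Equation 3) above, calculated in step S141, in the RAM 54 or the like. These ratios FH and FW are used when thinning the image of the prediction region in the next frame (described later). The feature update processing then ends.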
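As a rough illustration of this feature update, the sketch below computes the thinning amount from the detection region size and a fixed template height c, then thins an image to that size. Rounding the width to whole pixels and thinning by nearest-neighbour sampling are assumptions made for the example; the text does not specify how pixels are dropped. The names `thinning_amount` and `thin` are hypothetical.

```python
from typing import List, Tuple


def thinning_amount(wd: int, hd: int, c: int) -> Tuple[int, int, float, float]:
    """Return (Wd_s, Hd_s, FW, FH) for a Wd x Hd detection region and fixed height c."""
    hd_s = c
    wd_s = max(1, round(wd / hd * hd_s))  # whole-pixel rounding is an assumption
    return wd_s, hd_s, wd_s / wd, hd_s / hd


def thin(image: List[List[int]], out_w: int, out_h: int) -> List[List[int]]:
    """Thin an image to out_w x out_h by nearest-neighbour sampling (one possible scheme)."""
    in_h, in_w = len(image), len(image[0])
    return [[image[y * in_h // out_h][x * in_w // out_w] for x in range(out_w)]
            for y in range(out_h)]
```

For a 120 x 80 pixel detection region and c = 20, this gives a 30 x 20 thinned template with FW = FH = 0.25, so the aspect ratio of the detection region is preserved.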
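FIG. 21 is a flowchart illustrating an example of the rough matching processing in the tracking processing of the tracking processing unit according to the embodiment. FIG. 22 is a diagram explaining the thinning applied to the image of the prediction region in the rough matching processing. FIG. 23 is a diagram explaining the frame correction processing in the rough matching processing. The flow of the rough matching processing of the matching unit 610 within the tracking processing is described with reference to FIGS. 21 to 23. The rough matching processing shown in FIG. 21 is the processing executed in step S122 of FIG. 18.
The first thinning unit 612 of the matching unit 610 thins the image of the prediction region 800 in the current frame predicted by the movement predicting unit 600 (see FIG. 22(a)) according to a predetermined thinning amount and obtains the thinned prediction region 801 shown in FIG. 22(b). Specifically, using the ratios FH and FW of (Equation 3) above that the first updating unit 632 stored in the RAM 54 or the like for the previous frame, the first thinning unit 612 thins the prediction region 800, of height Hp and width Wp, into the thinned prediction region 801, whose height Hp_s and width Wp_s are calculated by (Equation 4) below.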
Hp_s = FH × Hp
Wp_s = FW × Wp (Equation 4)
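The size of the thinned prediction region can be sketched as below. The example assumes that the stored ratio FH scales the height Hp and FW scales the width Wp, which is how the surrounding description reads; rounding to whole pixels is a further assumption, and the function name `thinned_prediction_size` is hypothetical.

```python
from typing import Tuple


def thinned_prediction_size(wp: int, hp: int, fw: float, fh: float) -> Tuple[int, int]:
    """Return (Wp_s, Hp_s): the prediction region size after applying the stored ratios."""
    return max(1, round(fw * wp)), max(1, round(fh * hp))
```

Because the same ratios are applied to the template and to the prediction region, objects keep the same relative scale in both, which is what allows the thinned template to be compared directly against the thinned prediction region.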
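The first template matching unit 613 of the matching unit 610 performs template matching in the current frame, within the thinned prediction region 801 produced by the first thinning unit 612, using the thinned template 811 updated by the first updating unit 632 for the previous frame. That is, the first template matching unit 613 detects, within the thinned prediction region 801, an image that matches, or can be regarded as matching, the thinned template 811. SAD (sum of absolute differences) or a similar measure can be used as the evaluation value indicating similarity to the thinned template 811 within the thinned prediction region 801. The first template matching unit 613 raster-scans the thinned prediction region 801 while calculating the SAD against the thinned template 811 and finds the position of the image with the smallest SAD. As the position of that image, it obtains, for example, the position (px_thin, py_thin) of its upper-left corner in the thinned prediction region 801.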
py = py_thin × FH (Equation 5)
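The raster scan with SAD described for the first template matching unit 613 can be sketched as follows. This is a plain exhaustive search written for clarity, with images as nested lists; the function name `match_template` is hypothetical, and ties are resolved in favour of the first position encountered, which the text does not specify.

```python
from typing import List, Optional, Tuple

Image = List[List[int]]


def match_template(region: Image, template: Image) -> Tuple[int, int, int]:
    """Raster-scan `region` and return (px_thin, py_thin, sad) of the best match."""
    rh, rw = len(region), len(region[0])
    th, tw = len(template), len(template[0])
    best: Optional[Tuple[int, int, int]] = None
    for py in range(rh - th + 1):          # top to bottom
        for px in range(rw - tw + 1):      # left to right
            sad = sum(abs(region[py + j][px + i] - template[j][i])
                      for j in range(th) for i in range(tw))
            if best is None or sad < best[2]:
                best = (px, py, sad)
    if best is None:
        raise ValueError("template is larger than the search region")
    return best
```

The cost of this search grows with the product of the region area and the template area, which is why thinning both images beforehand reduces the processing load.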
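The correction processing unit 614 of the matching unit 610 corrects, in the current frame, the frame (detection frame 820 shown in FIG. 23) of the image (detection region) of the object detected by the template matching of the first template matching unit 613. Specifically, as shown in FIG. 23, the correction processing unit 614 first takes the image on the disparity image (the disparity image corresponding to the current frame) that corresponds to the detection frame 820 detected by the first template matching unit 613, and creates a histogram 900 showing the frequency of pixels that have a disparity value along the X direction and a histogram 901 showing the frequency of pixels that have a disparity value along the Y direction. Then, as shown in FIG. 23, the correction processing unit 614 takes the X-direction positions at which the histogram 900 exceeds a threshold Th as the left and right edges of the corrected detection frame 821, and the Y-direction positions at which the histogram 901 exceeds the threshold Th as the top and bottom edges of the corrected detection frame 821. The threshold Th may be set, for example, to 10 to 20 [%] of the maximum value of the histogram. Although FIG. 23 uses the same threshold Th for the X and Y directions, the two thresholds need not be identical. The image of the detection frame 821 corrected in this way by the correction processing unit 614 becomes the detection region finally detected by the rough matching processing of the matching unit 610. The correction processing unit 614 then includes the information on the detection region of the detected object (its position relative to the frame, its size, and so on) in the recognition region information of that object.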
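The frame correction can be illustrated with the sketch below. It assumes that the input is the patch of the disparity image inside the detection frame 820, that a value of 0 marks a pixel without a disparity value, and that the threshold is 15% of each histogram's maximum (inside the 10 to 20% range mentioned above). The function name `correct_frame` is hypothetical.

```python
from typing import List, Optional, Tuple


def correct_frame(disparity: List[List[int]], ratio: float = 0.15
                  ) -> Optional[Tuple[int, int, int, int]]:
    """Return (left, top, right, bottom) of the corrected frame, inclusive, or None."""
    width = len(disparity[0])
    # Histogram 900: number of pixels with a disparity value in each column (X direction).
    hist_x = [sum(1 for row in disparity if row[x] > 0) for x in range(width)]
    # Histogram 901: number of pixels with a disparity value in each row (Y direction).
    hist_y = [sum(1 for v in row if v > 0) for row in disparity]
    if max(hist_x) == 0:
        return None  # no disparity values at all inside the frame
    th_x = ratio * max(hist_x)
    th_y = ratio * max(hist_y)
    xs = [x for x, n in enumerate(hist_x) if n > th_x]
    ys = [y for y, n in enumerate(hist_y) if n > th_y]
    return xs[0], ys[0], xs[-1], ys[-1]
```

Columns and rows that contain only a few stray disparity pixels fall below the threshold, so the corrected frame tightens around the part of the patch where the object actually has disparity.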
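FIG. 24 is a flowchart illustrating an example of the feature update processing performed when parts matching is carried out in the tracking processing of the tracking processing unit according to the embodiment. FIG. 25 is a flowchart illustrating an example of the parts template selection processing within that feature update processing. FIG. 26 is a diagram explaining the parts template selection processing. The flow of the feature update processing of the feature updating unit 630 when the matching unit 610 performs parts matching processing is described with reference to FIGS. 24 to 26. The feature update processing shown in FIG. 24 is the processing executed in step S14 when parts matching processing is performed in step S12 of FIG. 16.
The fourth thinning unit 633 of the feature updating unit 630 determines the thinning amount for creating a thinned template from the detection region of the object detected in the current frame by the parts matching processing of the matching unit 610. The fourth thinning unit 633 determines the thinning amount in the same way as the second thinning unit 631 does in step S141 of FIG. 19 described above. However, the feature update processing shown in FIG. 24 is executed when the determining unit 611 has determined that the object is at a long distance, and the object appears smaller in the frame than it would at a short distance, so the thinning amount is made smaller than that of the second thinning unit 631. For example, the height Hd_s of the thinned template is set to a fixed value smaller than the fixed value c. The process then proceeds to step S146.
The fourth thinning unit 633 thins the detection region detected in the current frame according to the thinning amount determined (calculated) in step S145 and creates a thinned template. The process then proceeds to step S147.
The second updating unit 634 of the feature updating unit 630 replaces the thinned template used in the previous parts matching processing with the thinned template created by the fourth thinning unit 633 (for example, by storing it in the RAM 54). The created thinned template is used in the parts matching processing of the next frame. The second updating unit 634 also stores, in the RAM 54 or the like, the ratios calculated in step S145 that correspond to the ratios FH and FW of (Equation 3) in step S141 of FIG. 19 described above. These ratios are used when thinning the image of the prediction region in the next frame (described later). The process then proceeds to step S148.
The parts template selecting unit 635 of the feature updating unit 630 selects, in the current frame, partial images (parts templates) that satisfy a predetermined condition from the image of the detection region of the object finally determined by the third template matching unit 617. The selection processing by the parts template selecting unit 635 is described in detail with reference to steps S1481 to S1488 of FIG. 25 and to FIG. 26.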
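In the current frame, the parts template selecting unit 635 creates temporary frames 840 and 841 at the upper-left corner and the lower-right corner, respectively, of the detection region 830 shown in FIG. 26(a), which was detected by the parts matching processing. The process then proceeds to step S1482.
The parts template selecting unit 635 counts the number of pixels that have a valid disparity value (hereinafter sometimes called the "number of disparity points") in the image on the disparity image (the disparity image corresponding to the current frame) that corresponds to the inside of the temporary frame 840 at the upper-left corner of the detection region 830. Here, a pixel with a valid disparity value means, for example, a pixel other than one that has no disparity value or one whose disparity value indicates a very distant point, that is, a very small disparity value. The process then proceeds to step S1483.
The parts template selecting unit 635 determines whether the ratio of the area occupied by the counted disparity-point pixels to the area inside the temporary frame 840 is equal to or greater than a predetermined threshold. If the ratio is equal to or greater than the threshold (step S1483: Yes), the process proceeds to step S1485; if it is below the threshold (step S1483: No), the process proceeds to step S1484. For example, the in-frame image 850, which is the image inside the temporary frame 840 in the state shown in FIG. 26(a), contains few valid disparity values. In other words, it contains only a small part of the object (vehicle), so it cannot support effective template matching in the parts matching processing described later. The processing of step S1484 described below is therefore required.
The parts template selecting unit 635 shifts the temporary frame 840 on the detection region 830 by a predetermined amount from its current position toward the inside of the detection region 830. For example, the parts template selecting unit 635 may shift the temporary frame 840 from its current position N pixels to the right and N pixels downward (N being a predetermined value), or may shift it by a predetermined amount toward the center of the detection region 830. The process then returns to step S1482.
The parts template selecting unit 635 counts the number of pixels that have a valid disparity value (the number of disparity points) in the image on the disparity image (the disparity image corresponding to the current frame) that corresponds to the inside of the temporary frame 841 at the lower-right corner of the detection region 830. The process then proceeds to step S1486.
The parts template selecting unit 635 determines whether the ratio of the area occupied by the counted disparity-point pixels to the area inside the temporary frame 841 is equal to or greater than a predetermined threshold. If the ratio is equal to or greater than the threshold (step S1486: Yes), the process proceeds to step S1488; if it is below the threshold (step S1486: No), the process proceeds to step S1487. For example, the in-frame image 851, which is the image inside the temporary frame 841 in the state shown in FIG. 26(a), contains many valid disparity values. In other words, it contains a large part of the object (vehicle), so it supports effective template matching in the parts matching processing described later.
The parts template selecting unit 635 shifts the temporary frame 841 on the detection region 830 by a predetermined amount from its current position toward the inside of the detection region 830. For example, the parts template selecting unit 635 may shift the temporary frame 841 from its current position N pixels to the left and N pixels upward (N being a predetermined value), or may shift it by a predetermined amount toward the center of the detection region 830. The process then returns to step S1485.
As a result of steps S1481 to S1487 described above, the parts template selecting unit 635 selects the images of the two temporary frames at their current positions in the detection region 830 as the parts templates. In the example of FIG. 26, the ratio of the area occupied by disparity-point pixels in the image inside the temporary frame 840 first created at the upper-left corner of the detection region 830 is below the predetermined threshold. The frame is therefore shifted toward the inside of the detection region 830 by the processing of step S1484, as shown in FIG. 26(b), until the ratio reaches the threshold, and the in-frame image 850a, which is the image of the resulting temporary frame 840a, is selected as a parts template. The ratio of the area occupied by disparity-point pixels in the image inside the temporary frame 841 first created at the lower-right corner of the detection region 830 is equal to or greater than the predetermined threshold, so the in-frame image 851, which is the image of the temporary frame 841, is selected as a parts template.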
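The selection loop of steps S1481 to S1488 can be sketched as follows. The example assumes square temporary frames of side `size`, a diagonal shift of `step` pixels per iteration (the "N pixels right and N pixels down" variant), and treats a pixel as valid when its disparity is at least `min_disp`. The bounds checks that stop the loops at the edge of the detection region are a safeguard added for the example, and all function and parameter names are hypothetical.

```python
from typing import List, Tuple


def valid_ratio(disparity: List[List[int]], x0: int, y0: int,
                size: int, min_disp: int) -> float:
    """Share of pixels in the size x size frame at (x0, y0) that have a valid disparity."""
    valid = sum(1 for y in range(y0, y0 + size) for x in range(x0, x0 + size)
                if disparity[y][x] >= min_disp)
    return valid / (size * size)


def select_part_frames(disparity: List[List[int]], size: int, step: int,
                       threshold: float, min_disp: int = 1
                       ) -> Tuple[Tuple[int, int], Tuple[int, int]]:
    """Return the top-left corners of the upper-left and lower-right temporary frames."""
    h, w = len(disparity), len(disparity[0])
    # Upper-left frame: shift right and down until enough valid disparity is inside.
    x, y = 0, 0
    while (valid_ratio(disparity, x, y, size, min_disp) < threshold
           and x + step + size <= w and y + step + size <= h):
        x, y = x + step, y + step
    upper_left = (x, y)
    # Lower-right frame: shift left and up under the same condition.
    x, y = w - size, h - size
    while (valid_ratio(disparity, x, y, size, min_disp) < threshold
           and x - step >= 0 and y - step >= 0):
        x, y = x - step, y - step
    return upper_left, (x, y)
```

Cropping the current frame at the two returned positions would give the two parts templates; in the FIG. 26 example, only the upper-left frame needs to move.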
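The third updating unit 636 of the feature updating unit 630 replaces the two parts templates used in the previous parts matching processing with the two parts templates selected by the parts template selecting unit 635 (for example, by storing them in the RAM 54). The selected parts templates are used in the parts matching processing of the next frame. The feature update processing then ends.
FIG. 27 is a flowchart illustrating an example of the parts matching processing in the tracking processing of the tracking processing unit according to the embodiment. FIG. 28 is a diagram explaining the parts matching processing. The flow of the parts matching processing of the matching unit 610 within the tracking processing is described with reference to FIGS. 27 and 28. The parts matching processing shown in FIG. 27 is the processing executed in step S123 of FIG. 18.
The third thinning unit 615 of the matching unit 610 thins the image of the prediction region 800 (see FIG. 28(b)) in the current frame (reference image Ia shown in FIG. 28(a)) predicted by the movement predicting unit 600 according to a predetermined thinning amount (for example, an amount smaller than the thinning amount of the first thinning unit 612) and obtains a thinned prediction region. Specifically, using the ratios that the second updating unit 634 stored in the RAM 54 or the like for the previous frame (see step S147 of FIG. 24), the third thinning unit 615 thins the prediction region in the same way as in step S1221 of FIG. 21 described above.
The second template matching unit 616 of the matching unit 610 performs template matching in the current frame, within the thinned prediction region produced by the third thinning unit 615, using the thinned template updated by the second updating unit 634 for the previous frame. That is, the second template matching unit 616 detects, within the thinned prediction region, an image that matches, or can be regarded as matching, the thinned template. SAD or a similar measure can be used as the evaluation value indicating similarity to the thinned template within the thinned prediction region. The second template matching unit 616 raster-scans the thinned prediction region while calculating the SAD against the thinned template and finds the position of the image with the smallest SAD. As the position of that image, it obtains, for example, the position of its upper-left corner in the thinned prediction region.
The third template matching unit 617 of the matching unit 610 performs template matching in the current frame, within the detection region 860 detected by the template matching of the second template matching unit 616, using the two parts templates updated by the third updating unit 636 for the previous frame (parts templates 870 and 871 shown in FIG. 28(d)). That is, the third template matching unit 617 detects, within the detection region 860, images that match, or can be regarded as matching (hereinafter simply "match"), the parts templates 870 and 871, respectively. The parts templates 870 and 871 correspond to the in-frame images 850a and 851 shown in FIG. 26(b) described above. SAD or a similar measure can be used as the evaluation value indicating similarity to the parts templates 870 and 871 within the detection region 860. The third template matching unit 617 raster-scans the detection region 860 while calculating the SAD against each of the parts templates 870 and 871 and finds the position of the image with the smallest SAD for each.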
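2 Main body
3 Disparity value deriving unit
4 Communication line
5 Recognition processing unit
6 Vehicle control device
7 Steering wheel
8 Brake pedal
10a, 10b Imaging unit
11a, 11b Imaging lens
12a, 12b Aperture
13a, 13b Image sensor
20a, 20b Signal converting unit
21a, 21b CDS
22a, 22b AGC
23a, 23b ADC
24a, 24b Frame memory
30 Image processing unit
31 FPGA
32 CPU
33 ROM
34 RAM
35 I/F
39 Bus line
51 FPGA
52 CPU
53 ROM
54 RAM
55 I/F
58 CAN I/F
59 Bus line
60 Device control system
70 Vehicle
100a, 100b Image acquiring unit
200a, 200b Converting unit
300 Disparity value calculation processing unit
301 Cost calculating unit
302 Determining unit
303 First generating unit
500 Second generating unit
510 Clustering unit
520 Tracking processing unit
600 Movement predicting unit
610 Matching unit
611 Determining unit
612 First thinning unit
613 First template matching unit
614 Correction processing unit
615 Third thinning unit
616 Second template matching unit
617 Third template matching unit
620 Checking unit
630 Feature updating unit
631 Second thinning unit
632 First updating unit
633 Fourth thinning unit
634 Second updating unit
635 Parts template selecting unit
636 Third updating unit
640 State transition unit
700 Road surface
700a Road surface portion
701 Utility pole
701a Utility pole portion
702 Car
702a Car portion
711 Left guardrail
711a, 711b Left guardrail portion
712 Right guardrail
712a, 712b Right guardrail portion
713 Car
713a, 713b Car portion
714 Car
714a, 714b Car portion
721 to 724 Detection region
721a to 724a Detection frame
800 Prediction region
801 Thinned prediction region
810 Detection region
811 Thinned template
820, 821 Detection frame
830 Detection region
840, 840a, 841 Temporary frame
850, 850a, 851 In-frame image
860 Detection region
870, 871 Parts template
900, 901 Histogram
B Baseline length
C Cost value
d Shift amount
dp Disparity value
E Object
EL Epipolar line
f Focal length
Ia Reference image
Ib Comparison image
Ip Disparity image
p Reference pixel
pb Reference region
q Candidate pixel
qb Candidate region
RM Real U map
S, Sa, Sb Point
Th Threshold
UM U map
VM V map
Z Distance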
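Claims (17)
- An image processing device comprising:
prediction means for predicting the position of an object in a current frame from the position of the object in a frame preceding the current frame, and identifying a prediction region;
determination means for determining, on the basis of the distance of the object in the preceding frame, whether the object is in a first distance range or in a second distance range farther than the first distance range;
first matching processing means for, when the determination means determines that the object is in the first distance range, performing template matching in the prediction region of the current frame using a first template for the object in the preceding frame, and detecting the object; and
second matching processing means for, when the determination means determines that the object is in the second distance range, performing template matching in the prediction region of the current frame using a second template for the object in the preceding frame, the second template being different from the first template, and detecting the object.
- The image processing device according to claim 1, wherein the first matching processing means performs template matching on a thinned prediction region obtained by thinning out pixels in the prediction region.
- The image processing device according to claim 2, wherein the first matching processing means detects the object by performing template matching on the thinned prediction region obtained by thinning out pixels in the prediction region and then correcting the detection error caused by the thinning of the pixels.
- The image processing device according to any one of claims 1 to 3, wherein the second matching processing means detects the object using a partial template of the object.
- The image processing device according to claim 4, wherein the second matching processing means detects the object by performing template matching on a thinned prediction region obtained by thinning out pixels in the prediction region and correcting the detection error caused by the thinning of the pixels using the partial template of the object.
- The image processing device according to claim 1, wherein
the determination means determines whether the object is in the first distance range or in the second distance range on the basis of a distance of the object based on distance information,
the first matching processing means, when the determination means determines that the object is in the first distance range, detects the object in the current frame by template matching using the first template for a first detection region of the object in the preceding frame, and corrects a second detection region of the detected object on the basis of distance information of the second detection region, and
the second matching processing means, when the determination means determines that the object is in the second distance range, detects the object in the current frame by template matching using the second template for a third detection region of the object in the preceding frame, and corrects the size of a fourth detection region of the detected object on the basis of a partial image of the third detection region.
- The image processing device according to claim 6, wherein the first matching processing means corrects the size of the second detection region on the basis of the frequency of the distance information corresponding to the second detection region.
- The image processing device according to claim 6 or 7, wherein the second matching processing means corrects the size of the fourth detection region on the basis of the position of the portion in the fourth detection region that matches the partial image.
- The image processing device according to any one of claims 6 to 8, wherein
the first matching processing means performs template matching using the first template within the prediction region, and
the second matching processing means performs template matching using the second template within the prediction region.
- The image processing device according to claim 9, wherein
the first matching processing means thins the prediction region by a first thinning amount and performs, within the thinned prediction region, template matching using the first template obtained by thinning the first detection region by the first thinning amount, and
the second matching processing means thins the prediction region by a second thinning amount and performs, within the thinned prediction region, template matching using the second template obtained by thinning the third detection region by the second thinning amount.
- The image processing device according to claim 10, wherein the first matching processing means thins the prediction region by the first thinning amount, which is larger than the second thinning amount.
- The image processing device according to claim 10 or 11, further comprising:
first thinning means for creating the first template by thinning the rectangular first detection region by the first thinning amount so that either its length in the height direction or its length in the width direction becomes a fixed length; and
second thinning means for creating the second template by thinning the rectangular third detection region by the second thinning amount so that either its length in the height direction or its length in the width direction becomes a fixed length.
- The image processing device according to any one of claims 6 to 12, further comprising selection means for placing a temporary frame at an edge of the third detection region of the object in the preceding frame and, while shifting the temporary frame toward the inside of the third detection region, selecting the temporary frame as the partial image when the disparity values contained in the temporary frame reach a predetermined ratio or more.
- An object recognition device comprising:
first imaging means for obtaining a first captured image by imaging a subject;
second imaging means, arranged at a position different from that of the first imaging means, for obtaining a second captured image by imaging the subject;
generation means for generating the distance information on the basis of disparity values obtained for the subject from the first captured image and the second captured image;
detection means for newly detecting an object on the basis of the first captured image or the second captured image, and the distance information; and
the image processing device according to any one of claims 1 to 13.
- A device control system comprising:
the object recognition device according to claim 14; and
a control device that controls a control target on the basis of information on the object detected by the object recognition device.
- An image processing method comprising:
a prediction step of predicting the position of an object in a current frame from the position of the object in a frame preceding the current frame, and identifying a prediction region;
a determination step of determining, on the basis of the distance of the object in the preceding frame, whether the object is in a first distance range or in a second distance range farther than the first distance range;
a first matching processing step of, when the object is determined to be in the first distance range, performing template matching in the prediction region of the current frame using a first template for the object in the preceding frame, and detecting the object; and
a second matching processing step of, when the object is determined to be in the second distance range, performing template matching in the prediction region of the current frame using a second template for the object in the preceding frame, the second template being different from the first template, and detecting the object.
- A program for causing a computer to function as:
prediction means for predicting the position of an object in a current frame from the position of the object in a frame preceding the current frame, and identifying a prediction region;
determination means for determining, on the basis of the distance of the object in the preceding frame, whether the object is in a first distance range or in a second distance range farther than the first distance range;
first matching processing means for, when the determination means determines that the object is in the first distance range, performing template matching in the prediction region of the current frame using a first template for the object in the preceding frame, and detecting the object; and
second matching processing means for, when the determination means determines that the object is in the second distance range, performing template matching in the prediction region of the current frame using a second template for the object in the preceding frame, the second template being different from the first template, and detecting the object.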
Priority Applications (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP16889946.6A EP3416132B1 (en) | 2016-02-08 | 2016-12-14 | Image processing device, object recognition device, device control system, and image processing method and program |
JP2017566537A JP6614247B2 (ja) | 2016-02-08 | 2016-12-14 | 画像処理装置、物体認識装置、機器制御システム、画像処理方法およびプログラム |
US16/046,162 US10776946B2 (en) | 2016-02-08 | 2018-07-26 | Image processing device, object recognizing device, device control system, moving object, image processing method, and computer-readable medium |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2016-021953 | 2016-02-08 | ||
JP2016021953 | 2016-02-08 |
Related Child Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US16/046,162 Continuation US10776946B2 (en) | 2016-02-08 | 2018-07-26 | Image processing device, object recognizing device, device control system, moving object, image processing method, and computer-readable medium |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2017138245A1 (ja) | 2017-08-17 |
Family
ID=59563277
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/JP2016/087158 WO2017138245A1 (ja) | 2016-02-08 | 2016-12-14 | 画像処理装置、物体認識装置、機器制御システム、画像処理方法およびプログラム |
Country Status (4)
Country | Link |
---|---|
US (1) | US10776946B2 (ja) |
EP (1) | EP3416132B1 (ja) |
JP (1) | JP6614247B2 (ja) |
WO (1) | WO2017138245A1 (ja) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110942668A (zh) * | 2018-09-21 | 2020-03-31 | 丰田自动车株式会社 | 图像处理系统、图像处理方法和图像处理设备 |
CN111127508A (zh) * | 2018-10-31 | 2020-05-08 | 杭州海康威视数字技术股份有限公司 | 一种基于视频的目标跟踪方法及装置 |
JPWO2019116708A1 (ja) * | 2017-12-12 | 2020-12-17 | ソニー株式会社 | 画像処理装置と画像処理方法およびプログラムと情報処理システム |
JP2022014135A (ja) * | 2020-07-06 | 2022-01-19 | 株式会社ハイシンク創研 | 物体認識装置及びこれを用いた物体搬送システム |
Families Citing this family (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP3352134B1 (en) * | 2015-09-15 | 2023-10-11 | Ricoh Company, Ltd. | Image processing device, object recognition device, device control system, image processing method, and program |
WO2017115732A1 (ja) * | 2015-12-28 | 2017-07-06 | 株式会社リコー | 画像処理装置、物体認識装置、機器制御システム、画像処理方法及び画像処理プログラム |
US11300663B2 (en) * | 2016-03-31 | 2022-04-12 | Nec Corporation | Method for predicting a motion of an object |
JP6810009B2 (ja) * | 2017-09-29 | 2021-01-06 | トヨタ自動車株式会社 | 視差算出装置 |
US11565698B2 (en) * | 2018-04-16 | 2023-01-31 | Mitsubishi Electric Corporation | Obstacle detection apparatus, automatic braking apparatus using obstacle detection apparatus, obstacle detection method, and automatic braking method using obstacle detection method |
JP2020190438A (ja) | 2019-05-20 | 2020-11-26 | 株式会社リコー | 計測装置および計測システム |
CN112257542B (zh) * | 2020-10-15 | 2024-03-15 | 东风汽车有限公司 | 障碍物感知方法、存储介质及电子设备 |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2008176504A (ja) * | 2007-01-17 | 2008-07-31 | Toshiba Corp | 物体検出装置及びその方法 |
JP2009122859A (ja) * | 2007-11-13 | 2009-06-04 | Toyota Motor Corp | 物体検出装置 |
JP2012059030A (ja) * | 2010-09-09 | 2012-03-22 | Optex Co Ltd | 距離画像カメラを用いた人体識別方法および人体識別装置 |
JP2012164275A (ja) | 2011-02-09 | 2012-08-30 | Toyota Motor Corp | 画像認識装置 |
Family Cites Families (17)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2005214914A (ja) | 2004-02-02 | 2005-08-11 | Fuji Heavy Ind Ltd | 移動速度検出装置および移動速度検出方法 |
JP2006012013A (ja) * | 2004-06-29 | 2006-01-12 | Seiwa Electric Mfg Co Ltd | 移動物体追跡装置 |
JP2008158640A (ja) | 2006-12-21 | 2008-07-10 | Fuji Heavy Ind Ltd | 移動物体検出装置 |
JP5664152B2 (ja) | 2009-12-25 | 2015-02-04 | 株式会社リコー | 撮像装置、車載用撮像システム及び物体識別装置 |
JP5675229B2 (ja) * | 2010-09-02 | 2015-02-25 | キヤノン株式会社 | 画像処理装置及び画像処理方法 |
JP5895734B2 (ja) * | 2011-09-05 | 2016-03-30 | コニカミノルタ株式会社 | 画像処理装置及び画像処理方法 |
JP5829980B2 (ja) | 2012-06-19 | 2015-12-09 | トヨタ自動車株式会社 | 路側物検出装置 |
JP6398347B2 (ja) | 2013-08-15 | 2018-10-03 | 株式会社リコー | 画像処理装置、認識対象物検出方法、認識対象物検出プログラム、および、移動体制御システム |
JP6519262B2 (ja) | 2014-04-10 | 2019-05-29 | 株式会社リコー | 立体物検出装置、立体物検出方法、立体物検出プログラム、及び移動体機器制御システム |
JP2016001464A (ja) | 2014-05-19 | 2016-01-07 | 株式会社リコー | 処理装置、処理システム、処理プログラム、及び、処理方法 |
JP2016001170A (ja) | 2014-05-19 | 2016-01-07 | 株式会社リコー | 処理装置、処理プログラム、及び、処理方法 |
JP6550881B2 (ja) | 2014-07-14 | 2019-07-31 | 株式会社リコー | 立体物検出装置、立体物検出方法、立体物検出プログラム、及び移動体機器制御システム |
US20160019429A1 (en) | 2014-07-17 | 2016-01-21 | Tomoko Ishigaki | Image processing apparatus, solid object detection method, solid object detection program, and moving object control system |
US10007998B2 (en) * | 2015-03-20 | 2018-06-26 | Ricoh Company, Ltd. | Image processor, apparatus, and control system for correction of stereo images |
JP6753134B2 (ja) | 2015-07-07 | 2020-09-09 | 株式会社リコー | 画像処理装置、撮像装置、移動体機器制御システム、画像処理方法、及び画像処理プログラム |
US9760791B2 (en) * | 2015-09-01 | 2017-09-12 | Sony Corporation | Method and system for object tracking |
EP3385904A4 (en) | 2015-11-30 | 2018-12-19 | Ricoh Company, Ltd. | Image processing device, object recognition device, device control system, image processing method, and program |
-
2016
- 2016-12-14 JP JP2017566537A patent/JP6614247B2/ja active Active
- 2016-12-14 EP EP16889946.6A patent/EP3416132B1/en active Active
- 2016-12-14 WO PCT/JP2016/087158 patent/WO2017138245A1/ja active Application Filing
-
2018
- 2018-07-26 US US16/046,162 patent/US10776946B2/en active Active
Non-Patent Citations (2)
Title |
---|
See also references of EP3416132A4 |
SHINICHI OKUSAKO ET AL.: "Human Tracking with a Mobile Robot using a Laser Range-Finder", JOURNAL OF THE ROBOTICS SOCIETY OF JAPAN, vol. 24, no. 5, 15 July 2006 (2006-07-15), pages 43 - 51, XP055548558 * |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPWO2019116708A1 (ja) * | 2017-12-12 | 2020-12-17 | ソニー株式会社 | 画像処理装置と画像処理方法およびプログラムと情報処理システム |
JP7136123B2 (ja) | 2017-12-12 | 2022-09-13 | ソニーグループ株式会社 | 画像処理装置と画像処理方法およびプログラムと情報処理システム |
CN110942668A (zh) * | 2018-09-21 | 2020-03-31 | 丰田自动车株式会社 | 图像处理系统、图像处理方法和图像处理设备 |
CN111127508A (zh) * | 2018-10-31 | 2020-05-08 | 杭州海康威视数字技术股份有限公司 | 一种基于视频的目标跟踪方法及装置 |
CN111127508B (zh) * | 2018-10-31 | 2023-05-02 | 杭州海康威视数字技术股份有限公司 | 一种基于视频的目标跟踪方法及装置 |
JP2022014135A (ja) * | 2020-07-06 | 2022-01-19 | 株式会社ハイシンク創研 | 物体認識装置及びこれを用いた物体搬送システム |
JP7333546B2 (ja) | 2020-07-06 | 2023-08-25 | 株式会社ハイシンク創研 | 物体認識装置及びこれを用いた物体搬送システム |
Also Published As
Publication number | Publication date |
---|---|
JPWO2017138245A1 (ja) | 2018-09-27 |
EP3416132A1 (en) | 2018-12-19 |
US10776946B2 (en) | 2020-09-15 |
JP6614247B2 (ja) | 2019-12-04 |
EP3416132B1 (en) | 2021-06-16 |
US20180336701A1 (en) | 2018-11-22 |
EP3416132A4 (en) | 2019-01-09 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2017138245A1 (ja) | 画像処理装置、物体認識装置、機器制御システム、画像処理方法およびプログラム | |
JP6795027B2 (ja) | 情報処理装置、物体認識装置、機器制御システム、移動体、画像処理方法およびプログラム | |
JP6597792B2 (ja) | 画像処理装置、物体認識装置、機器制御システム、画像処理方法およびプログラム | |
JP6597795B2 (ja) | 画像処理装置、物体認識装置、機器制御システム、画像処理方法およびプログラム | |
JP6565188B2 (ja) | 視差値導出装置、機器制御システム、移動体、ロボット、視差値導出方法、およびプログラム | |
JP7206583B2 (ja) | 情報処理装置、撮像装置、機器制御システム、移動体、情報処理方法およびプログラム | |
JP6561512B2 (ja) | 視差値導出装置、移動体、ロボット、視差値導出方法、視差値生産方法及びプログラム | |
JP2017151535A (ja) | 画像処理装置、物体認識装置、機器制御システム、画像処理方法およびプログラム | |
JP6547841B2 (ja) | 画像処理装置、物体認識装置、機器制御システム、画像処理方法およびプログラム | |
JP6516012B2 (ja) | 画像処理装置、物体認識装置、機器制御システム、画像処理方法およびプログラム | |
CN111971682A (zh) | 路面检测装置、利用了路面检测装置的图像显示装置、利用了路面检测装置的障碍物检测装置、路面检测方法、利用了路面检测方法的图像显示方法以及利用了路面检测方法的障碍物检测方法 | |
JP2016152027A (ja) | 画像処理装置、画像処理方法およびプログラム | |
JP6572696B2 (ja) | 画像処理装置、物体認識装置、機器制御システム、画像処理方法およびプログラム | |
JP6543935B2 (ja) | 視差値導出装置、機器制御システム、移動体、ロボット、視差値導出方法、およびプログラム | |
JP6992356B2 (ja) | 情報処理装置、撮像装置、機器制御システム、移動体、情報処理方法およびプログラム | |
JP2017167970A (ja) | 画像処理装置、物体認識装置、機器制御システム、画像処理方法およびプログラム | |
JP2017027578A (ja) | 検出装置、視差値導出装置、物体認識装置、機器制御システム、検出方法、およびプログラム | |
WO2018097269A1 (en) | Information processing device, imaging device, equipment control system, mobile object, information processing method, and computer-readable recording medium | |
EP2919191B1 (en) | Disparity value deriving device, equipment control system, movable apparatus, robot, and disparity value producing method | |
JP2019160251A (ja) | 画像処理装置、物体認識装置、機器制御システム、移動体、画像処理方法およびプログラム | |
JP6701738B2 (ja) | 視差値導出装置、視差値導出方法及びプログラム | |
JP2018045328A (ja) | 画像処理装置、物体認識装置、機器制御システム、画像処理方法およびプログラム |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 16889946 Country of ref document: EP Kind code of ref document: A1 |
|
ENP | Entry into the national phase |
Ref document number: 2017566537 Country of ref document: JP Kind code of ref document: A |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
WWE | Wipo information: entry into national phase |
Ref document number: 2016889946 Country of ref document: EP |
|
ENP | Entry into the national phase |
Ref document number: 2016889946 Country of ref document: EP Effective date: 20180910 |