WO2014084218A1 - Subject detection device - Google Patents
Subject detection device
- Publication number
- WO2014084218A1 (PCT/JP2013/081808)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- image
- identification
- learning
- target area
- identification target
- Prior art date
Classifications
-
- G—PHYSICS
- G08—SIGNALLING
- G08G—TRAFFIC CONTROL SYSTEMS
- G08G1/00—Traffic control systems for road vehicles
- G08G1/16—Anti-collision systems
- G08G1/166—Anti-collision systems for active traffic, e.g. moving vehicles, pedestrians, bikes
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/50—Context or environment of the image
- G06V20/56—Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
- G06V20/58—Recognition of moving objects or obstacles, e.g. vehicles or pedestrians; Recognition of traffic objects, e.g. traffic signs, traffic lights or roads
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/103—Static body considered as a whole, e.g. static pedestrian or occupant recognition
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N7/00—Television systems
- H04N7/18—Closed-circuit television [CCTV] systems, i.e. systems in which the video signal is not broadcast
Definitions
- the present invention relates to an object detection device that detects an object based on a captured image acquired from an imaging means.
- conventionally, various methods have been proposed in which an image is picked up using an image pickup means such as a camera and an object is detected based on the obtained picked-up image.
- for example, a technique has been described for detecting a pedestrian appearing in a captured image using a feature amount (the so-called HOG feature amount) based on the intensity and gradient of luminance in local regions of the image.
- by mounting such an imaging means and a classifier capable of identifying objects from the captured image on a vehicle, a vehicle periphery monitoring device is constructed that detects objects outside the vehicle and monitors the relative positional relationship between each object and the host vehicle. Introducing this device assists the vehicle occupants (especially the driver).
- the above discriminator can discriminate not only pedestrians (human bodies) but also animals, artificial structures, etc. by performing various machine learning.
- however, when extracting an image area that at least contains such an object (hereinafter referred to as an identification target area), the identification accuracy may differ depending on the type of the object. This is probably because the shape of the projected image varies with the type of object, so the proportion of background image information captured in the area changes.
- the image information of the background portion acts as a disturbance factor (noise information) that lowers the learning / identification accuracy of the object during the learning process or the identification process.
- in particular, since this type of device captures images continuously while the vehicle is running, the scene (background, weather, etc.) of the captured image obtained from the camera can change from moment to moment.
- road surface patterns such as pedestrian crossings and guardrails are often included in images of identification objects (pedestrians and the like) acquired when a vehicle travels on a road surface. For this reason, as a result of causing the classifier to perform machine learning, there is a high possibility that the road surface pattern is erroneously learned as a pedestrian. That is, it is difficult to predict the image feature in the background portion, and there is a problem that the influence as a disturbance factor cannot be ignored.
- the present invention has been made to solve the above-described problems, and an object of the present invention is to provide an object detection apparatus that can improve learning / identification accuracy regardless of the type of the object.
- an object detection apparatus according to the present invention includes an imaging means that acquires a captured image, an identification target area extraction means that extracts an identification target area from the captured image acquired by the imaging means, and an object identification means that identifies, for each type of object, whether or not an object exists in the identification target area from the image feature amount extracted in the identification target area by the extraction means. The object identification means is a classifier generated using machine learning that receives a feature data group as the image feature amount and outputs presence/absence information of the object. The classifier creates and receives the feature data group from images of at least one sub-region selected according to the type of object from among the plurality of sub-regions constituting each learning sample image used for the machine learning.
- in this way, the classifier serving as the object identification means creates and receives the feature data group from images of at least one sub-region selected according to the type of object from among the plurality of sub-regions constituting each learning sample image used for machine learning. Image information of the sub-regions suited to the shape of the projected image of the object can therefore be selectively employed in the learning process, and the learning accuracy can be improved regardless of the type of object. Furthermore, by excluding the remaining sub-regions from the learning process, over-learning of image information other than the projected image of the object, which acts as a disturbance factor during the identification process, can be prevented, and the identification accuracy of the object can be improved.
- it is also preferable that the object identification means identifies whether or not the object exists in the identification target area for each moving direction of the object. Based on the tendency of the image shape on the captured image to change according to the moving direction, performing identification for each moving direction further improves the learning/identification accuracy of the object.
- the identification target area extracting unit extracts the identification target area having a size corresponding to a distance from the imaging unit to the target. For example, by making the relative magnitude relationship between the target object and the identification target area constant or substantially constant regardless of the distance, the influence of disturbance factors (image information other than the projected image of the target object) can be suppressed uniformly. As a result, the learning / identification accuracy of the object is further improved.
- the image feature amount preferably includes a luminance gradient direction histogram in space. Because the fluctuation in the luminance gradient direction due to the exposure amount of imaging is small, the characteristics of the target object can be grasped accurately, and stable identification accuracy is obtained even in outdoor environments where the intensity of ambient light changes from moment to moment.
- the image feature amount may also include a luminance gradient direction histogram in time and space. By considering the luminance gradient direction in time as well as in space, it becomes easier to detect and track an object over a plurality of captured images acquired in time series.
- the imaging unit is mounted on a moving body and acquires the captured image by capturing an image while the moving body is moving. Since the scene of the captured image obtained from the imaging means mounted on the moving body changes every moment, it is particularly effective.
- according to the object detection device of the present invention, the classifier serving as the object identification means creates and receives the feature data group from images of at least one sub-region selected according to the type of object from among the plurality of sub-regions constituting each learning sample image provided for machine learning. Image information of the sub-regions suited to the shape of the projected image of the object can therefore be selectively employed in the learning process, and the learning accuracy can be improved regardless of the type of object. By excluding the remaining sub-regions from the learning process, over-learning of image information other than the projected image of the object, which acts as a disturbance factor during the identification process, can be prevented, and the identification accuracy of the object can be improved. This is especially effective because the scene of the captured image obtained from an imaging means mounted on a moving body changes from moment to moment.
- FIGS. 3A and 3B are image diagrams showing an example of a captured image acquired by imaging with the camera.
- FIG. 4 is a flowchart used for describing the learning process performed by the discriminator.
- FIG. 5 is an image diagram showing an example of learning sample images containing crossing pedestrians.
- a schematic explanatory drawing illustrates the method of defining each sub-region.
- FIG. 8A is a schematic diagram showing a typical contour image obtained from a large number of learning sample images including crossing pedestrians.
- FIG. 8B is a schematic explanatory diagram illustrating the non-mask area and the mask area common to the learning sample images including a crossing pedestrian.
- FIG. 9A is a schematic diagram showing a typical contour image obtained from a large number of learning sample images including facing pedestrians.
- FIG. 9B is a schematic explanatory diagram showing the non-mask area and the mask area common to the learning sample images including a face-to-face pedestrian.
- a flowchart is provided for explaining the operation of the ECU 22 shown in FIG. 1.
- FIG. 11 is a schematic explanatory drawing showing the positional relationship between the vehicle, the camera, and the human body.
- FIG. 12 is a schematic explanatory drawing regarding the method of determining the identification target area.
- FIG. 15A is a schematic explanatory diagram relating to the method for calculating the HOG feature amount.
- FIG. 15B is a schematic explanatory diagram relating to the method for calculating the STHOG feature amount.
- a schematic diagram illustrates the principle of accuracy maintenance in the detection process using the STHOG feature amount.
- FIG. 1 is a block diagram illustrating a configuration of a vehicle periphery monitoring device 10 as an object detection device according to the present embodiment.
- FIG. 2 is a schematic perspective view of the vehicle 12 on which the vehicle periphery monitoring device 10 shown in FIG. 1 is mounted.
- the vehicle periphery monitoring apparatus 10 includes a color camera (hereinafter simply referred to as the "camera 14") that captures a color image (hereinafter referred to as a captured image Im) composed of a plurality of color channels, a vehicle speed sensor 16 that detects the vehicle speed Vs of the vehicle 12, a yaw rate sensor 18 that detects the yaw rate Yr of the vehicle 12, a brake sensor 20 that detects the driver's brake pedal operation amount Br, an electronic control unit (hereinafter referred to as the "ECU 22") that controls the vehicle periphery monitoring device 10, a speaker 24 for issuing audible alarms and the like, and a display device 26 that displays the captured image output from the camera 14.
- the camera 14 is a camera that mainly uses light having a wavelength in the visible light region, and functions as an imaging unit that images the periphery of the vehicle 12.
- the camera 14 has a characteristic that the output signal level increases as the amount of light reflected on the surface of the subject increases, and the luminance (for example, RGB value) of the image increases.
- the camera 14 is fixedly disposed (mounted) at a substantially central portion of the front bumper portion of the vehicle 12.
- the imaging means for imaging the periphery of the vehicle 12 is not limited to the above configuration example (so-called monocular camera), and may be, for example, a compound eye camera (stereo camera). Further, an infrared camera may be used instead of the color camera, or both may be provided. Further, in the case of a monocular camera, another ranging means (radar apparatus) may be provided.
- the speaker 24 outputs an alarm sound or the like in response to a command from the ECU 22.
- the speaker 24 is provided on a dashboard (not shown) of the vehicle 12.
- an audio output function provided in another device (for example, an audio device or a navigation device) may be used.
- the display device 26 (see FIGS. 1 and 2) is a HUD (head-up display) arranged on the front windshield of the vehicle 12 at a position that does not obstruct the driver's front view.
- the display device 26 is not limited to a HUD; a display that shows the map of a navigation system mounted on the vehicle 12, or a display provided in the meter unit that shows fuel consumption and the like (MID; multi-information display), can also be used.
- the ECU 22 basically includes an input / output unit 28, a calculation unit 30, a display control unit 32, and a storage unit 34.
- Each signal from the camera 14, the vehicle speed sensor 16, the yaw rate sensor 18, and the brake sensor 20 is input to the ECU 22 side via the input / output unit 28.
- Each signal from the ECU 22 is output to the speaker 24 and the display device 26 via the input / output unit 28.
- the input / output unit 28 includes an A / D conversion circuit (not shown) that converts an input analog signal into a digital signal.
- the calculation unit 30 performs calculations based on the signals from the camera 14, the vehicle speed sensor 16, the yaw rate sensor 18, and the brake sensor 20, and generates signals for the speaker 24 and the display device 26 based on the calculation results.
- the calculation unit 30 functions as a distance estimation unit 40, an identification target region determination unit 42 (identification target region extraction unit), an object identification unit 44 (subject identification unit), and an object detection unit 46.
- the target object identification unit 44 is configured by a classifier 50 generated using machine learning that receives a feature data group as an image feature amount and outputs the presence / absence information of the target object.
- each unit in the calculation unit 30 is realized by reading and executing a program stored in the storage unit 34.
- the program may be supplied from the outside via a wireless communication device (mobile phone, smartphone, etc.) not shown.
- the display control unit 32 is a control circuit that drives and controls the display device 26.
- the display control unit 32 drives the display device 26 by outputting a signal used for display control to the display device 26 via the input / output unit 28. Thereby, the display apparatus 26 can display various images (captured image Im, a mark, etc.).
- the storage unit 34 is a computer-readable and non-transitory storage medium.
- the storage unit 34 includes a RAM (Random Access Memory) that stores the imaging signal converted into a digital signal and temporary data used for various arithmetic processes, and a ROM (Read Only Memory) that stores execution programs, tables, maps, and the like.
- the vehicle periphery monitoring apparatus 10 is basically configured as described above. An outline of the operation of the vehicle periphery monitoring device 10 will be described below.
- the ECU 22 converts the analog video signal output from the camera 14 into a digital signal at a predetermined frame clock interval/cycle (for example, 30 frames per second) and temporarily stores it in the storage unit 34.
- in the captured image Im shown in FIG. 3A, there exist a road area on which the vehicle 12 travels (hereinafter simply referred to as the "road 60"), a plurality of utility pole areas (hereinafter simply referred to as the "utility poles 62"), and a crossing pedestrian area on the road 60 (hereinafter referred to as the "crossing pedestrian 64").
- the relative positional relationship between the camera 14 and each object changes every moment. Thereby, even if the first and second frames (captured image Im) are in the same field angle range, each object is imaged in a different form (shape, size, or color).
- the ECU 22 performs various arithmetic processing on the captured image Im (the image ahead of the vehicle 12) read from the memory. The ECU 22 comprehensively considers the processing results for the captured image Im and, as necessary, the signals indicating the traveling state of the vehicle 12 (vehicle speed Vs, yaw rate Yr, and operation amount Br).
- the ECU 22 controls each output unit of the vehicle periphery monitoring device 10 in order to call the driver's attention. For example, the ECU 22 outputs an alarm sound (for example, a beeping sound) via the speaker 24 and emphasizes the part of the monitored object in the captured image Im visualized on the display device 26. Display.
- in step S11, learning data used for machine learning is collected.
- the learning data is a data set of a learning sample image including (or not including) the target object and the type of the target object (including the attribute “no target object”).
- types of objects include human bodies, various animals (specifically, mammals such as deer, horses, sheep, dogs, and cats, as well as birds), and artificial structures (specifically, vehicles, signs, utility poles, guardrails, walls, etc.).
- FIG. 5 is an image diagram showing an example of learning sample images 74 and 76 including crossing pedestrians 70 and 72.
- a projected image of a human body walking across from the left side toward the right side (hereinafter referred to as the crossing pedestrian 70) appears in the learning sample image 74.
- a projected image of a human body walking across from the right front toward the left rear (hereinafter referred to as the crossing pedestrian 72) appears at the approximate center of another learning sample image 76.
- Each of the learning sample images 74 and 76 in this example corresponds to a correct image including the object.
- Each learning sample image collected may include an incorrect image that does not include an object.
- the learning sample images 74 and 76 have an image region 80 whose shapes match or are similar to each other.
- in step S12, a plurality of sub-regions 82 are defined by dividing the image region 80 of each of the learning sample images 74 and 76.
- the rectangular image region 80 is equally divided into a lattice of eight rows and six columns regardless of its size; that is, 48 sub-regions 82 of identical shape are defined in each image region 80 (see the sketch below).
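- the patent does not prescribe any particular implementation for this division; the following is a minimal Python sketch (the function name, the use of NumPy, and the 48 × 36 pixel patch in the example are illustrative assumptions, the patch size being chosen only so that each sub-region matches the 6 × 6 pixel block size described later).

```python
import numpy as np

def split_into_subregions(image_region, rows=8, cols=6):
    """Divide an image region equally into a rows x cols lattice.

    image_region: 2-D NumPy array (a grayscale patch such as image region 80).
    Returns a list of (row_index, col_index, sub_image) tuples; with the
    default 8 x 6 grid this yields the 48 sub-regions described in the text.
    """
    h, w = image_region.shape
    sub_h, sub_w = h // rows, w // cols
    subregions = []
    for r in range(rows):
        for c in range(cols):
            sub = image_region[r * sub_h:(r + 1) * sub_h,
                               c * sub_w:(c + 1) * sub_w]
            subregions.append((r, c, sub))
    return subregions

# Example: a 48 x 36 pixel patch yields 48 sub-regions of 6 x 6 pixels each.
patch = np.zeros((48, 36), dtype=np.uint8)
assert len(split_into_subregions(patch)) == 48
```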
- in step S13, the learning architecture of the classifier 50 is constructed.
- examples of the learning architecture include boosting methods, SVM (Support Vector Machine), neural networks, and the EM (Expectation Maximization) algorithm.
- here, AdaBoost, which is a kind of boosting method, is applied.
- the discriminator 50 includes N (where N is a natural number of 2 or more) feature data generators 90, N weak learners 92, a weight updater 93, a weighting calculator 94, and a sample weight updater 95.
- in the figure, the feature data generators 90 are labeled, from the top, the first data generator, the second data generator, the third data generator, ..., and the N-th data generator. Similarly, the weak learners 92 are labeled, from the top, the first weak learner, the second weak learner, the third weak learner, ..., and the N-th weak learner.
- by pairing one feature data generator 90 (for example, the first data generator) with one weak learner 92 (for example, the first weak learner), N subsystems are constructed.
- the output side of the N subsystems (each weak learner 92) is connected to the input side of the weight updater 93, and the weighting calculator 94 and the sample weight updater 95 are connected in series to its output side.
- the detailed operation of the classifier 50 during machine learning will be described later.
- in step S14, the mask conditions for the sub-regions 82 are determined for each type of object.
- the mask condition means a selection condition as to whether or not to adopt the image of each sub-region 82 when N pieces of feature data (hereinafter referred to as a feature data group) are generated from one learning sample image 74.
- the background portion 78 excluding the crossing pedestrian 70 shows three other human bodies, a road surface, a building wall, and the like.
- the image information of the background part 78 acts as a disturbance factor (noise information) that reduces the learning / identification accuracy of the crossing pedestrian 70 as the object. Therefore, it is effective to select only the sub-region 82 suitable for the identification process from among all 48 sub-regions 82 and to learn using the created feature data group.
- FIG. 8A is a schematic diagram showing a typical contour image 100 obtained from a large number of learning sample images 74 and 76 including crossing pedestrians 70 and 72.
- first, a contour extraction image (not shown), in which the contour of each object is extracted, is created for each learning sample image 74 and the like by performing known edge extraction processing using an image processing device (not shown).
- each contour extraction image represents a part from which the contour of the object has been extracted in white, and represents a part from which the outline has not been extracted in black.
- from these contour extraction images, a contour image typical of the learning sample images 74 and the like (the typical contour image 100) is obtained.
- the typical contour image 100 represents a part from which the contour of the object has been extracted in white, and represents a part from which the contour has not been extracted in black. That is, the typical contour image 100 corresponds to an image representing the contour of an object (crossing pedestrians 70 and 72) included in common in a large number of learning sample images 74 and the like.
- a sub-region 82 whose contour feature amount exceeds a predetermined threshold is adopted as a calculation target, and a sub-region 82 whose contour feature amount falls below a predetermined threshold is excluded from the calculation target.
- as shown in FIG. 8B, suppose that a set of 20 sub-regions 82, out of the 48 sub-regions 82 defined in the image region 80, is determined as the non-mask area 102, and that the remaining 28 sub-regions 82 (the regions filled in white) are determined as the mask region 104 (see the sketch below).
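- the patent specifies only that a contour feature amount is computed for each sub-region of the typical contour image and compared against a predetermined threshold; a minimal sketch under that reading follows. The use of the mean contour-pixel fraction as the contour feature amount, the threshold value, and the zero-fill helper (one of the two options described later in step S15) are assumptions, not the patented method itself.

```python
import numpy as np

def determine_non_mask_regions(typical_contour, rows=8, cols=6, threshold=0.15):
    """Select sub-regions whose contour content exceeds a threshold.

    typical_contour: 2-D array where contour pixels are 1 (white) and
    non-contour pixels are 0 (black), aggregated over many learning
    sample images (the "typical contour image").
    threshold: assumed cut-off on the fraction of contour pixels per
    sub-region; the text only says "a predetermined threshold".
    Returns a boolean (rows x cols) grid: True = non-mask, False = mask.
    """
    h, w = typical_contour.shape
    sub_h, sub_w = h // rows, w // cols
    keep = np.zeros((rows, cols), dtype=bool)
    for r in range(rows):
        for c in range(cols):
            block = typical_contour[r * sub_h:(r + 1) * sub_h,
                                    c * sub_w:(c + 1) * sub_w]
            keep[r, c] = block.mean() > threshold   # contour feature amount
    return keep

def apply_mask(patch, keep, rows=8, cols=6, fill_value=0):
    """Zero out masked sub-regions before feature extraction (the
    replace-with-a-predetermined-value option mentioned later)."""
    h, w = patch.shape
    sub_h, sub_w = h // rows, w // cols
    masked = patch.copy()
    for r in range(rows):
        for c in range(cols):
            if not keep[r, c]:
                masked[r * sub_h:(r + 1) * sub_h,
                       c * sub_w:(c + 1) * sub_w] = fill_value
    return masked
```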
- FIG. 9A is a schematic diagram showing a typical contour image 106 obtained from a large number of learning sample images including facing pedestrians.
- the method for creating the typical contour image 106 is the same as that for the typical contour image 100 in FIG. 8A, and a description thereof is omitted.
- FIG. 9B is a schematic explanatory diagram showing a non-mask area 108 and a mask area 110 that are common to each learning sample image including a face-to-face pedestrian. Since the determination method of the mask area 110 is the same as that of the mask area 104 in FIG. 8B, the description thereof is omitted.
- here, although the object is the same (a pedestrian), the mask areas 104 and 110 are different. More specifically, they differ in whether or not the four hatched sub-regions 82 (see FIG. 9B) are mask targets. This is because the degree of change in image shape caused by the walking motion (the swinging of the arms and legs) varies with the moving direction of the pedestrian. In this way, learning may be performed for each moving direction of the object, based on the tendency of the image shape to change according to the moving direction.
- the moving direction may be any of a transverse direction (more specifically, right direction and left direction), a facing direction (more specifically, near side direction and back direction), and an oblique direction with respect to the image plane.
- in step S15, machine learning is performed by sequentially inputting the large amount of learning data collected in step S11 to the discriminator 50.
- the discriminator 50 inputs the learning sample image 74 including the crossing pedestrian 70 among the collected learning data to the feature data generator 90 side.
- each feature data generator 90 creates each feature data (collectively, feature data group) by performing specific arithmetic processing on the learning sample image 74 according to the mask condition determined in step S14.
- for the mask area 104 (see FIG. 8B), the calculation may be performed without using the values of the pixels belonging to it, or those pixel values may be replaced with a predetermined value (for example, 0) before the calculation, so that the corresponding feature data are substantially invalidated.
- each weak learner 92 (the i-th weak learner; 1 ≦ i ≦ N) obtains an output result (the i-th output result) by performing a predetermined calculation on the feature data (the i-th feature data) acquired from the corresponding feature data generator 90 (the i-th data generator).
- the weight updater 93 receives the first to N-th output results acquired from the weak learners 92 and the object information 96, which is the presence/absence information of the object in the collected learning data.
- the object information 96 indicates that the crossing pedestrian 70 is included in the learning sample image 74.
- the weight updater 93 selects the one weak learner 92 whose output result has the smallest error relative to the output value corresponding to the object information 96, and determines the update amount Δα so that the weighting coefficient α for that learner increases.
- the weighting calculator 94 updates the weighting coefficient ⁇ by adding the update amount ⁇ supplied from the weight updater 93.
- the sample weight updater 95 updates the weight previously assigned to each learning sample image 74 and the like (hereinafter referred to as the sample weight 97) based on the updated weighting coefficient α and the like.
- in this way, learning data are sequentially input, and the weighting coefficient α and the sample weights 97 are repeatedly updated; the discriminator 50 thus performs machine learning until a convergence condition is satisfied (step S15). As a result, an object identification unit 44 that can identify whether or not the object exists is constructed.
- the object identification unit 44 may also be configured to identify things other than the object (for example, the road 60 in FIG. 3A and the like).
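- the text names AdaBoost with N weak learners, a weight updater that picks the learner with the smallest error, an update amount Δα for the weighting coefficient α, and sample weights 97 updated each round; a minimal discrete-AdaBoost-style sketch consistent with that description is given below. Threshold stumps as weak learners, the mean-value threshold rule, and the standard AdaBoost update formulas are assumptions filled in for illustration.

```python
import numpy as np

def train_adaboost(features, labels, n_rounds=10):
    """Minimal discrete AdaBoost over precomputed (masked) feature data groups.

    features: (n_samples, n_features) array, e.g. masked HOG vectors.
    labels:   array of +1 (object present) / -1 (object absent), playing
              the role of the object information 96.
    Returns a list of (feature_index, threshold, polarity, alpha) stumps.
    """
    n_samples, n_features = features.shape
    sample_w = np.full(n_samples, 1.0 / n_samples)   # sample weights 97
    ensemble = []
    for _ in range(n_rounds):
        best = None
        # pick the weak learner (stump) with the smallest weighted error
        for j in range(n_features):
            thr = features[:, j].mean()              # assumed threshold rule
            for polarity in (+1, -1):
                pred = np.where(features[:, j] > thr, polarity, -polarity)
                err = sample_w[pred != labels].sum()
                if best is None or err < best[0]:
                    best = (err, j, thr, polarity, pred)
        err, j, thr, polarity, pred = best
        err = np.clip(err, 1e-10, 1 - 1e-10)
        alpha = 0.5 * np.log((1 - err) / err)        # weighting coefficient
        sample_w *= np.exp(-alpha * labels * pred)   # update sample weights
        sample_w /= sample_w.sum()
        ensemble.append((j, thr, polarity, alpha))
    return ensemble
```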
- in step S22, the distance estimation unit 40 estimates the distance Dis from the vehicle 12 by calculating the elevation angle from the captured image Im acquired in step S21.
- FIG. 11 is a schematic explanatory diagram showing the positional relationship between the vehicle 12, the camera 14, and the human body M.
- the vehicle 12 on which the camera 14 is mounted and the human body M as an object are present on a flat road surface S.
- a contact point between the human body M and the road surface S is Pc
- an optical axis of the camera 14 is L1
- a straight line connecting the optical center C of the camera 14 and the contact point Pc is L2.
- the angle (elevation angle) formed by the optical axis L1 of the camera 14 with respect to the road surface S is ⁇
- the angle formed by the straight line L2 with respect to the optical axis L1 is ⁇
- the height of the camera 14 with respect to the road surface S is Hc.
- the distance estimation unit 40 can estimate the distance Dis corresponding to each position of the road surface S (the road 60 in FIG. 3A) on the captured image Im.
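- the text does not state the distance formula explicitly; under the flat-road geometry of FIG. 11 the estimate below follows from the camera height Hc and the angles θ and φ, assuming the line of sight to the contact point Pc is depressed by (θ + φ) below the horizontal. The pinhole helper for obtaining φ from a pixel row and the numeric values in the example are likewise assumptions.

```python
import math

def estimate_distance(hc, theta, phi):
    """Flat-road distance estimate from the geometry of FIG. 11.

    hc:    camera height Hc above the road surface [m]
    theta: angle of the optical axis L1 below the horizontal road plane [rad]
    phi:   angle between the line of sight L2 (to the contact point Pc)
           and the optical axis L1 [rad]
    Assumes the line of sight is depressed by (theta + phi) below the
    horizontal, which is not stated explicitly in the text.
    """
    return hc / math.tan(theta + phi)

def angle_from_pixel_row(v, cy, fy):
    """Hypothetical pinhole-model helper: angle phi of a pixel row v below
    the principal point cy, given the focal length fy in pixels."""
    return math.atan((v - cy) / fy)

# Example: a camera 0.6 m above the road, tilted 2 degrees downward, seeing
# the contact point 5 degrees below the optical axis.
print(estimate_distance(0.6, math.radians(2.0), math.radians(5.0)))  # ~4.9 m
```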
- alternatively, the distance estimation unit 40 may take into account the change in posture between the road surface S and the camera 14 caused by the motion of the vehicle 12 and estimate the distance Dis using a known method such as SfM (Structure from Motion).
- if the vehicle periphery monitoring apparatus 10 is provided with a distance measuring sensor, the distance Dis may be measured using it.
- in step S23, the identification target area determination unit 42 determines the size and other properties of the identification target area 122, which is the image area to be identified. In the present embodiment, the identification target area determination unit 42 determines the size and the like of the identification target area 122 according to the distance Dis estimated in step S22 and/or the vehicle speed Vs acquired from the vehicle speed sensor 16. A specific example will be described with reference to FIG. 12.
- the position on the captured image Im corresponding to the contact point Pc (see FIG. 11) between the road surface S (road 60) and the human body M (crossing pedestrian 64) is defined as the reference position 120. Based on this reference position 120, the identification target region 122 is set so that the whole of the crossing pedestrian 64 is included.
- the identification target area determination unit 42 determines the size of the identification target area 122 according to the distance Dis from the camera 14 (the optical center C in FIG. 11) to the target, using an arbitrary calculation formula that may be a linear or nonlinear function. When the crossing pedestrian 64f illustrated by a broken line is assumed to exist at the reference position 124 on the road surface S (road 60), an identification target area 126 similar in shape to the identification target area 122 is set.
- since the relative size relationship between the crossing pedestrian 64 (64f) and the identification target region 122 (126) is thus constant or substantially constant regardless of the distance Dis, the influence of disturbance factors (image information other than the projected image of the target) can be suppressed uniformly, and as a result the learning/identification accuracy of the object is further improved.
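- the patent leaves the sizing formula arbitrary (linear or nonlinear); the sketch below assumes the common pinhole relation in which the on-image region size is inversely proportional to Dis, so that a pedestrian of an assumed real height always occupies roughly the same fraction of the region. The real height, margin, aspect ratio, and focal length are illustrative values, not taken from the patent.

```python
def region_size_for_distance(dis, fy, real_height_m=1.8, margin=1.25,
                             aspect=0.5):
    """Hypothetical sizing rule: the identification target region scales
    inversely with distance so that a pedestrian of assumed real height
    (plus a margin) always fills roughly the same fraction of the region.

    dis: estimated distance Dis to the reference position [m]
    fy:  focal length in pixels
    Returns (width_px, height_px) of the identification target region.
    """
    height_px = int(round(fy * real_height_m * margin / dis))
    width_px = int(round(height_px * aspect))
    return width_px, height_px

# Example: with fy = 800 px, a pedestrian at 10 m gets a region of roughly
# 90 x 180 pixels, and one at 20 m gets roughly 45 x 90 pixels, keeping the
# pedestrian/region size ratio constant.
print(region_size_for_distance(10.0, 800.0))
print(region_size_for_distance(20.0, 800.0))
```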
- the identification target area determination unit 42 also determines a designated area 128, which is the target range of the raster scan described later. For example, the designated area 128 may be determined with Dis1, a distance at which the object can be reliably detected within the normal operating range, as the lower limit, and Dis2, a distance at which a collision with the object would not occur immediately within the normal operating range, as the upper limit. By omitting scanning of part of the captured image Im in this way, not only can the calculation amount and calculation time of the identification process be reduced, but erroneous detections that could occur outside the designated area 128 can themselves be eliminated.
- the identification target region determination unit 42 may appropriately change the shape of the identification target region 122 according to the type of the target object.
- the size may be determined according to the distance Dis (for example, a value proportional to the distance Dis).
- the identification target area determination unit 42 may change the size of the identification target area 122 according to the height of the target object even at the same distance Dis. Thereby, it is possible to set an appropriate size according to the height of the object, and the learning / identification accuracy of the object is further improved by uniformly suppressing the influence of the disturbance factor.
- in step S24, the arithmetic unit 30 starts a raster scan of the captured image Im within the designated area 128 determined in step S23.
- the raster scan refers to a method of successively identifying the presence or absence of an object while moving the reference position 120 (pixels in the captured image Im) in a predetermined direction.
- the identification target area determination unit 42 sequentially determines the reference position 120 currently being scanned and the position / size of the identification target area 122 identified from the reference position 120.
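- a minimal sketch of the raster scan loop is shown below, tying together the per-row distance estimate, the distance-dependent region size, and the classifier; the helper callables, the upward extent of the region from the reference position, and the dummy values in the example are assumptions for illustration.

```python
import numpy as np

def raster_scan(image, designated_rows, classify_region, estimate_dis,
                region_size, step=8):
    """Sketch of the raster scan: slide the reference position 120 over the
    designated area 128 and classify the region anchored at each position.

    image:           2-D grayscale captured image Im
    designated_rows: (v_min, v_max) pixel rows bounding the designated area
    classify_region: callable(patch) -> bool (stands in for classifier 50)
    estimate_dis:    callable(v) -> distance Dis for a ground point at row v
    region_size:     callable(dis) -> (width_px, height_px)
    """
    detections = []
    h, w = image.shape
    v_min, v_max = designated_rows
    for v in range(v_min, min(v_max, h), step):          # scan rows
        rw, rh = region_size(estimate_dis(v))
        for u in range(0, w - rw, step):                 # scan columns
            patch = image[max(v - rh, 0):v, u:u + rw]    # region above Pc
            if patch.shape == (rh, rw) and classify_region(patch):
                detections.append((u, v, rw, rh))
    return detections

# Example with dummy callables (values are placeholders, not from the patent):
img = np.zeros((480, 640), dtype=np.uint8)
hits = raster_scan(img, (240, 400),
                   classify_region=lambda p: False,
                   estimate_dis=lambda v: 400.0 / max(v - 200, 1),
                   region_size=lambda d: (max(int(90 / d), 4),
                                          max(int(180 / d), 8)))
```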
- in step S25, the object identifying unit 44 identifies whether or not at least one type of object exists in the determined identification target area 122.
- the object identifying unit 44 is a classifier 50 (see FIG. 7) generated using machine learning.
- in the weighting calculator 94, an appropriate weighting coefficient αf obtained through the machine learning (step S15 in FIG. 4) is preset.
- the object identifying unit 44 inputs an evaluation image 130 having an image area 80 including the identification target area 122 to each feature data generator 90 side.
- necessary image processing such as normalization processing (gradation processing / enlargement / reduction processing), alignment processing, etc. may be appropriately performed on the image of the identification target region 122.
- the object identification unit 44 sequentially processes the evaluation image 130 through each feature data generator 90, each weak learner 92, the weighting calculator 94, and an integrated learner 98 that applies a step function to the weighted output result acquired from the weighting calculator 94, and outputs an identification result indicating that the crossing pedestrian 64 exists in the identification target area 122.
- the object discriminating unit 44 functions as a strong discriminator having high discrimination performance by combining N weak discriminators (weak learners 92).
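- the combination of the N weak learners into a strong classifier can be sketched as a weighted vote followed by the step function applied by the integrated learner 98; the snippet below pairs with the training sketch given earlier and is likewise an assumption-laden illustration rather than the patented implementation.

```python
def classify(feature_vector, ensemble):
    """Weighted vote of the weak learners followed by a step function, as
    the integrated learner 98 is described as doing.  `ensemble` is the
    list of (feature_index, threshold, polarity, alpha) stumps produced by
    the training sketch above."""
    score = 0.0
    for j, thr, polarity, alpha in ensemble:
        h = polarity if feature_vector[j] > thr else -polarity
        score += alpha * h                  # weighted output result
    return 1 if score > 0 else 0            # 1: object present, 0: absent
```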
- Each feature data generator 90 calculates an image feature amount (that is, the above-described feature data group) in each sub-region 82 using the same calculation method as in the learning process (see FIG. 7).
- HOG (Histograms of Oriented Gradients): luminance gradient direction histogram
- each block is defined below corresponding to each sub-region 82.
- the sub-region 82 as a block is composed of a total of 36 pixels 84, 6 pixels vertically and 6 pixels horizontally.
- a two-dimensional gradient (Ix, Iy) of luminance is calculated for each pixel 84 constituting the block.
- the gradient intensity I and the spatial luminance gradient angle ⁇ are calculated according to the following equations (2) and (3).
- I = (Ix² + Iy²)^(1/2)  (2)
- θ = tan⁻¹(Iy / Ix)  (3)
- each grid in the first row illustrates the direction of the planar luminance gradient.
- the gradient intensity I and the spatial luminance gradient angle ⁇ are calculated for all the pixels 84, but the illustration of arrows in the second and subsequent rows is omitted.
- a histogram for the spatial luminance gradient angle ⁇ is created for each block.
- the horizontal axis of the histogram is the spatial luminance gradient angle ⁇ (eight divisions in this example), and the vertical axis of the histogram is the gradient intensity I.
- a histogram for each block is created based on the gradient intensity I shown in Equation (2).
- the HOG feature amount of the evaluation image 130 is obtained by connecting the histogram for each block (spatial luminance gradient angle ⁇ in the example of FIG. 14C) in a predetermined order, for example, ascending order.
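- a minimal sketch of the block-wise HOG computation described above follows; the use of np.gradient for the 2-D gradient, arctan2 in place of the text's tan⁻¹(Iy/Ix), and the absence of any block normalization are simplifying assumptions.

```python
import numpy as np

def hog_feature(patch, rows=8, cols=6, n_bins=8):
    """Blockwise HOG feature as described: per 6x6 block, an 8-bin histogram
    of the spatial gradient angle theta, weighted by gradient intensity I,
    concatenated in ascending block order."""
    iy, ix = np.gradient(patch.astype(np.float64))       # Iy, Ix per pixel
    intensity = np.sqrt(ix ** 2 + iy ** 2)               # eq. (2)
    theta = np.arctan2(iy, ix)                           # cf. eq. (3); arctan2
                                                         # used for robustness
    h, w = patch.shape
    sub_h, sub_w = h // rows, w // cols
    feature = []
    for r in range(rows):
        for c in range(cols):
            sl = (slice(r * sub_h, (r + 1) * sub_h),
                  slice(c * sub_w, (c + 1) * sub_w))
            hist, _ = np.histogram(theta[sl], bins=n_bins,
                                   range=(-np.pi, np.pi),
                                   weights=intensity[sl])
            feature.extend(hist)
    return np.asarray(feature)

# A 48 x 36 patch yields 48 blocks x 8 bins = 384 values.
print(hog_feature(np.random.rand(48, 36)).shape)   # (384,)
```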
- as described above, the image feature quantity may include a luminance gradient direction histogram in space (the HOG feature quantity). Because the fluctuation in the luminance gradient direction (θ) due to the exposure amount of imaging is small, the characteristics of the target object can be captured accurately, and stable identification accuracy can be obtained even in outdoor environments where the intensity of ambient light changes from moment to moment.
- STHOG (Spatio-Temporal Histograms of Oriented Gradients): luminance gradient direction histogram in space-time
- in this case, a three-dimensional gradient (Ix, Iy, It) of luminance is calculated for each pixel 84 constituting the block, using a plurality of captured images Im acquired in time series, in the same manner as in FIG. 14B.
- the gradient intensity I and the temporal luminance gradient angle ⁇ are calculated according to the following equations (4) and (5).
- I = (Ix² + Iy² + It²)^(1/2)  (4)
- φ = tan⁻¹{It / (Ix² + Iy²)^(1/2)}  (5)
- an STHOG feature quantity which is a brightness gradient histogram in space-time is obtained by further connecting a histogram of the temporal brightness gradient angle ⁇ to the HOG feature quantity.
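- the STHOG computation can be sketched by extending the HOG sketch with the temporal gradient It taken across a stack of time-series patches; whether the histograms are accumulated over the whole stack or per frame is not specified in the text, so the sketch below uses only the middle frame, and the remaining details are assumptions as before.

```python
import numpy as np

def sthog_feature(patches, rows=8, cols=6, n_bins=8):
    """STHOG sketch: patches is a (T, H, W) stack of the identification
    target region extracted over T consecutive frames.  Adds a histogram of
    the temporal gradient angle phi (eq. (5)) to the spatial HOG histograms.
    """
    vol = np.asarray(patches, dtype=np.float64)
    it, iy, ix = np.gradient(vol)                         # It, Iy, Ix
    spatial = np.sqrt(ix ** 2 + iy ** 2)
    intensity = np.sqrt(spatial ** 2 + it ** 2)           # eq. (4)
    theta = np.arctan2(iy, ix)                            # spatial angle
    phi = np.arctan2(it, spatial)                         # temporal angle, eq. (5)
    t_mid = vol.shape[0] // 2                             # assumed: middle frame only
    h, w = vol.shape[1:]
    sub_h, sub_w = h // rows, w // cols
    feature = []
    for r in range(rows):
        for c in range(cols):
            sl = (t_mid, slice(r * sub_h, (r + 1) * sub_h),
                  slice(c * sub_w, (c + 1) * sub_w))
            for angle in (theta[sl], phi[sl]):
                hist, _ = np.histogram(angle, bins=n_bins,
                                       range=(-np.pi, np.pi),
                                       weights=intensity[sl])
                feature.extend(hist)
    return np.asarray(feature)

print(sthog_feature(np.random.rand(5, 48, 36)).shape)     # (768,)
```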
- ⁇ luminance gradient direction
- ⁇ luminance gradient direction
- the object identification unit 44 identifies whether or not at least one type of object exists in the determined identification object region 122 (step S25). Thereby, not only the kind of the object including the human body M but also the moving direction including the transverse direction and the facing direction are identified.
- in step S26, the identification target area determination unit 42 determines whether or not all the scans in the designated area 128 have been completed. When it is determined that they have not been completed (step S26: NO), the process proceeds to the next step (S27).
- in step S27, the identification target area determination unit 42 changes the position or size of the identification target area 122. Specifically, the identification target area determination unit 42 moves the reference position 120 that is the scan target by a predetermined amount (for example, one pixel) in a predetermined direction (for example, to the right). When the distance Dis changes, the size of the identification target area 122 is changed accordingly. Furthermore, considering that typical body heights and widths vary with the type of object, the identification target region determination unit 42 may change the size of the identification target region 122 according to the type of the target.
- the arithmetic unit 30 sequentially repeats steps S25 to S27 until it is determined in step S26 that all the scans in the designated area 128 have been completed.
- when all the scans are completed (step S26: YES), the calculation unit 30 ends the raster scan of the captured image Im (step S28).
- in step S29, the object detection unit 46 detects objects present in the captured image Im.
- the identification result for a single frame may be used, or the motion vector for the same object can be calculated by considering the identification results for a plurality of frames.
- in step S30, the ECU 22 causes the storage unit 34 to store data necessary for the next calculation process, for example, the distance Dis estimated in step S22, the attribute of the object obtained in step S25 (the crossing pedestrian 64 in FIG. 3A, etc.), the reference position 120, and the like.
- the vehicle periphery monitoring device 10 can monitor an object (for example, the human body M in FIG. 11) existing in front of the vehicle 12 at a predetermined time interval.
- as described above, the vehicle periphery monitoring apparatus 10 includes the camera 14 that acquires the captured image Im, the identification target region determination unit 42 that extracts the identification target regions 122 and 126 from the acquired captured image Im, and the object identification unit 44 that identifies, for each type of object, whether or not an object (for example, the crossing pedestrian 64) exists in the identification target regions 122 and 126 from the image feature amounts in those regions.
- the target object identification unit 44 is a classifier 50 generated using machine learning that receives a feature data group as an image feature amount and outputs the presence / absence information of the target object.
- the discriminator 50 serving as the object identification means creates and receives the feature data group from the images of at least one sub-region 82 (the non-mask regions 102 and 108) selected according to the type of object from among the plurality of sub-regions 82 constituting the learning sample images 74 and 76 used for machine learning. The image information of the sub-regions 82 suited to the shape of the projected image of the object can therefore be selectively employed in the learning process, and the learning accuracy can be improved regardless of the type of object. Further, by excluding the remaining sub-regions 82 (the mask regions 104 and 110) from the learning process, over-learning of image information other than the projected image of the object, which acts as a disturbance factor during the identification process, can be prevented, and the identification accuracy of the object can be improved.
- the camera 14 is mounted on the vehicle 12 and acquires the captured image Im by capturing an image while the vehicle 12 is moving. Since the scene (background, weather, road surface pattern, etc.) of the captured image Im obtained from the camera 14 mounted on the vehicle 12 changes every moment, it is particularly effective.
- in addition, the identification target area determination unit 42 extracts and determines the identification target area 122 based on a position on the road 60 (the reference position 120 in FIG. 12), so that the same detection accuracy as when the scene is fixed can be maintained.
- next, the principle of maintaining accuracy in the detection process using the STHOG feature amount will be described.
- time-series data 132 used for calculation of the STHOG feature value is obtained by extracting the identification target region 122 on the xy plane in time series.
- the reference position 120 included in each identification target region 122 corresponds to a position on the road 60 (FIG. 3A and the like). That is, the identification target area 122 as the vicinity area of the reference position 120 corresponds to a set of points that are completely or substantially stationary.
- the identification target region determination unit 42 extracts and determines the identification target region 122 with a size corresponding to the distance Dis, so that the relative position and size between the target and the background portion remain substantially the same, and time-series data 132 with a stable image shape can be obtained.
- in the above description, the identification process is performed on the captured image Im obtained by the monocular camera (camera 14), but it goes without saying that the same effect can also be obtained with a compound eye camera (stereo camera).
- the learning process and the identification process by the object identification unit 44 are performed separately, but both processes may be provided so as to be executed in parallel.
- the entire vehicle periphery monitoring device 10 is mounted on the vehicle 12, but any configuration may be used as long as at least imaging means is mounted.
- for example, if the imaging signal output from the imaging means is transmitted to a separate arithmetic processing unit (including the ECU 22) via a wireless communication means, the same effect as in the present embodiment can be obtained.
- in the above description, the object detection device is applied to the vehicle 12, but the application is not limited to this; it may also be applied to other types of moving bodies (for example, ships, aircraft, artificial satellites, etc.). Needless to say, even when the object detection device is fixedly installed, a certain effect of improving the learning/identification accuracy of the object can be obtained.
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Multimedia (AREA)
- Theoretical Computer Science (AREA)
- Human Computer Interaction (AREA)
- Signal Processing (AREA)
- Traffic Control Systems (AREA)
- Image Analysis (AREA)
- Closed-Circuit Television Systems (AREA)
Abstract
The present invention relates to a subject detection device. A subject identification unit is an identifier in which a feature data group representing an image feature amount is the input and information indicating whether or not a subject (crossing pedestrian) is present is the output, the identifier being generated by machine learning. The identifier produces a feature data group from an image of at least one subregion (82) (non-masked region (102)) selected according to the type of subject from a plurality of subregions (82) constituting each of the learning sample images used for machine learning, and inputs the feature data group.
Description
The present invention relates to an object detection device that detects an object based on a captured image acquired from an imaging means.
Conventionally, various methods have been proposed in which an image is picked up using an image pickup means such as a camera, and an object is detected based on the obtained picked-up image. For example, a technique for detecting a pedestrian reflected in a captured image using a feature amount (so-called HOG feature amount) related to intensity and gradient of luminance in a local region of the image is described.
This technique is described in “Histograms of Oriented Gradients for Human Detection”, IEEE Conference on Computer Vision and Pattern Recognition, Vol. 1, pp. 886-893 (2005).
And, by mounting the above-described imaging means together with a classifier capable of identifying objects from the captured image on a vehicle (one form of moving body), a vehicle periphery monitoring device is constructed that detects objects existing outside the vehicle and monitors the relative positional relationship between each object and the host vehicle. Introducing this device assists the vehicle occupants (especially the driver).
The above discriminator can discriminate not only pedestrians (human bodies) but also animals, artificial structures, etc. by performing various machine learning. However, when extracting an image area (hereinafter referred to as an identification target area) that includes at least these objects, the identification accuracy may differ depending on the type of the object. This is probably because the shape of the projected image varies depending on the type of the object, and the ratio of capturing the image information of the background portion changes. In other words, the image information of the background portion acts as a disturbance factor (noise information) that lowers the learning / identification accuracy of the object during the learning process or the identification process.
In particular, since this type of device sequentially captures images while the vehicle is running, the scene (background, weather, etc.) of the captured image obtained from the camera can change from moment to moment. Moreover, road surface patterns such as pedestrian crossings and guardrails are often included in images of identification objects (pedestrians and the like) acquired when a vehicle travels on a road surface. For this reason, as a result of causing the classifier to perform machine learning, there is a high possibility that the road surface pattern is erroneously learned as a pedestrian. That is, it is difficult to predict the image feature in the background portion, and there is a problem that the influence as a disturbance factor cannot be ignored.
The present invention has been made to solve the above-described problems, and an object of the present invention is to provide an object detection apparatus that can improve learning / identification accuracy regardless of the type of the object.
An object detection apparatus according to the present invention includes an imaging means that acquires a captured image, an identification target area extraction means that extracts an identification target area from the captured image acquired by the imaging means, and an object identification means that identifies, for each type of object, whether or not an object exists in the identification target area from the image feature amount extracted in the identification target area by the extraction means. The object identification means is a classifier generated using machine learning that receives a feature data group as the image feature amount and outputs presence/absence information of the object. The classifier creates and receives the feature data group from images of at least one sub-region selected according to the type of object from among the plurality of sub-regions constituting each learning sample image used for the machine learning.
In this way, the classifier serving as the object identification means creates and receives the feature data group from images of at least one sub-region selected according to the type of object from among the plurality of sub-regions constituting each learning sample image used for machine learning. Image information of the sub-regions suited to the shape of the projected image of the object can therefore be selectively employed in the learning process, and the learning accuracy can be improved regardless of the type of object. Furthermore, by excluding the remaining sub-regions from the learning process, over-learning of image information other than the projected image of the object, which acts as a disturbance factor during the identification process, can be prevented, and the identification accuracy of the object can be improved.
Further, it is preferable that the object identification means identifies whether or not the object exists in the identification target area for each moving direction of the object. Based on the tendency of the image shape on the captured image to change according to the moving direction, performing identification for each moving direction further improves the learning/identification accuracy of the object.
Further, it is preferable that the identification target area extracting unit extracts the identification target area having a size corresponding to a distance from the imaging unit to the target. For example, by making the relative magnitude relationship between the target object and the identification target area constant or substantially constant regardless of the distance, the influence of disturbance factors (image information other than the projected image of the target object) can be suppressed uniformly. As a result, the learning / identification accuracy of the object is further improved.
The image feature amount preferably includes a luminance gradient direction histogram in space. Because the fluctuation in the luminance gradient direction due to the exposure amount of imaging is small, the characteristics of the target object can be grasped accurately, and stable identification accuracy is obtained even in outdoor environments where the intensity of ambient light changes from moment to moment.
Also, it is preferable that the image feature amount includes a luminance gradient direction histogram in time and space. By considering the luminance gradient direction in time as well as in space, it becomes easy to detect and track an object over a plurality of captured images acquired in time series.
Further, it is preferable that the imaging unit is mounted on a moving body and acquires the captured image by capturing an image while the moving body is moving. Since the scene of the captured image obtained from the imaging means mounted on the moving body changes every moment, it is particularly effective.
According to the object detection device of the present invention, the classifier serving as the object identification means creates and receives the feature data group from images of at least one sub-region selected according to the type of object from among the plurality of sub-regions constituting each learning sample image provided for machine learning. Image information of the sub-regions suited to the shape of the projected image of the object can therefore be selectively employed in the learning process, and the learning accuracy can be improved regardless of the type of object. By excluding the remaining sub-regions from the learning process, over-learning of image information other than the projected image of the object, which acts as a disturbance factor during the identification process, can be prevented, and the identification accuracy of the object can be improved. This is especially effective because the scene of the captured image obtained from an imaging means mounted on a moving body changes from moment to moment.
Hereinafter, preferred embodiments of the object detection device according to the present invention will be described with reference to the accompanying drawings.
[Configuration of Vehicle Periphery Monitoring Device 10]
FIG. 1 is a block diagram illustrating the configuration of the vehicle periphery monitoring device 10 as an object detection device according to the present embodiment. FIG. 2 is a schematic perspective view of the vehicle 12 on which the vehicle periphery monitoring device 10 shown in FIG. 1 is mounted.
As shown in FIGS. 1 and 2, the vehicle periphery monitoring apparatus 10 includes a color camera (hereinafter simply referred to as the "camera 14") that captures a color image (hereinafter referred to as a captured image Im) composed of a plurality of color channels, a vehicle speed sensor 16 that detects the vehicle speed Vs of the vehicle 12, a yaw rate sensor 18 that detects the yaw rate Yr of the vehicle 12, a brake sensor 20 that detects the driver's brake pedal operation amount Br, an electronic control unit (hereinafter referred to as the "ECU 22") that controls the vehicle periphery monitoring device 10, a speaker 24 for issuing audible alarms and the like, and a display device 26 that displays the captured image output from the camera 14.
カメラ14は、主に可視光領域の波長を有する光を利用するカメラであり、車両12の周辺を撮像する撮像手段として機能する。カメラ14は、被写体の表面を反射する光量が多いほど、その出力信号レベルが高くなり、画像の輝度(例えば、RGB値)が増加する特性を有する。図2に示すように、カメラ14は、車両12の前部バンパー部の略中心部に固定的に配置(搭載)されている。
The camera 14 is a camera that mainly uses light having a wavelength in the visible light region, and functions as an imaging unit that images the periphery of the vehicle 12. The camera 14 has a characteristic that the output signal level increases as the amount of light reflected on the surface of the subject increases, and the luminance (for example, RGB value) of the image increases. As shown in FIG. 2, the camera 14 is fixedly disposed (mounted) at a substantially central portion of the front bumper portion of the vehicle 12.
なお、車両12の周囲を撮像する撮像手段は、上記した構成例(いわゆる単眼カメラ)に限られることなく、例えば複眼カメラ(ステレオカメラ)であってもよい。また、カラーカメラに代替して赤外線カメラを用いてもよく、或いは両方を併せ備えてもよい。さらに、単眼カメラの場合、別の測距手段(レーダ装置)を併せて備えてもよい。
The imaging means for imaging the periphery of the vehicle 12 is not limited to the above configuration example (so-called monocular camera), and may be, for example, a compound eye camera (stereo camera). Further, an infrared camera may be used instead of the color camera, or both may be provided. Further, in the case of a monocular camera, another ranging means (radar apparatus) may be provided.
図1に戻って、スピーカ24は、ECU22からの指令に応じて、警報音等の出力を行う。スピーカ24は、車両12の図示しないダッシュボードに設けられる。或いは、スピーカ24に代替して、他の装置(例えば、オーディオ装置又はナビゲーション装置)が備える音声出力機能を用いてもよい。
Returning to FIG. 1, the speaker 24 outputs an alarm sound or the like in response to a command from the ECU 22. The speaker 24 is provided on a dashboard (not shown) of the vehicle 12. Alternatively, instead of the speaker 24, an audio output function provided in another device (for example, an audio device or a navigation device) may be used.
表示装置26(図1及び図2参照)は、車両12のフロントウインドシールド上、運転者の前方視界を妨げない位置に配されたHUD(ヘッドアップディスプレイ)である。表示装置26として、HUDに限らず、車両12に搭載されたナビゲーションシステムの地図等を表示するディスプレイや、メータユニット内等に設けられた燃費等を表示するディスプレイ(MID;マルチインフォメーションディスプレイ)を利用することができる。
The display device 26 (see FIGS. 1 and 2) is a HUD (head-up display) arranged on the front windshield of the vehicle 12 at a position that does not obstruct the driver's front view. The display device 26 is not limited to the HUD, but a display that displays a map or the like of a navigation system mounted on the vehicle 12 or a display (MID; multi-information display) that displays fuel consumption or the like provided in a meter unit or the like is used. can do.
ECU22は、入出力部28、演算部30、表示制御部32、及び記憶部34を基本的に備える。
The ECU 22 basically includes an input / output unit 28, a calculation unit 30, a display control unit 32, and a storage unit 34.
カメラ14、車速センサ16、ヨーレートセンサ18及びブレーキセンサ20からの各信号は、入出力部28を介してECU22側に入力される。また、ECU22からの各信号は、入出力部28を介してスピーカ24及び表示装置26側に出力される。入出力部28は、入力されたアナログ信号をデジタル信号に変換する図示しないA/D変換回路を備える。
Each signal from the camera 14, the vehicle speed sensor 16, the yaw rate sensor 18, and the brake sensor 20 is input to the ECU 22 side via the input / output unit 28. Each signal from the ECU 22 is output to the speaker 24 and the display device 26 via the input / output unit 28. The input / output unit 28 includes an A / D conversion circuit (not shown) that converts an input analog signal into a digital signal.
演算部30は、カメラ14、車速センサ16、ヨーレートセンサ18及びブレーキセンサ20からの各信号に基づく演算を実行し、演算結果に基づきスピーカ24及び表示装置26に対する信号を生成する。演算部30は、距離推定部40、識別対象領域決定部42(識別対象領域抽出手段)、対象物識別部44(対象物識別手段)、及び対象物検知部46として機能する。ここで、対象物識別部44は、画像特徴量としての特徴データ群を入力とし対象物の存否情報を出力とする、機械学習を用いて生成された識別器50で構成される。
The calculation unit 30 performs calculations based on the signals from the camera 14, the vehicle speed sensor 16, the yaw rate sensor 18, and the brake sensor 20, and generates signals for the speaker 24 and the display device 26 based on the calculation results. The calculation unit 30 functions as a distance estimation unit 40, an identification target region determination unit 42 (identification target region extraction unit), an object identification unit 44 (subject identification unit), and an object detection unit 46. Here, the target object identification unit 44 is configured by a classifier 50 generated using machine learning that receives a feature data group as an image feature amount and outputs the presence / absence information of the target object.
演算部30における各部の機能は、記憶部34に記憶されているプログラムを読み出して実行することにより実現される。或いは、前記プログラムは、図示しない無線通信装置(携帯電話機、スマートフォン等)を介して外部から供給されてもよい。
The function of each unit in the calculation unit 30 is realized by reading and executing a program stored in the storage unit 34. Alternatively, the program may be supplied from the outside via a wireless communication device (mobile phone, smartphone, etc.) not shown.
表示制御部32は、表示装置26を駆動制御する制御回路である。表示制御部32が、入出力部28を介して、表示制御に供される信号を表示装置26に出力することで、表示装置26が駆動する。これにより、表示装置26は各種画像(撮像画像Im、マーク等)を表示することができる。
The display control unit 32 is a control circuit that drives and controls the display device 26. The display control unit 32 drives the display device 26 by outputting a signal used for display control to the display device 26 via the input / output unit 28. Thereby, the display apparatus 26 can display various images (captured image Im, a mark, etc.).
記憶部34は、コンピュータ読み取り可能であって、且つ、非一過性の記憶媒体である。記憶部34は、デジタル信号に変換された撮像信号、各種演算処理に供される一時データ等を記憶するRAM(Random Access Memory)、及び実行プログラム、テーブル又はマップ等を記憶するROM(Read Only Memory)等で構成される。
The storage unit 34 is a computer-readable and non-transitory storage medium. The storage unit 34 includes an imaging signal converted into a digital signal, a RAM (Random Access Memory) that stores temporary data used for various arithmetic processes, and a ROM (Read Only Memory) that stores an execution program, a table, a map, or the like. ) Etc.
[Outline of Operation of Vehicle Periphery Monitoring Device 10]
The vehicle periphery monitoring device 10 according to the present embodiment is basically configured as described above. An outline of the operation of the vehicle periphery monitoring device 10 will be described below.
The ECU 22 converts the analog video signal output from the camera 14 into a digital signal at a predetermined frame clock interval/cycle (for example, 30 frames per second) and temporarily stores it in the storage unit 34.
As shown in FIG. 3A, assume that a captured image Im of the first frame is obtained at an imaging time T = T1. In the illustrated example, the captured image Im contains a road area on which the vehicle 12 travels (hereinafter simply referred to as the "road 60"), a plurality of utility pole areas installed at substantially equal intervals along the road 60 (hereinafter simply referred to as the "utility poles 62"), and a crossing pedestrian area on the road 60 (hereinafter simply referred to as the "crossing pedestrian 64").
As shown in FIG. 3B, assume that a captured image Im of the second frame is obtained at an imaging time T = T2 (> T1). As in the captured image Im of FIG. 3A, the road 60, the utility poles 62, and the crossing pedestrian 64 are present in this captured image Im. However, as the frame interval time elapses, the relative positional relationship between the camera 14 and each object changes moment by moment. As a result, even though the first and second frames (captured images Im) cover the same field-of-view range, each object is imaged in a different form (shape, size, or color).
The ECU 22 then performs various calculation processes on the captured image Im (the image ahead of the vehicle 12) read from the storage unit 34. The ECU 22 (in particular the calculation unit 30) comprehensively considers the processing results for the captured image Im and, as necessary, the signals indicating the traveling state of the vehicle 12 (the vehicle speed Vs, the yaw rate Yr, and the operation amount Br), and detects a human body, an animal, or the like existing ahead of the vehicle 12 as an object to be monitored (hereinafter referred to as a "monitoring object" or simply an "object").
When the calculation unit 30 determines that there is a high possibility that the vehicle 12 will come into contact with the monitoring object, the ECU 22 controls each output unit of the vehicle periphery monitoring device 10 to call the driver's attention. For example, the ECU 22 outputs a warning sound (for example, a repeated beeping sound) through the speaker 24 and highlights the portion of the monitoring object in the captured image Im visualized on the display device 26.
[Learning Process of Classifier 50]
Next, the learning process performed by the classifier 50 shown in FIG. 1 will be described in detail with reference to the flowchart of FIG. 4. As the machine learning method, any of supervised learning, unsupervised learning, and reinforcement learning may be adopted. In the present embodiment, "supervised learning", in which learning is performed based on a data set given in advance, is described with a specific example.
In step S11, learning data used for machine learning is collected. Here, the learning data is a data set of learning sample images that include (or do not include) an object and the type of the object (including a "no object" attribute). Examples of object types include human bodies, various animals (specifically, mammals such as deer, horses, sheep, dogs, and cats, as well as birds), and artificial structures (specifically, vehicles, signs, utility poles, guardrails, walls, and the like).
FIG. 5 is an image diagram showing an example of learning sample images 74 and 76 including crossing pedestrians 70 and 72. In approximately the center of the learning sample image 74, a projected image of a human body walking across from the left side toward the right side (hereinafter, the crossing pedestrian 70) is shown. In approximately the center of another learning sample image 76, a projected image of a human body walking across from the near right side toward the far left side (hereinafter, the crossing pedestrian 72) is shown. Both learning sample images 74 and 76 in this example correspond to correct images containing the object. The collected learning sample images may also include incorrect images that do not contain the object. Furthermore, to unify the data format input to the classifier 50 (see FIG. 1), each of the learning sample images 74 and 76 preferably has an image region 80 whose shape matches or is similar to the others.
In step S12, a plurality of sub-regions 82 are defined by dividing the image region 80 of each of the learning sample images 74 and 76. In the example of FIG. 6, the rectangular image region 80 is divided evenly into a grid of 8 rows and 6 columns regardless of its size. That is, 48 sub-regions 82 having the same shape are defined within the image region 80.
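As a concrete illustration of this division step, the short Python sketch below splits an image region into the 8 × 6 grid of sub-regions described above. The function name and the use of NumPy are assumptions made for illustration; the patent itself does not prescribe any particular implementation.

```python
import numpy as np

def define_subregions(image, rows=8, cols=6):
    """Split an image region into a rows x cols grid of equally sized sub-regions.

    Returns a list of (row, col, sub_image) tuples, one per sub-region.
    """
    h, w = image.shape[:2]
    sub_h, sub_w = h // rows, w // cols   # equal division regardless of image size
    subregions = []
    for r in range(rows):
        for c in range(cols):
            sub = image[r * sub_h:(r + 1) * sub_h, c * sub_w:(c + 1) * sub_w]
            subregions.append((r, c, sub))
    return subregions

# Example: a 96 x 48 learning sample image yields 48 sub-regions of 12 x 8 pixels.
sample = np.zeros((96, 48), dtype=np.uint8)
assert len(define_subregions(sample)) == 48
```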
In step S13, the learning architecture of the classifier 50 is constructed. Examples of learning architectures include the boosting method, SVM (Support Vector Machine), neural networks, and the EM (Expectation Maximization) algorithm. In the present embodiment, AdaBoost, which is a type of boosting method, is applied.
As shown in FIG. 7, the classifier 50 is composed of N feature data generators 90 (N is a natural number of 2 or more), N weak learners 92, a weight updater 93, a weighting calculator 94, and a sample weight updater 95.
In this figure, for convenience of explanation, the feature data generators 90 are denoted, from top to bottom, the first data generator, the second data generator, the third data generator, ..., and the N-th data generator. Similarly, the weak learners 92 are denoted, from top to bottom, the first weak learner, the second weak learner, the third weak learner, ..., and the N-th weak learner.
As shown in the figure, one feature data generator 90 (for example, the first data generator) and one weak learner 92 (for example, the first weak learner) are connected one-to-one, so that N subsystems are constructed. The output sides of the N subsystems (the weak learners 92) are connected to the input side of the weight updater 93, and the weighting calculator 94 and the sample weight updater 95 are connected in series to its output side. The detailed operation of the classifier 50 during machine learning will be described later.
In step S14, the mask condition of the sub-regions 82 is determined for each type of object. Here, the mask condition means a selection condition as to whether or not the image within each sub-region 82 is used when N pieces of feature data (hereinafter referred to as a feature data group) are generated from one learning sample image 74.
For example, when captured images Im actually obtained with the camera 14 mounted on the vehicle 12 are used as the learning sample images 74 and 76 (correct images), various structures may be reflected behind the object. In the learning sample image 74 shown in FIG. 5, the background portion 78 excluding the crossing pedestrian 70 shows three other human bodies, the road surface, building walls, and the like.
If, during machine learning, the feature data group is created from not only the crossing pedestrian 70 but also the background portion 78, over-learning of the background portion 78 may occur depending on the tendencies of the database in which the learning sample images 74 and the like are accumulated. In this case, the image information of the background portion 78 acts as a disturbance factor (noise information) that lowers the learning/identification accuracy of the crossing pedestrian 70 as the object. It is therefore effective to select, from among all 48 sub-regions 82, only the sub-regions 82 suitable for the identification process and to perform learning using the feature data group created from them.
FIG. 8A is a schematic diagram showing a typical contour image 100 obtained from a large number of learning sample images 74 and 76 including crossing pedestrians 70 and 72. Specifically, a contour extraction image (not shown), in which the contour of each object is extracted, is created for each learning sample image 74 and the like by applying a known edge extraction process using an image processing apparatus (not shown). Each contour extraction image represents the portions where the contour of an object was extracted in white and the portions where no contour was extracted in black.
Then, by applying various statistical processes (for example, averaging) to the contour extraction images using the image processing apparatus (not shown), a contour image typical of the learning sample images 74 (the typical contour image 100) is obtained. Like each contour extraction image, the typical contour image 100 represents the portions where the contour of the object was extracted in white and the portions where no contour was extracted in black. That is, the typical contour image 100 corresponds to an image representing the contour of the object (the crossing pedestrians 70 and 72) common to the large number of learning sample images 74 and the like.
For example, sub-regions 82 whose contour feature amount exceeds a predetermined threshold are adopted as calculation targets, and sub-regions 82 whose contour feature amount falls below the predetermined threshold are excluded from the calculation targets. As a result, as shown in FIG. 8B, suppose that, of the 48 sub-regions 82 defined within the image region 80, the set of 20 sub-regions 82 (the regions showing partial images of the typical contour image 100) is determined as the non-mask region 102. The set of the remaining 28 of the 48 sub-regions 82 (the regions filled in white) is determined as the mask region 104.
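A minimal sketch of this selection rule follows, assuming the typical contour image is available as a 2-D array and using the mean contour intensity per sub-region as the "contour feature amount"; the threshold value and the function name are illustrative assumptions, not values taken from the patent.

```python
import numpy as np

def decide_mask(typical_contour, rows=8, cols=6, threshold=0.1):
    """Return a boolean rows x cols map: True = non-mask (used), False = masked (excluded).

    typical_contour: 2-D float array in [0, 1], white (1) where contours were extracted.
    """
    h, w = typical_contour.shape
    sub_h, sub_w = h // rows, w // cols
    keep = np.zeros((rows, cols), dtype=bool)
    for r in range(rows):
        for c in range(cols):
            block = typical_contour[r * sub_h:(r + 1) * sub_h, c * sub_w:(c + 1) * sub_w]
            # Adopt the sub-region when its contour feature amount exceeds the threshold.
            keep[r, c] = block.mean() > threshold
    return keep
```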
FIG. 9A is a schematic diagram showing a typical contour image 106 obtained from a large number of learning sample images including facing pedestrians. The method of creating the typical contour image 106 is the same as that of the typical contour image 100 of FIG. 8A, and its description is therefore omitted.
FIG. 9B is a schematic explanatory diagram showing a non-mask region 108 and a mask region 110 common to the learning sample images including facing pedestrians. The method of determining the mask region 110 is the same as that of the mask region 104 of FIG. 8B, and its description is therefore omitted.
As can be understood from FIGS. 8B and 9B, the mask regions 104 and 110 differ even though the object is the same (a pedestrian). More specifically, the mask regions 104 and 110 differ in whether or not the four hatched sub-regions 82 (see FIG. 9B) are masked. This is because the degree of change in the image shape caused by the walking motion (swinging of the arms and legs) differs depending on the moving direction of the pedestrian. In view of this tendency of the image shape to change according to the moving direction, learning may be performed for each moving direction of the object. The moving direction may be any of the transverse direction with respect to the image plane (more specifically, rightward or leftward), the facing direction (more specifically, toward or away from the camera), or an oblique direction.
In step S15, machine learning is performed by sequentially inputting the large number of pieces of learning data collected in step S11 to the classifier 50. This is described in detail below with reference to FIG. 7.
First, the classifier 50 inputs, from among the collected pieces of learning data, the learning sample image 74 including the crossing pedestrian 70 to the feature data generators 90. Each feature data generator 90 then creates its feature data (collectively, the feature data group) by applying a specific calculation process to the learning sample image 74 in accordance with the mask condition determined in step S14. As a specific masking method, the calculation may be performed without using the values of the pixels belonging to the mask region 104 (see FIG. 8B), or the feature data may be substantially invalidated by replacing the values of all such pixels with a predetermined value (for example, 0) before the calculation.
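The masking step could look like the following sketch, which implements the second variant mentioned above (replacing the pixels of masked sub-regions with 0 before the feature calculation). The helper names, including the reuse of the `keep` map from the earlier `decide_mask()` sketch, are assumptions.

```python
import numpy as np

def apply_mask(image, keep, rows=8, cols=6):
    """Zero out the pixels of every masked sub-region before feature extraction.

    keep: boolean rows x cols map (True = non-mask); False entries are masked.
    """
    masked = image.astype(np.float32).copy()
    h, w = image.shape[:2]
    sub_h, sub_w = h // rows, w // cols
    for r in range(rows):
        for c in range(cols):
            if not keep[r, c]:
                masked[r * sub_h:(r + 1) * sub_h, c * sub_w:(c + 1) * sub_w] = 0.0
    return masked
```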
Each weak learner 92 (the i-th weak learner; 1 ≤ i ≤ N) obtains its output result (the i-th output result) by applying a predetermined calculation to the feature data (the i-th feature data) acquired from the corresponding feature data generator 90 (the i-th data generator). The weight updater 93 receives the first to N-th output results acquired from the weak learners 92, together with object information 96, which is the presence/absence information of the object in the given piece of learning data. Here, the object information 96 indicates that the crossing pedestrian 70 is included in the learning sample image 74.
Thereafter, the weight updater 93 selects the one weak learner 92 whose output result has the smallest error with respect to the output value corresponding to the object information 96, and determines an update amount Δα so that its weighting coefficient α becomes larger. The weighting calculator 94 then updates the weighting coefficient α by adding the update amount Δα supplied from the weight updater 93. In addition, the sample weight updater 95 updates the weight given in advance to the learning sample image 74 and the like (hereinafter referred to as the sample weight 97) based on the updated weighting coefficient α and the like.
In this way, the input of learning data, the update of the weighting coefficient α, and the update of the sample weight 97 are repeated in sequence, and the classifier 50 is made to perform machine learning until a convergence condition is satisfied (step S15).
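The training loop can be pictured with the generic AdaBoost-style sketch below, which uses decision stumps over the masked feature vector as weak learners. The patent describes selecting, in each round, the weak learner with the smallest error and enlarging its weighting coefficient; the exact update rules and convergence condition are not given in the text, so the standard AdaBoost updates shown here are only an assumption about how the loop could be realized.

```python
import numpy as np

def train_adaboost(X, y, n_rounds=50):
    """Generic AdaBoost sketch. X: (n_samples, n_features) masked feature data,
    y: +1 (object present) / -1 (object absent). Returns a list of
    (alpha, feature_index, threshold, polarity) tuples."""
    n, d = X.shape
    w = np.full(n, 1.0 / n)              # sample weights (cf. the "sample weight 97")
    ensemble = []
    for _ in range(n_rounds):
        best, best_err = None, np.inf
        for j in range(d):               # pick the weak learner with minimum weighted error
            for thr in np.unique(X[:, j]):
                for polarity in (1, -1):
                    pred = np.where(polarity * (X[:, j] - thr) > 0, 1, -1)
                    err = np.sum(w[pred != y])
                    if err < best_err:
                        best_err, best = err, (j, thr, polarity)
        eps = np.clip(best_err, 1e-10, 1 - 1e-10)
        alpha = 0.5 * np.log((1 - eps) / eps)       # weighting coefficient update
        j, thr, polarity = best
        pred = np.where(polarity * (X[:, j] - thr) > 0, 1, -1)
        w *= np.exp(-alpha * y * pred)              # sample weight update
        w /= w.sum()
        ensemble.append((alpha, j, thr, polarity))
    return ensemble
```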
By implementing the machine-learned classifier 50 in the vehicle periphery monitoring device 10 (see FIG. 1) as described above, the object identification unit 44, which can identify whether or not at least one type of object exists within the identification target area of the captured image Im, is constructed. The object identification unit 44 may also be configured to be able to identify not only objects but also items other than the objects (for example, the road 60 in FIG. 3A and the like).
[Detailed Operation of Vehicle Periphery Monitoring Device 10]
Next, the detailed operation of the vehicle periphery monitoring device 10 will be described with reference to the flowchart of FIG. 10. This processing flow is executed for each imaging frame while the vehicle 12 is traveling.
In step S21, the ECU 22 acquires, for each frame, the captured image Im, which is the output signal of the area ahead of the vehicle 12 (a predetermined field-of-view range) captured by the camera 14. For example, as shown in FIG. 3A, assume that a frame of the captured image Im is obtained at the imaging time T = T1. The ECU 22 then temporarily stores the acquired captured image Im in the storage unit 34. For example, when an RGB camera is used as the camera 14, the obtained captured image Im is a multi-gradation image composed of three color channels.
In step S22, the distance estimation unit 40 calculates the elevation/depression angle from the captured image Im acquired in step S21 and thereby estimates the distance Dis from the vehicle 12 at each position.
FIG. 11 is a schematic explanatory diagram showing the positional relationship among the vehicle 12, the camera 14, and a human body M. Here, it is assumed that the vehicle 12 on which the camera 14 is mounted and the human body M as the object are both on a flat road surface S. The contact point between the human body M and the road surface S is denoted Pc, the optical axis of the camera 14 is denoted L1, and the straight line connecting the optical center C of the camera 14 and the contact point Pc is denoted L2.
For example, suppose that the angle (elevation/depression angle) formed by the optical axis L1 of the camera 14 with respect to the road surface S is β, the angle formed by the straight line L2 with respect to the optical axis L1 is γ, and the height of the camera 14 above the road surface S is Hc. In this case, the distance Dis is calculated from geometrical considerations, using the angles β and γ and the height Hc, according to the following equation (1).
Dis = Hc / tan(β + γ)  ...(1)
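Equation (1) translates directly into code; the sketch below assumes the camera pitch angle β and the per-pixel line-of-sight angle γ are already available in radians, and the numerical values in the example are purely illustrative.

```python
import math

def estimate_distance(beta, gamma, camera_height):
    """Distance Dis = Hc / tan(beta + gamma) from equation (1).

    beta: angle of the optical axis L1 relative to the road surface [rad]
    gamma: angle of the line L2 (optical center to contact point Pc) relative to L1 [rad]
    camera_height: Hc, camera height above the road surface [m]
    """
    return camera_height / math.tan(beta + gamma)

# Example: a camera 0.7 m above the road tilted 2 degrees down, with the contact
# point a further 3 degrees below the optical axis, gives roughly 8 m.
print(estimate_distance(math.radians(2.0), math.radians(3.0), 0.7))
```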
In this way, the distance estimation unit 40 can estimate the distance Dis corresponding to each position of the road surface S (the road 60 in FIG. 3A) in the captured image Im. Alternatively, the distance estimation unit 40 may estimate the distance Dis using a known technique such as SfM (Structure from Motion), taking into account the change in posture between the road surface S and the camera 14 caused by the motion of the vehicle 12. If the vehicle periphery monitoring device 10 includes a distance measuring sensor, the distance Dis may instead be measured with it.
In step S23, the identification target area determination unit 42 determines the size and other parameters of the identification target area 122, which is the image area to be identified. In the present embodiment, the identification target area determination unit 42 determines the size and other parameters of the identification target area 122 according to the distance Dis estimated in step S22 and/or the vehicle speed Vs acquired from the vehicle speed sensor 16. A specific example is described with reference to FIG. 12.
As shown in FIG. 12, the position on the captured image Im corresponding to the contact point Pc (see FIG. 11) between the road surface S (the road 60) and the human body M (the crossing pedestrian 64) is defined as a reference position 120. The rectangular identification target area 122 is then set so as to include the entire crossing pedestrian 64.
The identification target area determination unit 42 determines the size of the identification target area 122 according to the distance Dis from the camera 14 (the optical center C in FIG. 11) to the object, using an arbitrary calculation formula including a linear or nonlinear function. If it is assumed that a crossing pedestrian 64f, shown by a broken line, exists at a reference position 124 on the road surface S (the road 60), an identification target area 126 similar in shape to the identification target area 122 is set. For example, by keeping the relative size relationship between the crossing pedestrian 64 (64f) and the identification target area 122 (126) constant or substantially constant regardless of the distance Dis, the influence of disturbance factors (image information other than the projected image of the object) can be suppressed uniformly, and as a result the learning/identification accuracy of the object is further improved.
The identification target area determination unit 42 also determines a designated area 128, which is the target range of the raster scan described later. For example, the distance Dis1 at which the object is reliably detected within the normal operating range may be determined as the lower limit, and the distance Dis2 at which there is no risk of an immediate collision with the object within the normal operating range may be determined as the upper limit. By omitting the scan over part of the captured image Im in this way, not only can the amount and time of calculation of the identification process be reduced, but erroneous detections that could occur in areas other than the designated area 128 can themselves be eliminated.
The identification target area determination unit 42 may change the shape of the identification target area 122 as appropriate according to the type of object. In this case, as in the present embodiment, the size may be determined according to the distance Dis (for example, a value proportional to the distance Dis).
Furthermore, even at the same distance Dis, the identification target area determination unit 42 may change the size of the identification target area 122 according to the height of the object. An appropriate size can thereby be set for each object height, and by uniformly suppressing the influence of disturbance factors the learning/identification accuracy of the object is improved still further.
In step S24, the calculation unit 30 starts a raster scan of the captured image Im within the designated area 128 determined in step S23. Here, the raster scan is a technique of successively identifying the presence or absence of an object while moving the reference position 120 (a pixel in the captured image Im) in a predetermined direction. Hereinafter, the identification target area determination unit 42 successively determines the reference position 120 currently being scanned and the position and size of the identification target area 122 specified by the reference position 120.
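The raster scan over the designated area could be organized as in the sketch below, which also scales the identification target area with the estimated distance as described for step S23. The callbacks `row_to_distance`, `size_from_distance`, and `classify` are assumed helpers standing in for the distance estimation, the size calculation formula, and the classifier; none of them are prescribed by the patent.

```python
def raster_scan(designated_rows, image_width, row_to_distance, size_from_distance,
                classify, step=1):
    """Slide a distance-dependent identification target area over the designated area 128.

    designated_rows: iterable of candidate reference rows (limited by Dis1/Dis2).
    row_to_distance: maps a reference row to the estimated road-surface distance Dis.
    size_from_distance: maps Dis to the (height, width) of the identification target area.
    classify: callback receiving (top, left, height, width); returns True when an
              object is identified inside that region (step S25).
    """
    detections = []
    for v in designated_rows:
        dis = row_to_distance(v)
        h, w = size_from_distance(dis)
        for u in range(0, image_width - w + 1, step):
            top = max(v - h, 0)           # the region stands on the reference position
            if classify(top, u, h, w):
                detections.append((top, u, h, w))
    return detections
```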
In step S25, the object identification unit 44 identifies whether or not at least one type of object exists within the determined identification target area 122.
As shown in FIG. 13, the object identification unit 44 is the classifier 50 generated using machine learning (see FIG. 7). The appropriate weighting coefficient αf obtained by the machine learning (step S15 in FIG. 4) is set in the weighting calculator 94 in advance.
The object identification unit 44 inputs an evaluation image 130, which has an image region 80 including the identification target area 122, to the feature data generators 90. Here, to increase the identification accuracy of the object identification unit 44, necessary image processing such as normalization (gradation processing, scaling) and alignment may be applied to the image of the identification target area 122 as appropriate. The object identification unit 44 then processes the evaluation image 130 sequentially through the feature data generators 90, the weak learners 92, the weighting calculator 94, and an integrated learner 98 that applies a step function to the weighted output result acquired from the weighting calculator 94, and outputs an identification result indicating that the crossing pedestrian 64 exists within the identification target area 122. In this case, the object identification unit 44 functions as a strong classifier with high identification performance obtained by combining the N weak classifiers (weak learners 92).
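At identification time the ensemble can be evaluated as a weighted vote followed by a step function, roughly as in this sketch, which continues the assumed AdaBoost formulation from the learning section and is not a definitive implementation of the patented classifier.

```python
def classify_strong(x, ensemble):
    """Evaluate the strong classifier: weighted sum of weak outputs, then a step function.

    x: 1-D masked feature vector of the evaluation image.
    ensemble: list of (alpha, feature_index, threshold, polarity) from train_adaboost().
    Returns True when the object is judged to be present.
    """
    score = 0.0
    for alpha, j, thr, polarity in ensemble:
        weak_out = 1 if polarity * (x[j] - thr) > 0 else -1
        score += alpha * weak_out          # weighting calculator 94
    return score > 0.0                     # step function in the integrated stage 98
```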
Each feature data generator 90 calculates the image feature quantity in each sub-region 82 (that is, the feature data group described above) using the same calculation method as in the learning process (see FIG. 7).
Various known methods can be used to calculate the image feature quantity. The HOG (Histograms of Oriented Gradients) feature quantity, which represents the intensity and orientation of the luminance gradient in local regions of an image, is described below.
As shown in FIG. 14A, assume that one block, which is the unit for creating a histogram, is selected from the image region 80. To facilitate understanding, each block is hereinafter defined so as to correspond to one sub-region 82. For example, assume that a sub-region 82 serving as a block is composed of a total of 36 pixels 84, 6 pixels vertically by 6 pixels horizontally.
As shown in FIG. 14B, a two-dimensional luminance gradient (Ix, Iy) is calculated for each pixel 84 constituting the block. In this case, the gradient intensity I and the spatial luminance gradient angle θ are calculated according to the following equations (2) and (3).
I = (Ix² + Iy²)^(1/2)  ...(2)
θ = tan⁻¹(Iy/Ix)  ...(3)
The arrows drawn in each cell of the first row illustrate the direction of the planar luminance gradient. In practice, the gradient intensity I and the spatial luminance gradient angle θ are calculated for all pixels 84, but the arrows are omitted from the second and subsequent rows.
As shown in FIG. 14C, a histogram of the spatial luminance gradient angle θ is created for each block. The horizontal axis of the histogram is the spatial luminance gradient angle θ (eight bins in this example), and the vertical axis is the gradient intensity I. In this case, the histogram for each block is created based on the gradient intensity I given by equation (2).
As shown in FIG. 15A, the HOG feature quantity of the evaluation image 130 is obtained by concatenating the block histograms (of the spatial luminance gradient angle θ in the example of FIG. 14C) in a predetermined order, for example in ascending order.
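Putting equations (2) and (3) and the per-block histograms together, a compact HOG-style computation might look like the following sketch (eight orientation bins per block, blocks concatenated in ascending order). The use of NumPy gradients and atan2 is an implementation assumption.

```python
import numpy as np

def hog_features(image, rows=8, cols=6, n_bins=8):
    """HOG-style descriptor: per-block orientation histograms weighted by gradient intensity."""
    img = image.astype(np.float32)
    iy, ix = np.gradient(img)                    # luminance gradients Iy, Ix per pixel
    intensity = np.sqrt(ix ** 2 + iy ** 2)       # equation (2)
    angle = np.arctan2(iy, ix)                   # equation (3), using atan2 for a full-circle angle
    h, w = img.shape
    sub_h, sub_w = h // rows, w // cols
    feature = []
    for r in range(rows):                        # blocks in ascending order
        for c in range(cols):
            sl = (slice(r * sub_h, (r + 1) * sub_h), slice(c * sub_w, (c + 1) * sub_w))
            hist, _ = np.histogram(angle[sl], bins=n_bins, range=(-np.pi, np.pi),
                                   weights=intensity[sl])
            feature.append(hist)
    return np.concatenate(feature)               # concatenated block histograms
```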
In this way, the image feature quantity may include a spatial luminance gradient orientation histogram (HOG feature quantity). Since the variation in the luminance gradient orientation (θ) caused by the imaging exposure is small, the features of the object can be captured accurately, and stable identification accuracy is obtained even in an outdoor environment where the intensity of ambient light changes from moment to moment.
An STHOG (Spatio-Temporal Histograms of Oriented Gradients) feature quantity, which extends the definition of the HOG feature quantity, may also be used as the image feature quantity. In this case, using a plurality of captured images Im acquired in time series, a three-dimensional luminance gradient (Ix, Iy, It) is calculated for each pixel 84 constituting the block, as in FIG. 14B. Here, the gradient intensity I and the temporal luminance gradient angle φ are calculated according to the following equations (4) and (5).
I = (Ix² + Iy² + It²)^(1/2)  ...(4)
φ = tan⁻¹{It/(Ix² + Iy²)^(1/2)}  ...(5)
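A sketch of the spatio-temporal extension is shown below, assuming a short stack of consecutive frames; the temporal gradient It is taken along the frame axis and equations (4) and (5) are applied per pixel. The function name and array layout are assumptions.

```python
import numpy as np

def sthog_gradients(frames):
    """Per-pixel spatio-temporal gradient quantities for a (T, H, W) stack of frames.

    Returns (intensity, theta, phi): the equation (4) intensity, the spatial angle
    theta (equation (3)), and the temporal angle phi (equation (5)).
    """
    stack = np.asarray(frames, dtype=np.float32)
    it, iy, ix = np.gradient(stack)                       # gradients along t, y, x
    spatial = np.sqrt(ix ** 2 + iy ** 2)
    intensity = np.sqrt(ix ** 2 + iy ** 2 + it ** 2)      # equation (4)
    theta = np.arctan2(iy, ix)                            # spatial luminance gradient angle
    phi = np.arctan2(it, spatial)                         # equation (5)
    return intensity, theta, phi
```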
As shown in FIG. 15B, by further concatenating the histogram of the temporal luminance gradient angle φ with the HOG feature quantity, the STHOG feature quantity, which is a spatio-temporal luminance gradient histogram, is obtained. By considering not only the spatial luminance gradient orientation (θ) but also the temporal luminance gradient orientation (φ), detection and tracking of an object across a plurality of captured images Im acquired in time series become easier.
For the STHOG feature quantity, as with the HOG feature quantity, a histogram for each block is created based on the gradient intensity I given by equation (2).
In this way, the object identification unit 44 identifies whether or not at least one type of object exists within the determined identification target area 122 (step S25). Not only the type of the object, including the human body M, but also its moving direction, including the transverse and facing directions, is thereby identified.
In step S26, the identification target area determination unit 42 determines whether or not the scan of the designated area 128 has been completed. When it is determined that the scan has not been completed (step S26: NO), the process proceeds to the next step (S27).
In step S27, the identification target area determination unit 42 changes the position or size of the identification target area 122. Specifically, the identification target area determination unit 42 moves the reference position 120 that was just scanned by a predetermined amount (for example, one pixel) in a predetermined direction (for example, to the right). When the distance Dis changes, the size of the identification target area 122 is also changed. Furthermore, considering that typical body lengths and body widths differ depending on the type of object, the identification target area determination unit 42 may change the size of the identification target area 122 according to the type of object.
Thereafter, the process returns to step S25, and the calculation unit 30 repeats steps S25 to S27 in sequence until the scan of the designated area 128 is completed. When it is determined that the scan has been completed (step S26: YES), the calculation unit 30 ends the raster scan of the captured image Im (step S28).
In step S29, the object detection unit 46 detects the objects present in the captured image Im. The identification result of a single frame may be used, or, by also considering the identification results over a plurality of frames, the motion vector of the same object can be calculated.
In step S30, the ECU 22 causes the storage unit 34 to store the data required for the next calculation process, for example, the distance Dis obtained in step S22, the attributes of the object obtained in step S25 (the crossing pedestrian 64 in FIG. 3A and the like), and the reference position 120.
By executing this operation repeatedly, the vehicle periphery monitoring device 10 can monitor objects existing ahead of the vehicle 12 (for example, the human body M in FIG. 11) at predetermined time intervals.
[Effects of the Present Embodiment]
As described above, the vehicle periphery monitoring device 10 includes the camera 14 that acquires the captured image Im, the identification target area determination unit 42 that extracts the identification target areas 122 and 126 from the acquired captured image Im, and the object identification unit 44 that identifies, for each type of object, whether or not an object (for example, the crossing pedestrian 64) exists within the identification target areas 122 and 126 from the image feature quantities in the extracted identification target areas 122 and 126. The object identification unit 44 is the classifier 50 generated using machine learning, which receives the feature data group as the image feature quantity and outputs the presence/absence information of the object.
The classifier 50 as the object identification means creates and receives the feature data group from the images of at least one sub-region 82 (the non-mask regions 102 and 108) selected, according to the type of object, from among the plurality of sub-regions 82 constituting each of the learning sample images 74 and 76 used for machine learning. The image information of the sub-regions 82 suited to the shape of the projected image of the object can therefore be selectively used for the learning process, and the learning accuracy can be improved regardless of the type of object. Furthermore, by excluding the remaining sub-regions 82 (the mask regions 104 and 110) from the learning process, over-learning of image information other than the projected image of the object, which acts as a disturbance factor during the identification process, can be prevented, and the identification accuracy of the object can be improved.
The camera 14 is preferably mounted on the vehicle 12 and acquires the captured image Im by imaging while the vehicle 12 is moving. This is particularly effective because the scene (background, weather, road surface pattern, and the like) of the captured image Im obtained from the camera 14 mounted on the vehicle 12 changes from moment to moment.
[Relationship with the STHOG Feature Quantity]
If detection processing using the STHOG feature quantity is executed while the scene of the captured image Im changes dynamically, the detection accuracy of the object may decrease. This is because a motion component caused by the traveling state of the vehicle 12 may be erroneously detected as a motion component of the background portion, which is essentially stationary, and this can affect the detection result of the object.
Therefore, by extracting and determining the identification target area 122 with reference to a position on the road 60 (the reference position 120 in FIG. 12), the identification target area determination unit 42 can maintain detection accuracy equivalent to that obtained when the scene is fixed. The principle of maintaining accuracy in the detection process using the STHOG feature quantity is described below with reference to FIG. 16.
As shown in FIG. 16, by extracting the identification target area 122 on the xy plane in time series, the time-series data 132 used for the calculation of the STHOG feature quantity is obtained. Here, the reference position 120 included in each identification target area 122 corresponds to a position on the road 60 (FIG. 3A and the like). That is, the identification target area 122, as the area in the vicinity of the reference position 120, corresponds to a set of points that are completely or substantially stationary.
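The construction of the time-series data 132 can be pictured with the following sketch, which crops, frame by frame, the identification target area anchored to the same road-surface point. The helper that maps a road point into each frame is an assumption; in practice it would come from the distance estimation and camera geometry described above.

```python
def build_time_series(frames, project_road_point, size_from_distance, road_point):
    """Collect the identification target area around one road-surface point over time.

    frames: list of (H, W) images acquired in time series.
    project_road_point: maps (frame_index, road_point) to (row, col, Dis) in that frame.
    size_from_distance: maps Dis to the (height, width) of the identification target area.
    Returns the list of crops forming the time-series data used for the STHOG calculation.
    """
    series = []
    for t, frame in enumerate(frames):
        v, u, dis = project_road_point(t, road_point)
        h, w = size_from_distance(dis)
        top = max(v - h, 0)
        left = max(u - w // 2, 0)
        series.append(frame[top:top + h, left:left + w])
    return series
```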
As a result, when the object (the crossing pedestrian 64) is detected, the motion component of the background portion, whose relative speed with respect to the road 60 is essentially zero, can be canceled out. Consequently, even when the scene changes dynamically, detection accuracy equivalent to that obtained when the scene is fixed can be maintained.
In addition to or separately from this, by extracting and determining the identification target area 122 with a size corresponding to the distance Dis, the identification target area determination unit 42 can keep the relative position and size of the object and the background portion approximately constant, so that time-series data 132 with a stable image shape is obtained.
[Supplementary Notes]
The present invention is not limited to the embodiment described above and can of course be freely modified without departing from the gist of the present invention.
In the present embodiment, the identification process described above is performed on the captured image Im obtained by a monocular camera (the camera 14), but it goes without saying that the same effects can be obtained with a compound-eye camera (stereo camera).
In the present embodiment, the learning process and the identification process by the object identification unit 44 (the classifier 50) are executed separately, but the two processes may be provided so that they can be executed in parallel.
In the present embodiment, the entire vehicle periphery monitoring device 10 is mounted on the vehicle 12, but any configuration in which at least the imaging means is mounted on the vehicle is sufficient. For example, even in a configuration in which the imaging signal output from the imaging means is transmitted to a separate arithmetic processing unit (including the ECU 22) via wireless communication means, the same effects as in the present embodiment can be obtained.
In the present embodiment, the object detection device is applied to the vehicle 12, but the device is not limited to this and may be applied to other types of moving bodies (for example, ships, aircraft, artificial satellites, and the like). Needless to say, even when the object detection device is fixedly installed, a certain effect of improving the learning/identification accuracy of the object is obtained.
Claims (6)
- An object detection device (10) comprising:
imaging means (14) for acquiring a captured image (Im);
identification target area extraction means (42) for extracting an identification target area (122, 126) from the captured image (Im) acquired by the imaging means (14); and
object identification means (44) for identifying, for each type of object (64), whether or not an object (64) is present in the identification target area (122, 126) on the basis of an image feature amount in the identification target area (122, 126) extracted by the identification target area extraction means (42),
wherein the object identification means (44) is a classifier (50) generated by machine learning that receives, as input, a feature data group serving as the image feature amount and outputs presence/absence information on the object (64), and
the classifier (50) creates and inputs the feature data group from the image of at least one sub-region (82) selected, according to the type of the object (64), from among a plurality of sub-regions (82) constituting each learning sample image (74, 76) used for the machine learning.
- The device (10) according to claim 1, wherein the object identification means (44) identifies, for each moving direction of the object (64), whether or not the object (64) is present in the identification target area (122, 126).
- The device (10) according to claim 1, wherein the identification target area extraction means (42) extracts the identification target area (122, 126) with a size corresponding to the distance from the imaging means (14) to the object (64).
- The device (10) according to claim 1, wherein the image feature amount includes a spatial histogram of luminance gradient orientations.
- The device (10) according to claim 1, wherein the image feature amount includes a spatio-temporal histogram of luminance gradient orientations.
- The device (10) according to any one of claims 1 to 5, wherein the imaging means (14) is mounted on a moving body (12) and acquires the captured image (Im) by capturing images while the moving body (12) is moving.
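To make the claimed processing concrete, the following is a minimal illustrative sketch (not the patented implementation) of building a feature data group from only the sub-regions selected for a given object type, using a spatial histogram of luminance gradient orientations as in claim 4. The 4x2 sub-region grid and the per-type selection table are assumptions introduced for the example.

```python
# Minimal sketch (illustration only): concatenating gradient-orientation
# histograms from the sub-regions selected for a given object type.
import numpy as np

N_BINS = 9        # orientation bins over 0..180 degrees
GRID = (4, 2)     # sub-regions: 4 rows x 2 columns (assumed layout)

# Assumed selection table: which sub-region indices feed the classifier for
# each object type (e.g. a pedestrian silhouette is tall and narrow, so the
# sub-regions dominated by background can be dropped).
SUB_REGIONS_BY_TYPE = {
    "pedestrian": [0, 1, 2, 3, 4, 5],   # upper and middle sub-regions
    "animal":     [2, 3, 4, 5, 6, 7],   # lower, wider silhouette
}


def gradient_histogram(patch, n_bins=N_BINS):
    """Histogram of luminance gradient orientations for one sub-region."""
    gy, gx = np.gradient(patch.astype(np.float32))
    magnitude = np.hypot(gx, gy)
    orientation = np.rad2deg(np.arctan2(gy, gx)) % 180.0
    hist, _ = np.histogram(orientation, bins=n_bins, range=(0.0, 180.0),
                           weights=magnitude)
    return hist / (np.linalg.norm(hist) + 1e-6)


def feature_data_group(image, object_type):
    """Concatenate histograms of only the sub-regions selected for this type."""
    rows, cols = GRID
    h, w = image.shape
    sub_h, sub_w = h // rows, w // cols
    features = []
    for idx in SUB_REGIONS_BY_TYPE[object_type]:
        r, c = divmod(idx, cols)
        patch = image[r * sub_h:(r + 1) * sub_h, c * sub_w:(c + 1) * sub_w]
        features.append(gradient_histogram(patch))
    return np.concatenate(features)


if __name__ == "__main__":
    sample = (np.random.rand(64, 32) * 255).astype(np.uint8)  # stand-in sample image
    vec = feature_data_group(sample, "pedestrian")
    print(vec.shape)  # 6 sub-regions x 9 bins -> (54,)
```

A linear classifier (for example an SVM) could then be trained per object type on vectors produced this way; that training stage is outside the scope of this sketch.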
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2012-258488 | 2012-11-27 | ||
JP2012258488A JP2014106685A (en) | 2012-11-27 | 2012-11-27 | Vehicle periphery monitoring device |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2014084218A1 true WO2014084218A1 (en) | 2014-06-05 |
Family
ID=50827854
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/JP2013/081808 WO2014084218A1 (en) | 2012-11-27 | 2013-11-26 | Subject detection device |
Country Status (2)
Country | Link |
---|---|
JP (1) | JP2014106685A (en) |
WO (1) | WO2014084218A1 (en) |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110795985A (en) * | 2018-08-02 | 2020-02-14 | 松下电器(美国)知识产权公司 | Information processing method and information processing system |
CN112154492A (en) * | 2018-03-19 | 2020-12-29 | 德尔克股份有限公司 | Early warning and collision avoidance |
EP3664020A4 (en) * | 2017-07-31 | 2021-04-21 | Equos Research Co., Ltd. | Image data generation device, image recognition device, image data generation program, and image recognition program |
CN112997214A (en) * | 2018-11-13 | 2021-06-18 | 索尼公司 | Information processing apparatus, information processing method, and program |
US11443631B2 (en) | 2019-08-29 | 2022-09-13 | Derq Inc. | Enhanced onboard equipment |
US11741367B2 (en) | 2017-03-13 | 2023-08-29 | Fanuc Corporation | Apparatus and method for image processing to calculate likelihood of image of target object detected from input image |
Families Citing this family (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9542626B2 (en) * | 2013-09-06 | 2017-01-10 | Toyota Jidosha Kabushiki Kaisha | Augmenting layer-based object detection with deep convolutional neural networks |
JP6511982B2 (en) * | 2015-06-19 | 2019-05-15 | 株式会社デンソー | Driving operation discrimination device |
JP6795379B2 (en) * | 2016-03-10 | 2020-12-02 | パナソニック インテレクチュアル プロパティ コーポレーション オブ アメリカPanasonic Intellectual Property Corporation of America | Operation control device, operation control method and operation control program |
US10392038B2 (en) * | 2016-05-16 | 2019-08-27 | Wi-Tronix, Llc | Video content analysis system and method for transportation system |
US10210418B2 (en) * | 2016-07-25 | 2019-02-19 | Mitsubishi Electric Research Laboratories, Inc. | Object detection system and object detection method |
JP2018136211A (en) * | 2017-02-22 | 2018-08-30 | オムロン株式会社 | Environment recognition system and learning device |
JP6782433B2 (en) * | 2017-03-22 | 2020-11-11 | パナソニックIpマネジメント株式会社 | Image recognition device |
US10007269B1 (en) | 2017-06-23 | 2018-06-26 | Uber Technologies, Inc. | Collision-avoidance system for autonomous-capable vehicle |
CN107390682B (en) * | 2017-07-04 | 2020-08-07 | 安徽省现代农业装备产业技术研究院有限公司 | Automatic driving path following method and system for agricultural vehicle |
JP6797860B2 (en) * | 2018-05-02 | 2020-12-09 | 株式会社日立国際電気 | Water intrusion detection system and its method |
JP7401199B2 (en) * | 2019-06-13 | 2023-12-19 | キヤノン株式会社 | Information processing device, information processing method, and program |
WO2021205616A1 (en) * | 2020-04-09 | 2021-10-14 | 三菱電機株式会社 | Moving body control device, moving body control method, and learning device |
JP7484587B2 (en) | 2020-08-31 | 2024-05-16 | 沖電気工業株式会社 | Traffic monitoring device, traffic monitoring system, traffic monitoring method, and traffic monitoring program |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2011165170A (en) * | 2010-01-15 | 2011-08-25 | Toyota Central R&D Labs Inc | Object detection device and program |
WO2011161924A1 (en) * | 2010-06-23 | 2011-12-29 | 国立大学法人大阪大学 | Moving-object detection device |
WO2012124000A1 (en) * | 2011-03-17 | 2012-09-20 | 日本電気株式会社 | Image recognition system, image recognition method, and nontemporary computer- readable medium in which program for image recognition is stored |
JP2012185684A (en) * | 2011-03-07 | 2012-09-27 | Jvc Kenwood Corp | Object detection device and object detection method |
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP5707570B2 (en) * | 2010-03-16 | 2015-04-30 | パナソニックIpマネジメント株式会社 | Object identification device, object identification method, and learning method for object identification device |
JP5290229B2 (en) * | 2010-03-30 | 2013-09-18 | セコム株式会社 | Learning device and object detection device |
JP5214716B2 (en) * | 2010-12-14 | 2013-06-19 | 株式会社東芝 | Identification device |
JP5901054B2 (en) * | 2011-12-02 | 2016-04-06 | 国立大学法人九州工業大学 | Object detection method and object detection apparatus using the method |
- 2012
  - 2012-11-27 JP JP2012258488A patent/JP2014106685A/en active Pending
- 2013
  - 2013-11-26 WO PCT/JP2013/081808 patent/WO2014084218A1/en active Application Filing
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2011165170A (en) * | 2010-01-15 | 2011-08-25 | Toyota Central R&D Labs Inc | Object detection device and program |
WO2011161924A1 (en) * | 2010-06-23 | 2011-12-29 | 国立大学法人大阪大学 | Moving-object detection device |
JP2012185684A (en) * | 2011-03-07 | 2012-09-27 | Jvc Kenwood Corp | Object detection device and object detection method |
WO2012124000A1 (en) * | 2011-03-17 | 2012-09-20 | 日本電気株式会社 | Image recognition system, image recognition method, and nontemporary computer- readable medium in which program for image recognition is stored |
Non-Patent Citations (1)
Title |
---|
HIDEFUMI YOSHIDA ET AL.: "A study on a method for stable pedestrian detection against pose changes with generative learning", IEICE TECHNICAL REPORT, vol. 111, no. 49, 12 May 2011 (2011-05-12), pages 127 - 132 * |
Cited By (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11741367B2 (en) | 2017-03-13 | 2023-08-29 | Fanuc Corporation | Apparatus and method for image processing to calculate likelihood of image of target object detected from input image |
DE102018105334B4 (en) | 2017-03-13 | 2024-01-25 | Fanuc Corporation | Image processing device and image processing method for calculating the image probability of a target object captured from an input image |
EP3664020A4 (en) * | 2017-07-31 | 2021-04-21 | Equos Research Co., Ltd. | Image data generation device, image recognition device, image data generation program, and image recognition program |
US11157724B2 (en) | 2017-07-31 | 2021-10-26 | Equos Research Co., Ltd. | Image data generation device, image recognition device, image data generation program, and image recognition program |
CN112154492A (en) * | 2018-03-19 | 2020-12-29 | 德尔克股份有限公司 | Early warning and collision avoidance |
US11749111B2 (en) | 2018-03-19 | 2023-09-05 | Derq Inc. | Early warning and collision avoidance |
US11763678B2 (en) | 2018-03-19 | 2023-09-19 | Derq Inc. | Early warning and collision avoidance |
CN110795985A (en) * | 2018-08-02 | 2020-02-14 | 松下电器(美国)知识产权公司 | Information processing method and information processing system |
CN112997214A (en) * | 2018-11-13 | 2021-06-18 | 索尼公司 | Information processing apparatus, information processing method, and program |
CN112997214B (en) * | 2018-11-13 | 2024-04-26 | 索尼公司 | Information processing device, information processing method, and program |
US11443631B2 (en) | 2019-08-29 | 2022-09-13 | Derq Inc. | Enhanced onboard equipment |
US11688282B2 (en) | 2019-08-29 | 2023-06-27 | Derq Inc. | Enhanced onboard equipment |
Also Published As
Publication number | Publication date |
---|---|
JP2014106685A (en) | 2014-06-09 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2014084218A1 (en) | Subject detection device | |
JP7052663B2 (en) | Object detection device, object detection method and computer program for object detection | |
US11741696B2 (en) | Advanced path prediction | |
WO2019223582A1 (en) | Target detection method and system | |
US9776564B2 (en) | Vehicle periphery monitoring device | |
JP4173902B2 (en) | Vehicle periphery monitoring device | |
JP4173901B2 (en) | Vehicle periphery monitoring device | |
DE112018007287T5 (en) | VEHICLE SYSTEM AND METHOD FOR DETECTING OBJECTS AND OBJECT DISTANCE | |
CN107133559B (en) | Mobile object detection method based on 360 degree of panoramas | |
US11170272B2 (en) | Object detection device, object detection method, and computer program for object detection | |
JP4171501B2 (en) | Vehicle periphery monitoring device | |
US20120070034A1 (en) | Method and apparatus for detecting and tracking vehicles | |
CN102073846A (en) | Method for acquiring traffic information based on aerial images | |
CN111967396A (en) | Processing method, device and equipment for obstacle detection and storage medium | |
JP4631036B2 (en) | Passer-by behavior analysis device, passer-by behavior analysis method, and program thereof | |
US10984264B2 (en) | Detection and validation of objects from sequential images of a camera | |
CN114359714A (en) | Unmanned body obstacle avoidance method and device based on event camera and intelligent unmanned body | |
WO2018138782A1 (en) | Information processing device, feature point extraction program, and feature point extraction method | |
US11120292B2 (en) | Distance estimation device, distance estimation method, and distance estimation computer program | |
KR20160015091A (en) | Trafic signal control using HOG-based pedestrian detection and behavior patterns | |
US20230245323A1 (en) | Object tracking device, object tracking method, and storage medium | |
EP4089649A1 (en) | Neuromorphic cameras for aircraft | |
JP4055785B2 (en) | Moving object height detection method and apparatus, and object shape determination method and apparatus | |
CN115131594B (en) | Millimeter wave radar data point classification method and device based on ensemble learning | |
GB2525587A (en) | Monocular camera cognitive imaging system for a vehicle |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 13857904; Country of ref document: EP; Kind code of ref document: A1 |
| DPE1 | Request for preliminary examination filed after expiration of 19th month from priority date (pct application filed from 20040101) | |
| NENP | Non-entry into the national phase | Ref country code: DE |
| 122 | Ep: pct application non-entry in european phase | Ref document number: 13857904; Country of ref document: EP; Kind code of ref document: A1 |