WO2017188017A1 - Detection device, detection method, and program - Google Patents

Detection device, detection method, and program Download PDF

Info

Publication number
WO2017188017A1
WO2017188017A1 (PCT/JP2017/015212)
Authority
WO
WIPO (PCT)
Prior art keywords
unit
subject
detection
distance
distance information
Prior art date
Application number
PCT/JP2017/015212
Other languages
French (fr)
Japanese (ja)
Inventor
小野 博明
英史 山田
光永 知生
Original Assignee
ソニーセミコンダクタソリューションズ株式会社
Priority date
Filing date
Publication date
Application filed by ソニーセミコンダクタソリューションズ株式会社
Publication of WO2017188017A1 publication Critical patent/WO2017188017A1/en


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/60 Analysis of geometric attributes
    • G06T7/62 Analysis of geometric attributes of area, perimeter, diameter or volume

Definitions

  • the present technology relates to a detection device, a detection method, and a program, and more particularly, to a detection device, a detection method, and a program that detect a predetermined subject from an image, for example.
  • the present technology has been made in view of such a situation, and makes it possible to reduce processing when a predetermined object is recognized (detected).
  • the first detection device of one aspect of the present technology includes an acquisition unit that acquires distance information regarding a distance to a subject, a setting unit that sets, from the distance information and a feature amount of an object to be detected, a region where the object may be imaged, and a determination unit that determines whether an image in the region is the object.
  • the second detection device of one aspect of the present technology includes an acquisition unit that acquires distance information regarding a distance to a subject, a setting unit that sets, using the distance information, a region where a predetermined object may be captured, and an estimation unit that estimates a category to which the object belongs from the size of the region and the distance information.
  • the first detection method of one aspect of the present technology includes acquiring distance information regarding a distance to a subject, setting, from the distance information and a feature amount of an object to be detected, a region where the object may be captured, and determining whether or not an image in the region is the object.
  • in the second detection method of one aspect of the present technology, distance information regarding a distance to a subject is acquired, a region where a predetermined object may be captured is set using the distance information, and the category to which the object belongs is estimated from the size of the region and the distance information.
  • the first program of one aspect of the present technology causes a computer to execute processing including acquiring distance information regarding a distance to a subject, setting, from the distance information and a feature amount of an object to be detected, a region where the object may be captured, and determining whether or not an image in the region is the object.
  • the second program of one aspect of the present technology causes a computer to execute processing including acquiring distance information regarding a distance to a subject, setting, using the distance information, a region where a predetermined object may be imaged, and estimating the category to which the object belongs from the size of the region and the distance information.
  • in the first detection device, detection method, and program of one aspect of the present technology, distance information regarding the distance to the subject is acquired, a region where the object may be present is set from the distance information and the feature amount of the object to be detected, and it is determined whether or not the image in the region is the object.
  • in the second detection device, detection method, and program of one aspect of the present technology, distance information regarding the distance to the subject is acquired, a region where a predetermined object may be captured is set using the distance information, and the category to which the object belongs is estimated from the size of the region and the distance information.
  • the detection device may be an independent device or an internal block constituting one device.
  • the program can be provided by being transmitted through a transmission medium or by being recorded on a recording medium.
  • processing when recognizing (detecting) a predetermined object can be reduced.
  • the present technology is applicable to recognizing (detecting) a predetermined object, for example, an object such as a person (face, upper body, whole body), automobile, bicycle, foodstuff, or the like.
  • a predetermined object for example, an object such as a person (face, upper body, whole body), automobile, bicycle, foodstuff, or the like.
  • the present technology detects such a predetermined object using distance information.
  • in the following description, a case where a human face is detected using distance information will be described as an example.
  • FIG. 1 is a diagram illustrating a configuration of an embodiment of a detection device to which the present technology is applied.
  • the detection apparatus 100 illustrated in FIG. 1 includes a distance information acquisition unit 111, a subject feature extraction unit 112, a subject candidate area detection unit 113, and an actual size database 114.
  • the distance information acquisition unit 111 measures the distance to the subject, generates a measurement result (distance information), and outputs the result to the subject feature extraction unit 112.
  • the distance information acquisition unit 111 acquires distance information by a distance measuring sensor using active light (infrared rays or the like), for example.
  • as the distance measuring sensor using active light, the TOF (Time of Flight) method, the Structured Light method, or the like can be applied.
  • the distance information acquisition unit 111 may be configured to acquire distance information by a distance measuring sensor (ranging method) using reflected light of active light, for example, a TOF light source or a camera flash light.
  • the distance information acquisition unit 111 may be configured to acquire distance information with a stereo camera.
  • the distance information acquisition unit 111 may be configured to acquire distance information using an ultrasonic sensor.
  • the distance information acquisition unit 111 may be configured to acquire distance information by a method using a millimeter wave radar.
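The ranging approaches above all reduce to measuring how long the active signal takes to return (or how a projected pattern or disparity shifts). As a minimal illustration of the direct time-of-flight relation only, and not of any particular unit described here, the sketch below converts a round-trip time into a distance; the function name and the sample value are assumptions.

```python
# Illustrative sketch: direct time-of-flight distance from a round-trip time.
# The d = c * t / 2 relation and the speed of light are standard physics;
# the names and the sample value are hypothetical.
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0

def tof_distance_m(round_trip_time_s: float) -> float:
    """Distance to the reflecting surface for a measured round-trip time."""
    return SPEED_OF_LIGHT_M_PER_S * round_trip_time_s / 2.0

# Example: a 6.67 ns round trip corresponds to roughly 1 m.
print(tof_distance_m(6.67e-9))  # ~1.0
```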
  • the subject feature extraction unit 112 sets, from the distance information, a frame in which the detection target, for example a human face, may be present.
  • the frame is set by referring to a table stored in the actual size database 114.
  • the actual size database 114 manages a table that associates the distance to a subject and the actual size of the subject to be detected with the corresponding size on the image. For example, the table describes how large a person's face appears on the image when the face is at a position a predetermined distance away.
  • the subject candidate area detection unit 113 determines whether or not there is a detection target within the set frame. If there is, the subject candidate area detection unit 113 cuts out the frame and outputs it to a subsequent processing unit (not shown).
  • the subject feature extraction unit 112 sets a pixel (target pixel) to be processed.
  • the pixel of interest is sequentially set from the upper left pixel of the image.
  • the pixel of interest is sequentially set from the upper left pixel to the lower right pixel of the distance image 131.
  • the order of the target pixels that are sequentially set is represented by an arrow, but the target pixels may be set in an order other than such an order.
  • the distance image 131 is an image generated from distance information.
  • the distance image 131 is colored according to distance, with pixels at the same distance represented by the same color.
  • however, the distance image 131 in the present technology does not need to be an image colored according to distance; any image from which it can be determined how far a given pixel (subject) in the image 131 is from the detection device 100 can be used.
  • here, the description will be continued assuming that such a distance image 131 has been generated. Further, as illustrated in the upper diagram of FIG. 3, the description will be continued with an example in which a pixel at a predetermined position in the distance image 131 is set as the pixel of interest 132.
  • in step S102, the distance information of the target pixel is acquired.
  • in step S103, the size of the detection frame is determined from the distance and the actual size of the subject to be detected. As shown in the middle diagram of FIG. 3, the subject feature extraction unit 112 sets a detection frame 133 from the distance of the target pixel 132 and the actual size of the subject to be detected.
  • the subject feature extraction unit 112 sets a detection frame 133 with reference to a table managed by the actual size database 114.
  • a table 151 as shown in FIG. 4 is stored.
  • the table 151 shown in FIG. 4 associates the distance with the size of the face on the image based on its actual size. For example, it describes that when the distance is 0 cm the size of the face on the image is 30 pixels × 30 pixels, when the distance is 50 cm it is 25 pixels × 25 pixels, and when the distance is 100 cm it is 20 pixels × 20 pixels.
  • the distance is a distance between the detection device 100 and a subject (in this case, a human face).
  • the actual face size here is the size of an average human face. Since human faces vary with gender and age, and there are individual differences, the actual face size is treated here as the average human face size.
  • alternatively, a table 151 in which one distance is associated with the sizes on the image based on the actual sizes of a plurality of faces may be created, and processing may be performed using such a table 151. For example, one distance may be associated with the size on the image based on the actual size of a male face, the size based on the actual size of a female face, and the size based on the actual size of a child's face.
  • a detection frame 133 corresponding to the size on the image based on each actual size may be set, and the process of step S104 described later may be executed for each detection frame 133.
  • FIG. 4 shows an example in which distances in increments of 50 centimeters (0, 50, 100, and so on) are associated with the size on the image based on the actual size.
  • the increment is not limited to 50 centimeters; it can be changed depending on the accuracy of the distance information and the accuracy required for detection. A sketch of looking up such a table is shown below.
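For illustration only, the following sketch shows one way a table like table 151 could be consulted, interpolating linearly between the 50 cm entries. The values mirror the example described for FIG. 4; the array and function names are assumptions, not the actual implementation of the actual size database 114.

```python
import numpy as np

# Distance (cm) -> face size on the image (pixels per side), following the
# example of table 151 described above (0 cm: 30 px, 50 cm: 25 px, 100 cm: 20 px).
TABLE_151_DISTANCE_CM = np.array([0.0, 50.0, 100.0])
TABLE_151_SIZE_PX = np.array([30.0, 25.0, 20.0])

def frame_size_px(distance_cm: float) -> int:
    """Interpolate the on-image size of an average face at the given distance."""
    size = np.interp(distance_cm, TABLE_151_DISTANCE_CM, TABLE_151_SIZE_PX)
    return int(round(size))

print(frame_size_px(75.0))  # between the 50 cm and 100 cm entries -> ~22 px
```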
  • in step S103, the subject feature extraction unit 112 sets the detection frame 133 from the size on the image based on the distance of the target pixel 132 and the actual size of the subject to be detected, as shown in the middle diagram of FIG. 3.
  • for example, when the distance of the target pixel 132 is 50 cm, a detection frame 133 of 25 pixels × 25 pixels centered on the target pixel 132 is set.
  • the detection frame 133 is illustrated as a quadrangle in FIG. 3, but it is not limited to a rectangular shape such as a square and may be another shape such as a circle.
  • here, the case where the actual size of the subject is used as the feature (feature amount) of the subject and the detection frame 133 is set using the size on the image based on that actual size is described, but it is also possible to set the detection frame 133 using a feature (feature amount) other than the actual size.
  • in this case, the subject feature extraction unit 112 functions as a setting unit that uses the size of the subject as the feature amount and sets the detection frame 133 according to that feature amount.
  • the detection frame 133 is a frame set by calculating how large the subject to be detected, for example a human face, would appear on the distance image 131 if it were located at the distance of the position of the target pixel 132. The calculation itself may be omitted, and other forms, such as looking up values described in the table 151, can also be applied. A sketch of this frame-setting step is shown below.
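As a sketch of the frame-setting step under the same assumptions (illustrative names, a square frame, clipping at the image border rather than skipping border pixels), the detection frame 133 can be expressed as a window centered on the target pixel:

```python
def detection_frame(cx: int, cy: int, size_px: int, height: int, width: int):
    """Square detection frame of side size_px centered on (cx, cy),
    clipped to the image bounds; returns (top, bottom, left, right)."""
    half = size_px // 2
    top = max(cy - half, 0)
    left = max(cx - half, 0)
    bottom = min(cy + half + 1, height)
    right = min(cx + half + 1, width)
    return top, bottom, left, right
```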
  • the subject candidate area detection unit 113 determines whether the image in the detection frame 133 is a candidate for the subject to be detected. For example, a filter having the same size as the detection frame 133 is applied to the distance image 131, and the response value is used as the probability value of the subject candidate.
  • as the filter, a DoG (Difference of Gaussian) filter, a Laplacian filter, or the like can be applied.
  • whether the image in the detection frame 133 is a candidate for the subject to be detected can also be determined using the detection frame 133 and the distance information within it.
  • for example, when an actual human face is captured within the detection frame 133, the distance information within the frame varies with position. In contrast, when the detection frame 133 contains a human face shown in a photograph (poster), the distance information within the frame is essentially constant and shows no such variation. Sketches of these determinations are shown below.
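The following sketch illustrates the determination of step S104 under stated assumptions: a Difference-of-Gaussian response with its scale tied to the frame size serves as the candidate probability, and a simple spread check on the distances inside the frame rejects flat regions such as a face in a poster. The sigma choice, the flatness threshold, and the use of SciPy are assumptions, not details taken from the description.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def candidate_probability(distance_img, cx, cy, size_px, flat_threshold_cm=2.0):
    """Difference-of-Gaussian response at the target pixel, with the filter
    scale tied to the detection-frame size, used as the subject-candidate
    probability.  Returns 0 when the distances inside the frame are
    essentially constant (e.g. a face printed on a poster)."""
    img = distance_img.astype(np.float64)
    sigma = size_px / 4.0                          # inner scale follows the frame size
    dog = gaussian_filter(img, sigma) - gaussian_filter(img, 1.6 * sigma)

    half = size_px // 2
    patch = img[cy - half:cy + half + 1, cx - half:cx + half + 1]
    if patch.std() < flat_threshold_cm:            # no depth variation: flat surface
        return 0.0
    return float(abs(dog[cy, cx]))
```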
  • the determination result is output to a processing unit (not shown) at the subsequent stage of the detection apparatus 100. Only when it is determined that the subject to be detected is present within the detection frame 133 may the image within the detection frame 133 be cut out from the image 131 and the cut-out image output.
  • for example, if it is determined that there is a face in the detection frame 133-1, the image within the detection frame 133-1 is cut out from the image 131 and output. If it is determined that there is no face in the detection frame 133-2, that is, no subject to be detected, the image within the detection frame 133-2 is not cut out.
  • the cutout may be performed after the processing in step S105 is completed. Alternatively, the determination result of step S104, in this case the filter response value, may be held as the probability value of the subject candidate, and the cutout may be performed based on that probability value after the processing in step S105. It is also possible to output only the probability value to the subsequent stage.
  • the subject candidate area detection unit 113 thus functions as a determination unit that determines whether the image in the detection frame 133 is the subject to be detected.
  • an image that the subject candidate area detection unit 113 determines to be the subject to be detected can be cut out and output to a subsequent processing unit or the like.
  • step S105 it is determined whether or not such processing has been completed for all pixels in the image 131. If it is determined that the processing has not been completed for all pixels, the processing returns to step S101. Then, a new target pixel is set, and the processing after step S102 is performed on the set target pixel.
  • if it is determined in step S105 that such processing has been completed for all the pixels in the image 131, the recognition processing is terminated.
  • in this way, the probability value of the subject candidate is obtained for all the pixels in the distance image 131. The position of the maximum probability value is then taken as the center position of the detected subject, the pixel at that position is set as the target pixel 132, and the image within the detection frame 133 is cut out.
  • since the detection frame 133 is set around the target pixel 132 as described above, the target pixel 132 does not have to be set for every pixel in the image 131.
  • near a corner of the image 131, the detection frame 133 cannot be set properly: even if it is set, a part of the detection frame 133 (in this case, three quarters of it) falls outside the image. A pixel in such a region therefore does not have to be set as the target pixel 132.
  • the region near a side of the image 131 is likewise a region where the detection frame 133 cannot be fully set, so a pixel in such a region also does not have to be set as the target pixel 132.
  • the target pixel 132 may be sequentially set pixel by pixel, but may be set at a predetermined interval, for example, every five pixels.
  • a pixel in an area where the distance is determined to be far, in other words an area that can be determined to be background, also does not have to be set as the pixel of interest 132.
  • in this way, the number of target pixels 132 to be processed can be reduced, and the overall processing can be reduced. A sketch combining these steps is shown below.
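Putting steps S101 to S105 together, a minimal sketch of the scan might look as follows, assuming the helper functions from the earlier sketches, a distance image in centimeters, and illustrative stride and background thresholds.

```python
import numpy as np

def detect_subject(distance_img: np.ndarray, stride: int = 5,
                   background_cm: float = 400.0):
    """Scan target pixels (steps S101-S105): skip background and border pixels,
    score each remaining pixel, and crop the frame at the best-scoring center.
    Uses frame_size_px(), detection_frame(), and candidate_probability()
    from the sketches above."""
    h, w = distance_img.shape
    prob = np.zeros((h, w), dtype=np.float32)

    for cy in range(0, h, stride):
        for cx in range(0, w, stride):
            d = float(distance_img[cy, cx])
            if d >= background_cm:                     # skip far (background) pixels
                continue
            size = frame_size_px(d)
            half = size // 2
            if cx < half or cy < half or cx + half >= w or cy + half >= h:
                continue                               # frame would leave the image
            prob[cy, cx] = candidate_probability(distance_img, cx, cy, size)

    if prob.max() == 0.0:
        return None                                    # no candidate found
    cy, cx = np.unravel_index(np.argmax(prob), prob.shape)
    size = frame_size_px(float(distance_img[cy, cx]))
    top, bottom, left, right = detection_frame(cx, cy, size, h, w)
    return distance_img[top:bottom, left:right]        # cropped candidate region
```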
  • FIG. 5 shows an example of the image 131, and shows a case where the detection frames 133-1 to 133-4 are set as regions where there is a possibility that there is a detection target (human face).
  • the subject candidate area detection unit 113 determines that there is a face in the detection frame 133-1 set according to the distance of the subject by the subject feature extraction unit 112 (FIG. 1). The image within the detection frame 133-1 is cut out and output.
  • the subject candidate area detection unit 113 determines that there is no face in the detection frame 133-2 set according to the distance of the subject by the subject feature extraction unit 112 (FIG. 1). The image within the detection frame 133-2 is not cut out.
  • the subject candidate area detection unit 113 determines that there is a face within the detection frame 133-3 set according to the distance of the subject by the subject feature extraction unit 112 (FIG. 1). The image within the detection frame 133-3 is cut out and output.
  • in the detection frame 133-4, the face is one shown in a photograph or the like.
  • the subject candidate area detection unit 113 therefore determines that there is no face, and the image in the detection frame 133-4 is not cut out.
  • as described above, in the first embodiment, an object is detected using the distance and the size that the object to be detected would have at that distance.
  • for example, when the detection target is a human face, the processing can be reduced by performing detection using the present technology rather than by pattern matching or the like.
  • FIG. 6 is a diagram illustrating a configuration example of the detection device 200 according to the second embodiment.
  • the same portions are denoted by the same reference numerals, and description thereof is omitted.
  • the detection device 200 in the second embodiment is configured by adding an imaging unit 211 and a subject detail recognition unit 212 to the detection device 100 in the first embodiment.
  • the imaging unit 211 includes an imaging element such as a CCD or a CMOS image sensor, captures an image of ambient light (described as a normal image), and supplies the image to the subject detail recognition unit 212.
  • the detection result from the subject candidate area detection unit 113 is also supplied to the subject detail recognition unit 212.
  • the subject candidate region detection unit 113 cuts out and outputs a region determined to have a detection target, for example, a human face, using the distance image 131.
  • the subject detail recognition unit 212 performs more detailed recognition on the subject in the region supplied from the subject candidate region detection unit 113 using the normal image. For example, a recognition process for specifying an individual such as gender and age is performed.
  • Steps S201 to S205 are processes performed by the distance information acquisition unit 111 to the subject candidate area detection unit 113, and are performed in the same manner as steps S101 to S105 of the flowchart shown in FIG. 2.
  • the subject detail recognition unit 212 performs detail recognition using the subject candidate detection frame. For example, the subject detail recognition unit 212 sets the detection frame 133 supplied from the subject candidate region detection unit 113 as a corresponding region of the normal image from the imaging unit 211, and cuts out the image within the set detection frame 133. Then, using the extracted normal image, a preset recognition process such as a recognition process for specifying an individual such as the sex or age of the subject is executed.
  • the information supplied from the subject candidate area detection unit 113 to the subject detail recognition unit 212 can be information such as the size on the image based on the actual size of the subject (the detection frame 133), a representative point (for example, the target pixel 132), and a distribution map (for example, a heat map of filter response values).
  • the subject detail recognition unit 212 performs detail recognition using information supplied from the subject candidate region detection unit 113.
  • by performing such detection (recognition) processing, a detection result as shown in FIG. 8, for example, is obtained.
  • the upper and middle views of FIG. 8 are the same as those of FIG. 5. That is, by the detection process using the distance image 131, the detection frame 133-1 and the detection frame 133-3 are supplied to the subject detail recognition unit 212 as information on the regions where the subject was detected.
  • the subject detail recognition unit 212 performs recognition processing, using a method such as a DNN (deep neural network, that is, deep learning), on the images cut out from the normal image at the positions where the detection frame 133-1 and the detection frame 133-3 are set. A sketch of this step is shown below.
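A hedged sketch of this detail recognition: the frames found on the distance image are applied to the corresponding regions of the normal image, and each crop is passed to a classifier. The classifier is injected as a placeholder callable standing in for whatever DNN is actually used; it is not an API taken from this description.

```python
import numpy as np

def recognize_details(normal_img: np.ndarray, frames, estimate_gender_age):
    """For each detection frame (top, bottom, left, right) found on the
    distance image, crop the corresponding region of the normal image and
    run a detail recognizer (a DNN in the text; here an injected callable)."""
    results = []
    for top, bottom, left, right in frames:
        crop = normal_img[top:bottom, left:right]
        results.append(estimate_gender_age(crop))   # e.g. (gender, age) per subject
    return results
```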
  • in the second embodiment as well, the distance and the size of the object to be detected at that distance are used for detection, so the detection accuracy can be improved and the processing load related to detection can be reduced. Furthermore, in the second embodiment, since detailed recognition processing is executed using a normal image (an image other than a distance image), the subject can be detected in more detail and recognized.
  • FIG. 9 is a diagram illustrating a configuration example of the detection apparatus 300 according to the third embodiment.
  • the same portions are denoted by the same reference numerals, and description thereof is omitted.
  • the detection apparatus 300 according to the third embodiment differs from the detection apparatus 100 according to the first embodiment in that a subject direction detection unit 311 is added.
  • the subject direction detection unit 311 detects the direction in which the detected subject is facing.
  • the detection device 300 according to the third embodiment detects the position, size, and direction of the subject.
  • Steps S301 to S306 are processes performed by the distance information acquisition unit 111 to the subject candidate area detection unit 113, and are performed in the same manner as steps S101 to S105 in the flowchart shown in FIG. 2, so description thereof is omitted.
  • in step S305, the subject candidate region detection unit 113 supplies the region determined to contain the subject to be detected (the region set by the detection frame 133) and the image cut out from that region to the subject direction detection unit 311.
  • the subject direction detection unit 311 detects the direction of the detected subject.
  • here, direction detection will be described taking as an example a case where a screen as shown in FIG. 11 is acquired.
  • the detection target is described as being a hand.
  • the subject feature extraction unit 112 and the subject candidate region detection unit 113 execute the processes of steps S302 to S304, so that a detection frame 133 is set in the distance image 131 and a hand, the detection target, is detected within the detection frame 133.
  • the subject direction detection unit 311 divides the inside of the detection frame 133 into regions of a predetermined size, treats each divided region as a surface of the subject, and obtains the normal direction of each surface.
  • in the example of FIG. 11, the palm faces rightward in the figure. When the palm is directed to the right, the distance information of the palm gradually increases from the front toward the back.
  • as a result, normals pointing in the right direction in the figure are set, and from these normals it is determined that the palm is facing the right direction in the figure. A sketch of this normal estimation is shown below.
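The normal estimation can be sketched as follows, assuming the distance values form a regular pixel grid with x increasing to the right and depth increasing away from the camera. The function name is an assumption, and it can be applied either to one divided region or, as a simplification, to the whole frame.

```python
import numpy as np

def dominant_normal(depth_patch: np.ndarray) -> np.ndarray:
    """Average surface normal over a depth patch (one divided region, or the
    whole region inside the detection frame).  The depth gradient (dz/dx, dz/dy)
    gives a per-pixel normal (-dz/dx, -dz/dy, 1); averaging and normalizing
    yields the dominant facing direction (a palm turned to the right tilts the
    average normal toward +x under this convention)."""
    dz_dy, dz_dx = np.gradient(depth_patch.astype(np.float64))
    normals = np.stack(
        [-dz_dx, -dz_dy, np.ones_like(depth_patch, dtype=np.float64)], axis=-1)
    normals /= np.linalg.norm(normals, axis=-1, keepdims=True)
    mean = normals.reshape(-1, 3).mean(axis=0)
    return mean / np.linalg.norm(mean)
```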
  • in the third embodiment as well, since the object is detected using the distance and the size of the object at that distance, the detection accuracy can be improved. It is also possible to determine the direction of the detected subject.
  • FIG. 12 is a diagram illustrating a configuration example of the detection apparatus 400 according to the fourth embodiment.
  • the same portions are denoted by the same reference numerals, and description thereof is omitted.
  • the detection device 400 according to the fourth embodiment is configured by adding an imaging unit 411 and a subject detail recognition unit 412 to the detection device 300 according to the third embodiment.
  • the added imaging unit 411 and subject detail recognition unit 412 perform basically the same processing as the imaging unit 211 and subject detail recognition unit 212 (both in FIG. 6) of the detection apparatus 200 in the second embodiment.
  • the imaging unit 411 captures a normal image and supplies it to the subject detail recognition unit 412.
  • the subject detail recognition unit 412 is also supplied with the detection result from the subject direction detection unit 311.
  • the subject direction detection unit 311 outputs a detection target, for example, the position of a human face (position where the detection frame 133 is set), its size (size of the detection frame 133), and its direction.
  • the subject detail recognition unit 412 performs more detailed recognition on the subject in the area supplied from the subject direction detection unit 311 using a normal image. For example, a recognition process for specifying an individual such as gender and age is performed.
  • Steps S401 to S406 are processes performed by the distance information acquisition unit 111, the subject feature extraction unit 112, the subject candidate region detection unit 113, and the subject direction detection unit 311.
  • the subject detail recognition unit 412 performs detail recognition using the subject candidate detection frame and the direction of the subject.
  • the subject detail recognition unit 412 sets the detection frame 133 supplied from the subject direction detection unit 311 as a corresponding area of the normal image from the imaging unit 411, and cuts out the image in the set detection frame 133.
  • a preset recognition process such as a recognition process for specifying an individual such as the sex or age of the subject is executed. This recognition processing is performed in consideration of the direction of the subject.
  • FIG. 14 shows a diagram comparing the recognition method performed by the detection apparatus 400 according to the fourth embodiment with other recognition methods.
  • the left diagram in FIG. 14 illustrates an example of another recognition method. For example, when a face is to be detected from a normal image, first, assuming that the detected object is a face, a determination is made with reference to the front/rear/left/right determination dictionary 431 in order to decide whether the face is facing front or back, or left or right.
  • if the face is determined to be facing front or back, the front/rear determination dictionary 432 is referred to and it is determined whether the face is forward-facing or backward-facing. If it is determined to be forward-facing, the forward-facing dictionary 434 is referred to, and it is determined whether it is a human face and, if so, whether it is a forward-facing face. In this process, when data for identifying individuals is described in the forward-facing dictionary 434, a person is identified by matching against that data.
  • if the front/rear determination dictionary 432 is referred to and the face is determined to be backward-facing, the backward-facing dictionary 435 is referred to in order to determine whether it is a human face and, if so, whether it is a backward-facing face.
  • similarly, if the face is determined to be facing left or right, the left/right determination dictionary 433 is referred to and it is determined whether the face is facing left or right.
  • if it is determined to be facing left, the left-facing dictionary 436 is referred to, and it is determined whether it is a human face and, if so, whether it is a left-facing face.
  • if it is determined to be facing right, the right-facing dictionary 437 is referred to, and it is determined whether it is a human face and, if so, whether it is a right-facing face.
  • in this other recognition method, the recognition process is thus performed by referring to a plurality of dictionaries and making determinations in stages.
  • in contrast, in the fourth embodiment, recognition processing can be performed by preparing an X-direction dictionary 451 and referring to the X-direction dictionary 451.
  • the X-direction dictionary 451 is a dictionary including the forward-facing dictionary 434, the backward-facing dictionary 435, the left-facing dictionary 436, and the right-facing dictionary 437. Since the subject detail recognition unit 412 (FIG. 12) is also supplied with the direction of the subject, only the dictionary for the supplied direction needs to be referred to in the recognition process.
  • thus, the number of dictionaries referred to (the amount of data) can be reduced, and the multiple determination processes that would otherwise be performed with reference to the dictionaries can be omitted. Therefore, according to the detection apparatus 400 in the fourth embodiment, it is possible to reduce processing related to recognition processing. A sketch of this direction-conditioned lookup is shown below.
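A minimal sketch of this idea, with the dictionaries represented as placeholder callables keyed by direction (names are assumptions): because the direction is already known from the subject direction detection, only one dictionary is consulted, and no staged front/rear/left/right determinations are needed.

```python
from typing import Callable, Dict

def recognize_with_known_direction(face_crop,
                                   direction: str,
                                   dictionaries: Dict[str, Callable]):
    """Select the recognizer ("dictionary") matching the supplied direction,
    e.g. "forward", "backward", "left", or "right", and apply it to the crop."""
    recognizer = dictionaries[direction]
    return recognizer(face_crop)
```

In the staged method sketched on the left of FIG. 14, several determination dictionaries would be consulted before the matching one is reached; here the direction is an input, so the lookup is a single dictionary access.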
  • in addition, since only an image with a high possibility of containing the detection target, for example a face (a clipped image), is subjected to detailed recognition, the region to be processed in the image is narrowed, which also reduces processing related to the recognition processing.
  • in the fourth embodiment as well, since the object is detected using the distance and the size of the object at that distance, the detection accuracy can be improved.
  • furthermore, since detailed recognition processing is executed using a normal image (an image other than a distance image), the subject can be detected in more detail and recognized.
  • in addition, since the direction of the subject is acquired in advance, the recognition processing itself can be reduced.
  • the detection target is detected by estimating the size of the subject and estimating the category to which the subject belongs.
  • FIG. 15 is a diagram illustrating a configuration example of the detection apparatus 500 according to the fifth embodiment.
  • the detection apparatus 500 shown in FIG. 15 includes a distance information acquisition unit 111, a subject size estimation unit 511, and a subject category estimation unit 512.
  • the distance information acquisition unit 111 has a configuration similar to that of the distance information acquisition unit 111 included in the detection device 100, for example, and has a function of acquiring distance information for generating the distance image 131.
  • the subject size estimation unit 511 estimates the size of the subject and supplies the estimated size information to the subject category estimation unit 512.
  • the subject category estimation unit 512 estimates the category to which the subject belongs from the estimated size of the subject and the distance at which the subject is located.
  • that is, the detection apparatus 500 estimates the size of the subject and determines, from that size and the distance, the category to which the subject belongs, for example a human-face category or a car category.
  • in step S501, the subject size estimation unit 511 sets a target pixel. This process can be performed, for example, in the same manner as step S101 in the flowchart shown in FIG. 2.
  • next, the subject size estimation unit 511 acquires the distances around the set target pixel position.
  • the subject size estimation unit 511 then estimates the subject size based on the surrounding distance distribution. For example, since the distance of the object and that of the background differ greatly, a region where the distance changes greatly (that is, an edge) is detected with reference to the surrounding distance distribution, and the region where the object exists (the extent up to the edge portion) can thereby be estimated.
  • in step S504, the subject category estimation unit 512 estimates the subject category based on the distance and the subject size. As described above, since the category of an object can be estimated from its distance and its size, such estimation is performed in step S504.
  • a distance image 131 as shown in FIG. 17 is acquired.
  • a distance image 131 illustrated in FIG. 17 is an image in which a hand is captured.
  • the distance distribution around the target pixel 132 is referred to.
  • the distance is greatly different between the hand part and the background. That is, in this case, the part with the hand is a short distance, but the background is a long distance.
  • when the distance distribution around the pixel of interest 132 is examined in directions gradually moving away from the pixel of interest 132, there is a portion where the distance changes greatly.
  • in the example of FIG. 17, the target pixel 132 is set at a position approximately in the center of the palm, so when searching from the palm toward a fingertip, the distance information changes abruptly at the boundary from the tip of the finger to the background. In FIG. 17, the extent from the pixel of interest 132 to the position where the distance information changes abruptly is indicated by arrows. Note that the position where the distance information changes abruptly may be taken as the position where the difference between the distance of the target pixel 132 and the distance of the pixel being searched becomes equal to or greater than a predetermined threshold value.
  • a range where an object may exist is estimated from the target pixel 132.
  • a circle or a rectangle (not shown) having a radius from the target pixel 132 to the tip of the longest arrow is set, and the size of the circle or the rectangle is the subject size.
  • This subject size corresponds to the detection frame 133 in the first to fourth embodiments. In other words, the detection frame 133 is set by such processing.
  • the category of the detected subject is estimated from the distance of the target pixel 132 and the subject size (detection frame 133). In the example illustrated in FIG. 17, it is estimated that a subject of the detected size at the distance of the target pixel 132 belongs to the category "hand". A sketch of this size and category estimation is shown below.
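The following sketch illustrates such a search and category estimation under stated assumptions: the jump threshold, the pixel-to-centimeter conversion (a rough pinhole-style scaling chosen so a 25 cm face is about 20 px at 100 cm, matching the earlier example), and the category table are all illustrative placeholders, not values from the description.

```python
import numpy as np

# Illustrative category table: (maximum real-world size in cm, category name).
CATEGORIES = [(30.0, "hand"), (60.0, "face"), (200.0, "person"), (600.0, "car")]

def estimate_size_and_category(distance_img, cy, cx,
                               jump_cm=30.0, px_per_cm_at_1m=0.8):
    """Search outward from the target pixel until the distance jumps by more
    than jump_cm (the edge toward the background); the longest reach gives the
    subject radius in pixels, which is converted to centimeters using the
    distance and mapped to a coarse category."""
    h, w = distance_img.shape
    d0 = float(distance_img[cy, cx])
    radius_px = 0
    for dy, dx in [(-1, 0), (1, 0), (0, -1), (0, 1),
                   (-1, -1), (-1, 1), (1, -1), (1, 1)]:
        y, x, steps = cy, cx, 0
        while 0 <= y + dy < h and 0 <= x + dx < w:
            y, x = y + dy, x + dx
            if abs(float(distance_img[y, x]) - d0) >= jump_cm:
                break                                  # reached the edge
            steps += 1
        radius_px = max(radius_px, steps)

    # Apparent size shrinks roughly in proportion to distance (pinhole model).
    size_cm = 2 * radius_px * d0 / (px_per_cm_at_1m * 100.0)
    for max_cm, name in CATEGORIES:
        if size_cm <= max_cm:
            return size_cm, name
    return size_cm, "unknown"
```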
  • in step S505, it is determined whether or not such processing has been executed for all the pixels in the distance image 131. As in step S105 of the flowchart shown in FIG. 2, the pixels set as the target pixel 132 need not be all the pixels in the distance image 131; some may be excluded.
  • in this way, according to the detection apparatus 500 in the fifth embodiment, the size of the subject can be estimated from the distance image, and its category can be estimated.
  • a plurality of subjects (categories) can be estimated, and for example, different objects such as a person and a car can be detected.
  • FIG. 18 is a diagram illustrating a configuration example of a detection device 600 according to the sixth embodiment.
  • the same portions are denoted by the same reference numerals, and description thereof is omitted.
  • the detection device 600 in the sixth embodiment is configured by adding an imaging unit 611 and a subject detail recognition unit 612 to the detection device 500 in the fifth embodiment.
  • the added imaging unit 611 performs basically the same processing as the imaging unit 211 (FIG. 6) of the detection apparatus 200 in the second embodiment.
  • the imaging unit 611 captures a normal image and supplies it to the subject detail recognition unit 612.
  • the subject detail recognition unit 612 is also supplied with the distance, the subject size, and the subject category from the subject category estimation unit 512.
  • the subject detail recognition unit 612 performs more detailed recognition using the normal image, using the distance, subject size, and subject category supplied from the subject category estimation unit 512.
  • Steps S601 to S605 are processes performed by the distance information acquisition unit 111, the subject size estimation unit 511, and the subject category estimation unit 512, and are performed in the same manner as steps S501 to S505 of the flowchart described above, so description thereof is omitted.
  • the subject detail recognition unit 412 performs detail recognition using the subject candidate region and the subject category. For example, the subject detail recognition unit 412 sets a frame corresponding to the subject size supplied from the subject category estimation unit 512 as a corresponding region of the normal image from the imaging unit 411, and cuts out an image within the set frame.
  • recognition processing such as recognition processing for specifying an object belonging to the category is set in advance. Execute the process. For example, when the category is determined to be a person, matching with an image belonging to a person is performed. When an individual is specified, or when the category is determined to be a car, matching with an image belonging to a car is performed. Detailed recognition to identify the vehicle type is performed.
  • the size of the subject can be estimated from the distance image, and the category to which the subject belongs can be estimated. Further, according to the detection apparatus 600, a plurality of subjects (categories) can be estimated, and for example, different objects such as a person and a car can be detected. Furthermore, the detected object can be recognized in detail.
  • FIG. 20 is a diagram illustrating a configuration example of the detection apparatus 700 according to the seventh embodiment.
  • the same portions are denoted by the same reference numerals, and description thereof is omitted.
  • the detection device 700 in the seventh embodiment differs from the detection device 500 in the fifth embodiment in that a subject shape estimation unit 711 is added and the subject category estimation unit 712 receives the output of the subject shape estimation unit 711.
  • the subject shape estimation unit 711 estimates the shape of the subject. Referring again to FIG. 17, when the distance image 131 in which the hand is captured is acquired, the shape of the hand is obtained by searching from the target pixel 132 up to the positions where the distance information changes greatly, that is, up to the edge portions.
  • the subject category estimation unit 712 performs basically the same processing as the subject category estimation unit 512 of the detection apparatus 500 illustrated in FIG. 15, but the subject category estimation unit 712 illustrated in FIG. 20 also uses the estimated shape of the subject when estimating the category. The category can therefore be estimated with higher accuracy.
  • Steps S701 to S703 are processes performed by the distance information acquisition unit 111 and the subject size estimation unit 511, and are performed in the same manner as steps S501 to S503 of the flowchart of the fifth embodiment, so description thereof is omitted.
  • the subject shape estimation unit 711 then estimates the shape of the subject based on the distance distribution around the target pixel 132. As described with reference to FIG. 17, the shape is estimated by using the distance information to search for the portions (edges) where the distance changes greatly. In other words, regions where the distance changes only gradually are assumed to be part of the detected object, and the shape of the object is obtained by checking whether the distance is changing gently.
  • in step S705, the subject category estimation unit 712 estimates the category to which the subject belongs based on the distance, the subject size, and the shape. In this case, since the category is estimated using not only the distance and the subject size but also the shape information, the category can be estimated with higher accuracy. A sketch of the shape estimation step is shown below.
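A minimal sketch of such a shape estimation by region growing, assuming distances in centimeters and an illustrative continuity threshold; the resulting boolean mask stands in for the estimated subject shape, and its boundary corresponds to the edges where the distance jumps.

```python
import numpy as np
from collections import deque

def estimate_shape_mask(distance_img: np.ndarray, cy: int, cx: int,
                        step_cm: float = 3.0) -> np.ndarray:
    """Region growing from the target pixel: a neighbor joins the subject
    region while its distance differs from the current pixel's by less than
    step_cm (distance changing gently); the region boundary is the edge
    where the distance changes abruptly, i.e. the subject shape."""
    h, w = distance_img.shape
    mask = np.zeros((h, w), dtype=bool)
    mask[cy, cx] = True
    queue = deque([(cy, cx)])
    while queue:
        y, x = queue.popleft()
        for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
            if 0 <= ny < h and 0 <= nx < w and not mask[ny, nx]:
                if abs(float(distance_img[ny, nx]) - float(distance_img[y, x])) < step_cm:
                    mask[ny, nx] = True
                    queue.append((ny, nx))
    return mask
```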
  • according to the detection apparatus 700 in the seventh embodiment, it is possible to estimate the size of the subject, estimate its category, and estimate its shape from the distance image. Further, according to the detection device 700, a plurality of subjects (categories) can be estimated, and, for example, different objects such as a person and a car can be detected.
  • category estimation by the subject category estimation unit 712 may be omitted, and the subject shape estimation result by the subject shape estimation unit 711 may be output to a subsequent processing unit (not shown).
  • FIG. 22 is a diagram illustrating a configuration example of the detection apparatus 800 according to the eighth embodiment.
  • the same portions are denoted by the same reference numerals, and description thereof is omitted.
  • the detection apparatus 800 according to the eighth embodiment is configured by adding an imaging unit 811 and a subject detail recognition unit 812 to the detection apparatus 700 according to the seventh embodiment.
  • the added imaging unit 811 performs basically the same processing as the imaging unit 211 (FIG. 6) of the detection apparatus 200 in the second embodiment.
  • the imaging unit 811 captures a normal image and supplies it to the subject detail recognition unit 812.
  • the subject detail recognition unit 812 is also supplied with the distance, subject size, subject category, and subject shape from the subject category estimation unit 712.
  • the subject detail recognition unit 812 performs more detailed recognition using the normal image using the distance, subject size, subject category, and subject shape supplied from the subject category estimation unit 712.
  • Steps S801 to S806 are processes performed by the distance information acquisition unit 111, the subject size estimation unit 511, the subject shape estimation unit 711, and the subject category estimation unit 712, and are performed in the same manner as steps S701 to S706 of the flowchart of the seventh embodiment, so description thereof is omitted.
  • the subject detail recognition unit 812 performs detail recognition using the subject candidate area, the subject category, and the subject shape. For example, the subject detail recognition unit 812 sets a frame corresponding to the subject size supplied from the subject category estimation unit 712 as the corresponding region of the normal image from the imaging unit 811, and cuts out the image within the set frame.
  • then, using the cut-out image, a preset recognition process is executed, for example a recognition process for identifying, from among the objects belonging to the category, an object that matches the subject shape. For example, when the category is determined to be a person, matching against images of people is performed; in this matching, the subject shape is referred to, and when the shape is close to, for example, a human face, the recognition is narrowed down to human faces and an individual is then identified after the narrowing down.
  • according to the detection apparatus 800 in the eighth embodiment, it is possible to estimate the size of the subject, estimate its category, and estimate its shape from the distance image. Further, the detection apparatus 800 can estimate a plurality of subjects (categories) and can detect different objects such as a person and a car, for example.
  • the detected object can be recognized in detail. Since the detailed recognition can be performed using information such as the estimated size, category, and shape of the subject, the processing related to the detailed recognition can be reduced.
  • an object can be detected from a distance image.
  • the present technology can be applied to a surveillance camera or the like.
  • the present technology can be applied to a game machine, and can be applied to a device that detects a person who plays a game and detects a gesture of the person (detects a hand, a direction of the hand, and the like).
  • the present technology can also be applied as part of a device in which the detection device is mounted on a car, detects people, bicycles, and cars other than the own car, notifies the user of information on the detected objects, and performs control for avoiding collisions and ensuring safety.
  • a stacked image sensor in which a plurality of substrates (dies) are stacked can be employed.
  • here, the case where the detection device is configured by a stacked image sensor will be described, taking the detection device 200 (FIG. 6) of the second embodiment as an example.
  • FIG. 24 is a diagram illustrating a first configuration example of a stacked image sensor in which the entire detection device 200 of FIG. 6 is incorporated.
  • the stacked image sensor of FIG. 24 has a two-layer structure in which a pixel substrate 901 and a signal processing substrate 902 are stacked.
  • on the pixel substrate 901, (part of) the distance information acquisition unit 111 and (part of) the imaging unit 211 are formed.
  • when the distance information acquisition unit 111 obtains distance information by the TOF method, it includes an irradiation unit that irradiates the subject with predetermined light and an imaging element that receives the reflected light.
  • of the distance information acquisition unit 111, the imaging element part, or parts such as the irradiation unit, can be formed on the pixel substrate 901.
  • the imaging unit 211 also includes an image sensor for capturing a normal image, and the image sensor part of the imaging unit 211 can be formed on the pixel substrate 901.
  • on the signal processing substrate 902, the subject feature extraction unit 112, the subject candidate region detection unit 113, the actual size database 114, and the subject detail recognition unit 212 are formed.
  • in the stacked image sensor configured in this way, the distance information acquisition unit 111 on the pixel substrate 901 performs imaging by receiving incident light, and an object to be detected is detected from the image (distance image) obtained by that imaging.
  • likewise, the imaging unit 211 on the pixel substrate 901 performs imaging by receiving incident light, and the image of the subject set as the detection target, or the like, is cut out from the image (normal image) obtained by that imaging and output.
  • FIG. 25 is a diagram illustrating a second configuration example of the stacked image sensor in which the entire detection device 200 of FIG. 6 is incorporated.
  • the stacked image sensor in FIG. 25 has a three-layer structure in which a pixel substrate 901, a signal processing substrate 902, and a memory substrate 903 are stacked.
  • a distance information acquisition unit 111 and an imaging unit 211 are formed on the pixel substrate 901, and a subject feature extraction unit 112, a subject candidate region detection unit 113, and a subject detail recognition unit 212 are formed on the signal processing substrate 902. .
  • a real size database 114 and an image storage unit 911 are formed on the memory substrate 903.
  • on the memory substrate 903, the image storage unit 911 is formed as a storage region for storing the detection result of the subject candidate region detection unit 113, for example an image cut out from the distance image in which the subject to be detected is captured.
  • An actual size database 114 storing the table 151 (FIG. 4) is also formed on the memory substrate 903.
  • in FIG. 25, the pixel substrate 901, the signal processing substrate 902, and the memory substrate 903 are stacked in that order from the top. However, the order of the signal processing substrate 902 and the memory substrate 903 may be changed so that, for example, the pixel substrate 901, the memory substrate 903, and the signal processing substrate 902 are stacked in this order.
  • the stacked image sensor can also be configured by stacking four or more layers of substrates, in addition to two or three layers.
  • a series of processes performed by each of the detection devices 100 to 800 can be performed by hardware or software.
  • a program constituting the software is installed in a general-purpose computer or the like.
  • FIG. 26 is a block diagram illustrating a configuration example of an embodiment of a computer in which a program for executing the above-described series of processes is installed.
  • the program can be recorded in advance in a hard disk 1005 or ROM 1003 as a recording medium built in the computer.
  • the program can be stored (recorded) in a removable recording medium 1011.
  • a removable recording medium 1011 can be provided as so-called package software.
  • examples of the removable recording medium 1011 include a flexible disk, a CD-ROM (Compact Disc Read Only Memory), a MO (Magneto Optical) disc, a DVD (Digital Versatile Disc), a magnetic disc, and a semiconductor memory.
  • the program can be installed on the computer from the removable recording medium 1011 as described above, or can be downloaded to the computer via a communication network or a broadcast network and installed on the built-in hard disk 1005. That is, the program can be transferred wirelessly from a download site to the computer via an artificial satellite for digital satellite broadcasting, or transferred to the computer by wire via a network such as a LAN (Local Area Network) or the Internet.
  • the computer includes a CPU (Central Processing Unit) 1002, and an input / output interface 1010 is connected to the CPU 1002 via a bus 1001.
  • the CPU 1002 executes a program stored in the ROM (Read Only Memory) 1003 in accordance with a command input via the input/output interface 1010.
  • alternatively, the CPU 1002 loads a program stored in the hard disk 1005 into the RAM (Random Access Memory) 1004 and executes it.
  • as a result, the CPU 1002 performs the processes according to the flowcharts described above or the processes performed by the configurations of the block diagrams described above. Then, as necessary, the CPU 1002 outputs the processing result from the output unit 1006 via the input/output interface 1010, transmits it from the communication unit 1008, or records it on the hard disk 1005, for example.
  • the input unit 1007 includes a keyboard, a mouse, a microphone, and the like.
  • the output unit 1006 includes an LCD (Liquid Crystal Display), a speaker, and the like.
  • the processing performed by the computer according to the program does not necessarily have to be performed in chronological order in the order described as the flowchart. That is, the processing performed by the computer according to the program includes processing executed in parallel or individually (for example, parallel processing or object processing).
  • the program may be processed by one computer (processor), or may be distributedly processed by a plurality of computers. Furthermore, the program may be transferred to a remote computer and executed.
  • in this specification, the system means a set of a plurality of components (devices, modules (parts), and the like), and it does not matter whether all the components are in the same housing. Therefore, a plurality of devices housed in separate housings and connected via a network, and a single device in which a plurality of modules are housed in one housing, are both systems.
  • the configuration examples of the detection devices 100 to 800 described above can be combined within a possible range.
  • the present technology can take a configuration of cloud computing in which one function is shared by a plurality of devices via a network and is jointly processed.
  • each step described in the above flowchart can be executed by one device or can be shared by a plurality of devices.
  • the plurality of processes included in the one step can be executed by being shared by a plurality of apparatuses in addition to being executed by one apparatus.
  • the technology according to the present disclosure can be applied to various products.
  • the technology according to the present disclosure may be realized as a device mounted on any type of moving body such as an automobile, an electric vehicle, a hybrid electric vehicle, a motorcycle, a bicycle, a personal mobility device, an airplane, a drone, a ship, a robot, a construction machine, or an agricultural machine (tractor).
  • FIG. 27 is a block diagram illustrating a schematic configuration example of a vehicle control system 7000 that is an example of a mobile control system to which the technology according to the present disclosure can be applied.
  • the vehicle control system 7000 includes a plurality of electronic control units connected via a communication network 7010.
  • the vehicle control system 7000 includes a drive system control unit 7100, a body system control unit 7200, a battery control unit 7300, a vehicle exterior information detection unit 7400, a vehicle interior information detection unit 7500, and an integrated control unit 7600.
  • the communication network 7010 connecting the plurality of control units may be an in-vehicle communication network conforming to an arbitrary standard such as CAN (Controller Area Network), LIN (Local Interconnect Network), LAN (Local Area Network), or FlexRay (registered trademark).
  • each control unit includes a microcomputer that performs arithmetic processing according to various programs, a storage unit that stores the programs executed by the microcomputer and the parameters used for various calculations, and a drive circuit that drives the various devices to be controlled.
  • each control unit also includes a network I/F for communicating with other control units via the communication network 7010, and a communication I/F for communicating with devices or sensors inside and outside the vehicle by wired or wireless communication. In FIG. 27, as the functional configuration of the integrated control unit 7600, a microcomputer 7610, a general-purpose communication I/F 7620, a dedicated communication I/F 7630, a positioning unit 7640, a beacon receiving unit 7650, an in-vehicle device I/F 7660, an audio image output unit 7670, an in-vehicle network I/F 7680, and a storage unit 7690 are illustrated.
  • other control units include a microcomputer, a communication I / F, a storage unit, and the like.
  • the drive system control unit 7100 controls the operation of the device related to the drive system of the vehicle according to various programs.
  • the drive system control unit 7100 functions as a control device for a driving force generation device for generating the driving force of the vehicle, such as an internal combustion engine or a driving motor, a driving force transmission mechanism for transmitting the driving force to the wheels, a steering mechanism for adjusting the steering angle of the vehicle, a braking device for generating the braking force of the vehicle, and the like.
  • the drive system control unit 7100 may have a function as a control device such as ABS (Antilock Brake System) or ESC (Electronic Stability Control).
  • a vehicle state detection unit 7110 is connected to the drive system control unit 7100.
  • the vehicle state detection unit 7110 includes, for example, at least one of a gyro sensor that detects the angular velocity of the rotational movement of the vehicle body, an acceleration sensor that detects the acceleration of the vehicle, and sensors for detecting the operation amount of the accelerator pedal, the operation amount of the brake pedal, the steering angle of the steering wheel, the engine speed, the rotational speed of the wheels, and the like.
  • the drive system control unit 7100 performs arithmetic processing using a signal input from the vehicle state detection unit 7110, and controls an internal combustion engine, a drive motor, an electric power steering device, a brake device, or the like.
  • the body system control unit 7200 controls the operation of various devices mounted on the vehicle body according to various programs.
  • the body system control unit 7200 functions as a keyless entry system, a smart key system, a power window device, or a control device for various lamps such as a headlamp, a back lamp, a brake lamp, a blinker, or a fog lamp.
  • radio waves transmitted from a portable device that substitutes for a key, or signals of various switches, can be input to the body system control unit 7200.
  • the body system control unit 7200 receives input of these radio waves or signals, and controls a door lock device, a power window device, a lamp, and the like of the vehicle.
  • the battery control unit 7300 controls the secondary battery 7310 that is a power supply source of the drive motor according to various programs. For example, information such as battery temperature, battery output voltage, or remaining battery capacity is input to the battery control unit 7300 from a battery device including the secondary battery 7310. The battery control unit 7300 performs arithmetic processing using these signals, and controls the temperature adjustment of the secondary battery 7310 or the cooling device provided in the battery device.
  • the outside information detection unit 7400 detects information outside the vehicle on which the vehicle control system 7000 is mounted.
  • the outside information detection unit 7400 is connected to at least one of the imaging unit 7410 and the outside information detection unit 7420.
  • the imaging unit 7410 includes at least one of a ToF (Time Of Flight) camera, a stereo camera, a monocular camera, an infrared camera, and other cameras.
  • the vehicle exterior information detection unit 7420 includes, for example, at least one of an environmental sensor for detecting the current weather or meteorological conditions and a surrounding information detection sensor for detecting other vehicles, obstacles, pedestrians, and the like around the vehicle equipped with the vehicle control system 7000.
  • the environmental sensor may be, for example, at least one of a raindrop sensor that detects rainy weather, a fog sensor that detects fog, a sunshine sensor that detects sunlight intensity, and a snow sensor that detects snowfall.
  • the ambient information detection sensor may be at least one of an ultrasonic sensor, a radar device, and a LIDAR (Light Detection and Ranging, Laser Imaging Detection and Ranging) device.
  • the imaging unit 7410 and the outside information detection unit 7420 may be provided as independent sensors or devices, or may be provided as a device in which a plurality of sensors or devices are integrated.
  • FIG. 28 shows an example of installation positions of the imaging unit 7410 and the vehicle outside information detection unit 7420.
  • the imaging units 7910, 7912, 7914, 7916, and 7918 are provided at, for example, at least one of the front nose, the side mirror, the rear bumper, the back door, and the upper part of the windshield in the vehicle interior of the vehicle 7900.
  • An imaging unit 7910 provided in the front nose and an imaging unit 7918 provided in the upper part of the windshield in the vehicle interior mainly acquire an image in front of the vehicle 7900.
  • Imaging units 7912 and 7914 provided in the side mirror mainly acquire an image of the side of the vehicle 7900.
  • An imaging unit 7916 provided in the rear bumper or the back door mainly acquires an image behind the vehicle 7900.
  • the imaging unit 7918 provided on the upper part of the windshield in the passenger compartment is mainly used for detecting a preceding vehicle, a pedestrian, an obstacle, a traffic light, a traffic sign, a lane, or the like.
  • FIG. 28 shows an example of the shooting range of each of the imaging units 7910, 7912, 7914, and 7916.
  • In FIG. 28, the imaging range a indicates the imaging range of the imaging unit 7910 provided on the front nose, the imaging ranges b and c indicate the imaging ranges of the imaging units 7912 and 7914 provided on the side mirrors, respectively, and the imaging range d indicates the imaging range of the imaging unit 7916 provided on the rear bumper or the back door. For example, by superimposing the image data captured by the imaging units 7910, 7912, 7914, and 7916, an overhead image of the vehicle 7900 viewed from above is obtained.
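  • Purely as an illustrative sketch (not part of this disclosure), the following Python code shows one possible way such an overhead image could be composed by warping each camera image onto a common ground plane; the homography matrices, the canvas size, and the simple overwrite compositing are assumptions made only for illustration.

```python
import cv2
import numpy as np

def compose_overhead_view(images, homographies, canvas_size=(800, 800)):
    """Warp each camera image onto a common ground plane and superimpose them.

    images: frames from the imaging units (e.g. front, left, right, rear cameras).
    homographies: 3x3 matrices mapping each image plane to the top-down canvas;
    obtaining them (e.g. from extrinsic calibration) is assumed and not shown.
    """
    canvas = np.zeros((canvas_size[1], canvas_size[0], 3), dtype=np.uint8)
    for img, H in zip(images, homographies):
        warped = cv2.warpPerspective(img, H, canvas_size)
        mask = warped.any(axis=2)       # pixels actually covered by this camera
        canvas[mask] = warped[mask]     # simple overwrite; blending is also possible
    return canvas
```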
  • the vehicle outside information detection units 7920, 7922, 7924, 7926, 7928, and 7930 provided on the front, rear, sides, corners of the vehicle 7900 and the upper part of the windshield in the vehicle interior may be, for example, an ultrasonic sensor or a radar device.
  • the vehicle outside information detection units 7920, 7926, and 7930 provided on the front nose, the rear bumper, the back door, and the windshield in the vehicle interior of the vehicle 7900 may be, for example, LIDAR devices.
  • These outside information detection units 7920 to 7930 are mainly used for detecting a preceding vehicle, a pedestrian, an obstacle, and the like.
  • the vehicle exterior information detection unit 7400 causes the imaging unit 7410 to capture an image outside the vehicle and receives the captured image data. The vehicle exterior information detection unit 7400 also receives detection information from the connected vehicle exterior information detection unit 7420. When the vehicle exterior information detection unit 7420 is an ultrasonic sensor, a radar device, or a LIDAR device, the vehicle exterior information detection unit 7400 transmits ultrasonic waves, electromagnetic waves, or the like, and receives information on the received reflected waves.
  • the vehicle exterior information detection unit 7400 may perform object detection processing or distance detection processing for persons, cars, obstacles, signs, characters on the road surface, and the like based on the received information.
  • the vehicle exterior information detection unit 7400 may perform environment recognition processing for recognizing rainfall, fog, road surface conditions, or the like based on the received information.
  • the vehicle outside information detection unit 7400 may calculate a distance to an object outside the vehicle based on the received information.
  • the outside information detection unit 7400 may perform image recognition processing or distance detection processing for recognizing a person, a car, an obstacle, a sign, a character on a road surface, or the like based on the received image data.
  • the vehicle exterior information detection unit 7400 may perform processing such as distortion correction or alignment on the received image data, and may combine image data captured by different imaging units 7410 to generate an overhead image or a panoramic image.
  • the vehicle exterior information detection unit 7400 may perform viewpoint conversion processing using image data captured by different imaging units 7410.
  • the vehicle interior information detection unit 7500 detects vehicle interior information.
  • a driver state detection unit 7510 that detects the driver's state is connected to the in-vehicle information detection unit 7500.
  • Driver state detection unit 7510 may include a camera that captures an image of the driver, a biosensor that detects biometric information of the driver, a microphone that collects sound in the passenger compartment, and the like.
  • the biometric sensor is provided, for example, on a seat surface or a steering wheel, and detects biometric information of an occupant sitting on the seat or a driver holding the steering wheel.
  • the vehicle interior information detection unit 7500 may calculate the degree of fatigue or the degree of concentration of the driver based on the detection information input from the driver state detection unit 7510, and may determine whether the driver is dozing off.
  • the vehicle interior information detection unit 7500 may perform a process such as a noise canceling process on the collected audio signal.
  • the integrated control unit 7600 controls the overall operation in the vehicle control system 7000 according to various programs.
  • An input unit 7800 is connected to the integrated control unit 7600.
  • the input unit 7800 is realized by a device that can be input by a passenger, such as a touch panel, a button, a microphone, a switch, or a lever.
  • data obtained by recognizing voice input through the microphone may be input to the integrated control unit 7600.
  • the input unit 7800 may be, for example, a remote control device using infrared rays or other radio waves, or an externally connected device such as a mobile phone or a PDA (Personal Digital Assistant) that supports the operation of the vehicle control system 7000.
  • the input unit 7800 may be, for example, a camera.
  • In that case, the passenger can input information by gesture.
  • Alternatively, data obtained by detecting the movement of a wearable device worn by the passenger may be input.
  • the input unit 7800 may include, for example, an input control circuit that generates an input signal based on information input by a passenger or the like using the input unit 7800 and outputs the input signal to the integrated control unit 7600.
  • a passenger or the like operates the input unit 7800 to input various data or instruct a processing operation to the vehicle control system 7000.
  • the storage unit 7690 may include a ROM (Read Only Memory) that stores various programs executed by the microcomputer, and a RAM (Random Access Memory) that stores various parameters, calculation results, sensor values, and the like.
  • the storage unit 7690 may be realized by a magnetic storage device such as an HDD (Hard Disc Drive), a semiconductor storage device, an optical storage device, a magneto-optical storage device, or the like.
  • General-purpose communication I / F 7620 is a general-purpose communication I / F that mediates communication with various devices existing in the external environment 7750.
  • the general-purpose communication I / F 7620 may implement a cellular communication protocol such as GSM (Global System for Mobile communications), WiMAX, LTE (Long Term Evolution), or LTE-A (LTE-Advanced), or another wireless communication protocol such as wireless LAN (also referred to as Wi-Fi (registered trademark)) or Bluetooth (registered trademark).
  • the general-purpose communication I / F 7620 is connected to a device (for example, an application server or a control server) existing on an external network (for example, the Internet, a cloud network, or an operator-specific network) via, for example, a base station or an access point.
  • the general-purpose communication I / F 7620 may connect to a terminal existing in the vicinity of the vehicle (for example, a terminal of a driver, a pedestrian, or a store, or an MTC (Machine Type Communication) terminal) using, for example, P2P (Peer To Peer) technology.
  • a terminal for example, a driver, a pedestrian or a store terminal, or an MTC (Machine Type Communication) terminal
  • P2P Peer To Peer
  • the dedicated communication I / F 7630 is a communication I / F that supports a communication protocol formulated for use in vehicles.
  • the dedicated communication I / F 7630 may implement a standard protocol such as WAVE (Wireless Access in Vehicle Environment), which is a combination of IEEE 802.11p for the lower layer and IEEE 1609 for the upper layer, DSRC (Dedicated Short Range Communications), or a cellular communication protocol.
  • the dedicated communication I / F 7630 typically performs V2X communication, which is a concept including one or more of vehicle-to-vehicle communication, vehicle-to-infrastructure communication, vehicle-to-home communication, and vehicle-to-pedestrian communication.
  • the positioning unit 7640 receives, for example, a GNSS signal from a GNSS (Global Navigation Satellite System) satellite (for example, a GPS signal from a GPS (Global Positioning System) satellite), performs positioning, and generates position information including the latitude, longitude, and altitude of the vehicle.
  • the positioning unit 7640 may specify the current position by exchanging signals with the wireless access point, or may acquire position information from a terminal such as a mobile phone, PHS, or smartphone having a positioning function.
  • the beacon receiving unit 7650 receives, for example, radio waves or electromagnetic waves transmitted from a radio station installed on the road, and acquires information such as the current position, traffic jam, closed road, or required time. Note that the function of the beacon receiving unit 7650 may be included in the dedicated communication I / F 7630 described above.
  • the in-vehicle device I / F 7660 is a communication interface that mediates the connection between the microcomputer 7610 and various in-vehicle devices 7760 present in the vehicle.
  • the in-vehicle device I / F 7660 may establish a wireless connection using a wireless communication protocol such as a wireless LAN, Bluetooth (registered trademark), NFC (Near Field Communication), or WUSB (Wireless USB).
  • the in-vehicle device I / F 7660 may establish a wired connection such as USB (Universal Serial Bus), HDMI (High-Definition Multimedia Interface), or MHL (Mobile High-definition Link) via a connection terminal (and a cable if necessary).
  • the in-vehicle device 7760 may include, for example, at least one of a mobile device or a wearable device that a passenger has, or an information device that is carried into or attached to the vehicle.
  • In-vehicle device 7760 may include a navigation device that searches for a route to an arbitrary destination.
  • In-vehicle device I / F 7660 exchanges control signals or data signals with these in-vehicle devices 7760.
  • the in-vehicle network I / F 7680 is an interface that mediates communication between the microcomputer 7610 and the communication network 7010.
  • the in-vehicle network I / F 7680 transmits and receives signals and the like in accordance with a predetermined protocol supported by the communication network 7010.
  • the microcomputer 7610 of the integrated control unit 7600 controls the vehicle control system 7000 according to various programs based on information acquired via at least one of the general-purpose communication I / F 7620, the dedicated communication I / F 7630, the positioning unit 7640, the beacon receiving unit 7650, the in-vehicle device I / F 7660, and the in-vehicle network I / F 7680. For example, the microcomputer 7610 may calculate a control target value of the driving force generating device, the steering mechanism, or the braking device based on the acquired information on the inside and outside of the vehicle, and may output a control command to the drive system control unit 7100.
  • the microcomputer 7610 may perform cooperative control for the purpose of realizing ADAS (Advanced Driver Assistance System) functions including collision avoidance or impact mitigation of the vehicle, following traveling based on the inter-vehicle distance, vehicle-speed-maintaining traveling, collision warning of the vehicle, lane departure warning of the vehicle, and the like. The microcomputer 7610 may also perform cooperative control for the purpose of automatic driving, in which the vehicle travels autonomously without depending on the driver's operation, by controlling the driving force generating device, the steering mechanism, the braking device, or the like based on the acquired information on the surroundings of the vehicle.
  • the microcomputer 7610 may generate three-dimensional distance information between the vehicle and objects such as surrounding structures and persons based on information acquired via at least one of the general-purpose communication I / F 7620, the dedicated communication I / F 7630, the positioning unit 7640, the beacon receiving unit 7650, the in-vehicle device I / F 7660, and the in-vehicle network I / F 7680, and may create local map information including peripheral information on the current position of the vehicle.
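  • Purely as an illustrative sketch (not part of this disclosure), the following shows one way per-pixel distance information could be back-projected into three-dimensional points and accumulated into a simple local grid map; the pinhole intrinsics, the cell size, and the range thresholds are assumed values for illustration.

```python
import numpy as np

def depth_to_local_map(depth, fx, fy, cx, cy, cell=0.1, extent=20.0):
    """Back-project a depth image into 3-D points and mark them in a 2-D grid map.

    depth: HxW array of distances in metres (0 = no measurement).
    fx, fy, cx, cy: pinhole intrinsics of the range sensor (assumed known).
    cell, extent: grid resolution and half-width in metres (assumed values).
    """
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth
    x = (u - cx) * z / fx                                   # lateral offset from the sensor
    n = int(2 * extent / cell)
    grid = np.zeros((n, n), dtype=np.uint8)
    valid = (z > 0) & (np.abs(x) < extent) & (z < 2 * extent)
    gx = np.clip(((x[valid] + extent) / cell).astype(int), 0, n - 1)
    gz = np.clip((z[valid] / cell).astype(int), 0, n - 1)
    grid[gz, gx] = 1                                        # mark occupied cells
    return grid
```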
  • the microcomputer 7610 may generate a warning signal by predicting a danger such as a collision of a vehicle, approach of a pedestrian or the like or an approach to a closed road based on the acquired information.
  • the warning signal may be, for example, a signal for generating a warning sound or lighting a warning lamp.
  • the audio image output unit 7670 transmits an output signal of at least one of audio and image to an output device capable of visually or audibly notifying information to a vehicle occupant or the outside of the vehicle.
  • an audio speaker 7710, a display unit 7720, and an instrument panel 7730 are illustrated as output devices.
  • Display unit 7720 may include at least one of an on-board display and a head-up display, for example.
  • the display portion 7720 may have an AR (Augmented Reality) display function.
  • the output device may be other devices such as headphones, wearable devices such as glasses-type displays worn by passengers, projectors, and lamps.
  • When the output device is a display device, the display device visually displays the results obtained by the various processes performed by the microcomputer 7610 or the information received from other control units in various formats such as text, images, tables, and graphs. When the output device is an audio output device, it converts an audio signal composed of reproduced audio data, acoustic data, or the like into an analog signal and outputs it audibly.
  • At least two control units connected via the communication network 7010 may be integrated as one control unit.
  • each control unit may be configured by a plurality of control units.
  • the vehicle control system 7000 may include another control unit not shown.
  • some or all of the functions of any of the control units may be given to other control units. That is, as long as information is transmitted and received via the communication network 7010, the predetermined arithmetic processing may be performed by any one of the control units.
  • a sensor or device connected to one of the control units may be connected to another control unit, and a plurality of control units may transmit and receive detection information to and from each other via the communication network 7010.
  • a computer program for realizing each function of the detection devices 100 to 800 according to the present embodiment described with reference to FIGS. 1, 6, 9, 12, 15, 18, 20, and 22 can be implemented in any of the control units or the like. It is also possible to provide a computer-readable recording medium in which such a computer program is stored.
  • the recording medium is, for example, a magnetic disk, an optical disk, a magneto-optical disk, a flash memory, or the like. Further, the above computer program may be distributed via a network, for example, without using a recording medium.
  • the detection apparatuses 100 to 800 according to the present embodiment described with reference to FIGS. 1, 6, 9, 12, 15, 18, 20, and 22 can be applied to the integrated control unit 7600 of the application example illustrated in FIG. 27.
  • the components of the detection devices 100 to 800 according to the present embodiment described with reference to FIGS. 1, 6, 9, 12, 15, 18, 20, and 22 may be realized in a module for the integrated control unit 7600 illustrated in FIG. 27 (for example, an integrated circuit module composed of one die).
  • alternatively, the detection devices 100 to 800 according to the present embodiment described with reference to FIGS. 1, 6, 9, 12, 15, 18, 20, and 22 may be realized by a plurality of control units of the vehicle control system 7000 illustrated in FIG. 27.
  • In addition, the present technology can also take the following configurations.
  • (1) A detection device including: an acquisition unit that acquires distance information regarding a distance to a subject; a setting unit that sets, from the distance information and a feature amount of an object to be detected, a region in which the object may be imaged; and a determination unit that determines whether an image in the region is the object.
  • (2) The detection device according to (1), in which the feature amount of the object is the size of the object at a predetermined distance, the setting unit sets a frame corresponding to the size of the object according to the distance at the pixel set as a processing target, and the determination unit determines whether an image in the frame is the object.
  • (4) The detection device according to (3), further including: an imaging unit that captures an image of ambient light; and a recognition unit that performs detailed recognition on the object using at least one of the image captured by the imaging unit, the size of the object set by the setting unit, an image in a region determined to be the object by the determination unit, and the direction of the object detected by the direction detection unit.
  • (5) A detection device including: an acquisition unit that acquires distance information regarding a distance to a subject; a setting unit that sets, using the distance information, a region in which a predetermined object may be imaged; and an estimation unit that estimates a category to which the object belongs from the size of the region and the distance information.
  • (6) The detection device according to (5), in which the setting unit sets, as a region in which the object may be imaged, a region extending up to a portion where the distance information changes, and the estimation unit takes, as the category to which the object belongs, a category to which an object having the size of the region belongs at the distance represented by the distance information in the region.
  • (7) The detection device according to (5) or (6), further including a shape estimation unit that estimates the shape of the object in the region set by the setting unit using the distance information.
  • (8) The detection device according to (7), in which the estimation unit estimates the category using at least one of the distance information, the size of the region, and the shape.
  • (9) The detection device according to (7), further including: an imaging unit that captures an image of ambient light; and a recognition unit that performs detailed recognition on the object using at least one of the image captured by the imaging unit, the size of the region set by the setting unit, the category estimated by the estimation unit, and the shape estimated by the shape estimation unit.
  • (10) The detection device according to any one of (1) to (9), in which the acquisition unit acquires the distance information using a TOF type sensor, a stereo camera, an ultrasonic sensor, or a millimeter wave radar.
  • (11) A detection method including steps of: acquiring distance information regarding a distance to a subject; setting, from the distance information and a feature amount of an object to be detected, a region in which the object may be imaged; and determining whether an image in the region is the object.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Geometry (AREA)
  • Image Processing (AREA)
  • Image Analysis (AREA)
  • Studio Devices (AREA)

Abstract

The present technology pertains to a detection device, a detection method, and a program, with which a prescribed object can be detected. The detection device is equipped with: an acquisition unit that acquires distance information pertaining to the distance to a photographic subject; a setting unit that, on the basis of the distance information and a feature amount of an object to be detected, sets a region in which there is a possibility that the object is being photographed; and a determination unit that determines whether an image in the region is the object. Alternatively, the detection device is equipped with: an acquisition unit that acquires distance information pertaining to the distance to a photographic subject; a setting unit that uses the distance information to set a region in which there is a possibility that a prescribed object is being photographed; and an estimation unit that, on the basis of the size of the region and the distance information, estimates a category to which the object belongs. The present technology can be applied to detection devices that detect prescribed objects.

Description

Detection device, detection method, and program
 The present technology relates to a detection device, a detection method, and a program, and more particularly, to a detection device, a detection method, and a program that detect a predetermined subject from an image, for example.
 It has been proposed to recognize a product by applying a pattern matching technique or an edge detection technique to a photographed image to detect the boundary between the product and the background and cutting out the product region (see, for example, Patent Document 1).
 It has also been proposed, in order to recognize subjects of all sizes, to recognize a search target by changing the resolution of the search image and executing a plurality of scans (see, for example, Patent Document 2).
 Patent Document 1: JP 2016-31599 A
 Patent Document 2: JP 2011-14148 A
 It is desired to reduce the processing required when recognizing (detecting) a predetermined object.
 The present technology has been made in view of such a situation, and makes it possible to reduce the processing required when a predetermined object is recognized (detected).
 A first detection device according to one aspect of the present technology includes: an acquisition unit that acquires distance information regarding a distance to a subject; a setting unit that sets, from the distance information and a feature amount of an object to be detected, a region in which the object may be imaged; and a determination unit that determines whether an image in the region is the object.
 A second detection device according to one aspect of the present technology includes: an acquisition unit that acquires distance information regarding a distance to a subject; a setting unit that sets, using the distance information, a region in which a predetermined object may be imaged; and an estimation unit that estimates a category to which the object belongs from the size of the region and the distance information.
 A first detection method according to one aspect of the present technology includes steps of: acquiring distance information regarding a distance to a subject; setting, from the distance information and a feature amount of an object to be detected, a region in which the object may be imaged; and determining whether an image in the region is the object.
 A second detection method according to one aspect of the present technology includes steps of: acquiring distance information regarding a distance to a subject; setting, using the distance information, a region in which a predetermined object may be imaged; and estimating a category to which the object belongs from the size of the region and the distance information.
 A first program according to one aspect of the present technology causes a computer to execute processing including steps of: acquiring distance information regarding a distance to a subject; setting, from the distance information and a feature amount of an object to be detected, a region in which the object may be imaged; and determining whether an image in the region is the object.
 A second program according to one aspect of the present technology causes a computer to execute processing including steps of: acquiring distance information regarding a distance to a subject; setting, using the distance information, a region in which a predetermined object may be imaged; and estimating a category to which the object belongs from the size of the region and the distance information.
 In the first detection device, detection method, and program according to one aspect of the present technology, distance information regarding a distance to a subject is acquired, a region in which an object may be imaged is set from the distance information and a feature amount of the object to be detected, and it is determined whether an image in the region is the object.
 In the second detection device, detection method, and program according to one aspect of the present technology, distance information regarding a distance to a subject is acquired, a region in which a predetermined object may be imaged is set using the distance information, and a category to which the object belongs is estimated from the size of the region and the distance information.
 Note that the detection device may be an independent device or may be an internal block constituting one device.
 The program can be provided by being transmitted via a transmission medium or by being recorded on a recording medium.
 According to one aspect of the present technology, the processing required when a predetermined object is recognized (detected) can be reduced.
 Note that the effects described here are not necessarily limited, and any of the effects described in the present disclosure may be obtained.
 FIG. 1 is a diagram showing the configuration of an embodiment of a detection device to which the present technology is applied.
 FIG. 2 is a flowchart for explaining first recognition processing.
 FIG. 3 is a diagram for explaining processing related to detection of a predetermined object.
 FIG. 4 is a diagram showing an example of a table.
 FIG. 5 is a diagram for explaining processing related to detection of a predetermined object.
 FIG. 6 is a diagram showing the configuration of a detection device according to a second embodiment.
 FIG. 7 is a flowchart for explaining second recognition processing.
 FIG. 8 is a diagram for explaining processing related to detection of a predetermined object.
 FIG. 9 is a diagram showing the configuration of a detection device according to a third embodiment.
 FIG. 10 is a flowchart for explaining third recognition processing.
 FIG. 11 is a diagram for explaining detection of the direction of a subject.
 FIG. 12 is a diagram showing the configuration of a detection device according to a fourth embodiment.
 FIG. 13 is a flowchart for explaining fourth recognition processing.
 FIG. 14 is a diagram for explaining detection of the direction of a subject.
 FIG. 15 is a diagram showing the configuration of a detection device according to a fifth embodiment.
 FIG. 16 is a flowchart for explaining fifth recognition processing.
 FIG. 17 is a diagram for explaining detection of an object.
 FIG. 18 is a diagram showing the configuration of a detection device according to a sixth embodiment.
 FIG. 19 is a flowchart for explaining sixth recognition processing.
 FIG. 20 is a diagram showing the configuration of a detection device according to a seventh embodiment.
 FIG. 21 is a flowchart for explaining seventh recognition processing.
 FIG. 22 is a diagram showing the configuration of a detection device according to an eighth embodiment.
 FIG. 23 is a flowchart for explaining eighth recognition processing.
 FIG. 24 is a diagram for explaining a laminated structure.
 FIG. 25 is a diagram for explaining a laminated structure.
 FIG. 26 is a diagram for explaining a recording medium.
 FIG. 27 is a block diagram showing an example of a schematic configuration of a vehicle control system.
 FIG. 28 is an explanatory diagram showing an example of installation positions of a vehicle exterior information detection unit and an imaging unit.
 Hereinafter, modes for carrying out the present technology (hereinafter referred to as embodiments) will be described. The present technology can be applied to recognizing (detecting) a predetermined object, for example, a person (face, upper body, whole body), an automobile, a bicycle, a foodstuff, or the like. The present technology performs such detection of a predetermined object using distance information. In the following description, a case where a human face is detected using distance information will be described as an example.
 <First Embodiment>
 FIG. 1 is a diagram showing the configuration of an embodiment of a detection device to which the present technology is applied. The detection device 100 shown in FIG. 1 includes a distance information acquisition unit 111, a subject feature extraction unit 112, a subject candidate area detection unit 113, and an actual size database 114.
 The distance information acquisition unit 111 measures the distance to the subject, generates the measurement result (distance information), and outputs it to the subject feature extraction unit 112. The distance information acquisition unit 111 acquires the distance information with, for example, a distance measuring sensor using active light (such as infrared light). As a distance measuring sensor using active light, a TOF (Time-of-Flight) method, a Structured Light method, or the like can be applied.
 The distance information acquisition unit 111 may also be configured to acquire the distance information with a distance measuring sensor (distance measuring method) that uses reflected active light, for example, a TOF light source or the flash light of a camera. The distance information acquisition unit 111 may also be configured to acquire the distance information with a stereo camera, with an ultrasonic sensor, or by a method using a millimeter wave radar.
 The subject feature extraction unit 112 sets, from the distance information, a frame in which the detection target, for example, a human face, may exist. The frame is set by referring to a table stored in the actual size database 114. The actual size database 114 manages a table in which a distance is associated with the size on the image that takes into account the actual size of the subject to be detected. For example, the table describes how large a human face appears on the image when the face is located at a position a predetermined distance away.
 The subject candidate area detection unit 113 determines whether the detection target exists within the set frame, and if it does, cuts out the inside of the frame and outputs it to a subsequent processing unit (not shown).
 The operation of the detection device 100 will be described with reference to the flowchart shown in FIG. 2.
 In step S101, the subject feature extraction unit 112 sets a pixel to be processed (pixel of interest). For example, pixels are sequentially set as the pixel of interest starting from the upper left pixel of the image. For example, as shown in the upper part of FIG. 3, when an image 131 (distance image 131) is acquired, pixels from the upper left pixel to the lower right pixel of the distance image 131 are sequentially set as the pixel of interest. In the upper part of FIG. 3, the order in which the pixels of interest are sequentially set is indicated by arrows, but the pixels of interest may be set in an order other than this.
 The distance image 131 is an image generated from the distance information. For example, it is an image in which the same distance is represented by the same color and coloring is applied according to the distance. Note that the distance image 131 in the present technology does not have to be an image colored according to distance; it is sufficient that the image indicates how far a predetermined pixel (subject) in the image 131 is from the detection device 100.
 Here, the description continues assuming that the distance image 131 shown in FIG. 3 is generated and, as shown in the upper part of FIG. 3, that a pixel at a predetermined position in the distance image 131 is set as the pixel of interest 132.
 In step S102, the distance information at the pixel of interest is acquired. In step S103, the size of the detection frame is determined from the distance and the actual size of the subject to be detected. As shown in the middle part of FIG. 3, the subject feature extraction unit 112 sets a detection frame 133 from the distance at the pixel of interest 132 and the actual size of the subject to be detected.
 The subject feature extraction unit 112 sets the detection frame 133 by referring to the table managed in the actual size database 114. The actual size database 114 stores, for example, a table 151 as shown in FIG. 4.
 The table 151 shown in FIG. 4 associates a distance with the size on the image based on the actual size of a face. For example, it describes relationships such as: when the distance is 0 (cm), the size of the face on the image is 30 pixels x 30 pixels; when the distance is 50 (cm), the size of the face on the image is 25 pixels x 25 pixels; and when the distance is 100 (cm), the size of the face on the image is 20 pixels x 20 pixels.
 The distance is the distance between the detection device 100 and the subject (in this case, a human face). The actual size of the face is the size of an average human face at a predetermined distance, for example, 50 centimeters away. Since human faces differ by sex and age and there are individual differences, the description continues here on the assumption that the actual size of the face is the size of an average human face.
 Note that a table 151 in which one distance is associated with image sizes based on the actual sizes of a plurality of faces may be created and used for the processing. For example, one distance may be associated with an image size based on the actual size of a male face, an image size based on the actual size of a female face, and an image size based on the actual size of a child's face. In this case, a detection frame 133 corresponding to the image size based on each actual size may be set, and the processing of step S104 described later may be executed for each detection frame 133.
 Although FIG. 4 shows an example in which distances of 0, 50, and 100, that is, distances in units of 50 centimeters, are associated with image sizes based on the actual size, the granularity is not limited to 50 centimeters and can be changed according to the accuracy of the distance information, the accuracy required for detection, and so on.
 In step S103, as shown in the middle part of FIG. 3, the subject feature extraction unit 112 sets the detection frame 133 from the distance at the pixel of interest 132 and the size on the image based on the actual size of the subject to be detected. For example, when the processing is performed with reference to the table 151 shown in FIG. 4 and the distance at the pixel of interest 132 is determined to be 50 centimeters, a detection frame 133 of 25 pixels x 25 pixels centered on the pixel of interest 132 is set.
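 The following Python sketch illustrates a lookup of this kind using the example values of the table 151 and the setting of a detection frame 133 centered on the pixel of interest; the linear interpolation between table entries and the clipping at the image border are illustrative assumptions, not requirements of the present technology.

```python
import numpy as np

# Example values of the table 151: (distance in cm, frame side length in pixels).
SIZE_TABLE = [(0, 30), (50, 25), (100, 20)]

def frame_size_for_distance(distance_cm):
    """Return the expected on-image size (pixels) of a face at the given distance.

    Intermediate distances are interpolated linearly, which is one possible
    choice and not something mandated by the table itself.
    """
    dists, sizes = zip(*SIZE_TABLE)
    return int(round(float(np.interp(distance_cm, dists, sizes))))

def detection_frame(px, py, distance_cm, image_shape):
    """Detection frame 133 centred on the pixel of interest (px, py), clipped to the image."""
    half = frame_size_for_distance(distance_cm) // 2
    h, w = image_shape[:2]
    x0, y0 = max(px - half, 0), max(py - half, 0)
    x1, y1 = min(px + half, w - 1), min(py + half, h - 1)
    return x0, y0, x1, y1
```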
 Note that although the detection frame 133 is described here as a rectangle as shown in FIG. 3, it is not limited to a rectangle and may have another shape such as a circle. Also, although an example is described here in which the actual size of the subject is used as the feature (feature amount) of the subject and the detection frame 133 is set using the size on the image based on that actual size, the detection frame 133 can also be set using another feature (feature amount) of the subject.
 In this way, the subject feature extraction unit 112 functions, in this case, as a setting unit that uses the size of the subject as the feature amount and sets the detection frame 133 corresponding to that feature amount.
 Thus, a detection frame 133 corresponding to the face size according to the distance is set in the captured image 131. The detection frame 133 is a frame set by calculating how large the subject to be detected, for example, a human face, would appear on the distance image 131 if it existed at the distance of the position of the pixel of interest 132. Note that such calculation itself may be omitted, for example, by describing the values in the table 151 in advance, and other forms can also be applied.
 In step S104, the subject candidate area detection unit 113 determines whether the image within the detection frame 133 is a candidate for the subject to be detected. For example, a filter of a size equivalent to the detection frame 133 is applied to the distance image 131, and its response value is used as the probability value of the subject candidate. As the filter, a DOG (Difference-of-Gaussian) filter, a Laplacian filter, or the like can be applied.
 Whether the image within the detection frame 133 is a candidate for the subject to be detected can be determined using the detection frame 133 and the distance information within the detection frame 133.
 For example, when a human face is imaged within the detection frame 133, the face has unevenness, so the distance information within the detection frame 133 also varies in depth. On the other hand, even when a human face is imaged within the detection frame 133, if it is a face appearing in a photograph (poster), the distance information within the detection frame 133 takes a constant value and does not vary in depth.
 By detecting such variation in depth with a filter, it is determined whether the subject to be detected exists in the detection frame 133. The determination result is output to a processing unit (not shown) downstream of the detection device 100. Note that only when it is determined that the subject to be detected exists in the detection frame 133, the image within the detection frame 133 can be cut out from the image 131 and the cut-out image output.
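 A minimal sketch of the check in step S104 is shown below; the use of a Difference-of-Gaussian response whose scale is tied to the detection frame and the flatness threshold for rejecting flat (photograph-like) patches are illustrative assumptions.

```python
import cv2
import numpy as np

def candidate_probability(depth_image, frame, flatness_thresh=1.0):
    """Rough candidate check for step S104 on the distance image 131.

    depth_image: distance image as a float array (e.g. centimetres per pixel).
    frame: (x0, y0, x1, y1) detection frame 133.
    A patch with almost no depth variation (e.g. a face printed on a poster)
    is rejected; otherwise a Difference-of-Gaussian response whose scale is
    tied to the frame size is returned as the candidate probability value.
    """
    x0, y0, x1, y1 = frame
    patch = depth_image[y0:y1 + 1, x0:x1 + 1].astype(np.float32)
    if patch.std() < flatness_thresh:      # flat in depth -> likely a photograph
        return 0.0
    sigma = max((x1 - x0) / 4.0, 1.0)      # filter scale follows the frame size
    dog = cv2.GaussianBlur(patch, (0, 0), sigma) - cv2.GaussianBlur(patch, (0, 0), 2 * sigma)
    return float(np.abs(dog).mean())       # larger response -> stronger candidate
```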
 For example, as shown in the lower part of FIG. 3, when it is determined that there is a face in the detection frame 133-1 and that the subject to be detected exists, the image within the detection frame 133-1 is cut out from the image 131 and output. When it is determined that there is no face in the detection frame 133-2 and that the subject to be detected does not exist, the image within the detection frame 133-2 is not cut out.
 The cutting out can be performed after the processing in step S105 is completed. Alternatively, the determination result of the determination processing in step S104, in this case the value obtained when the filter is applied, may be used as the probability value of the subject candidate, and the cutting out may be performed based on that probability value after the processing in step S105. It is also possible to output only the probability value to the subsequent stage.
 In this way, the subject candidate area detection unit 113 functions as a determination unit that determines whether the image within the detection frame 133 is the subject to be detected. An image determined by the subject candidate area detection unit 113 to be the subject to be detected can be cut out and output to a subsequent processing unit or the like.
 In step S105, it is determined whether such processing has been completed for all the pixels in the image 131. When it is determined that the processing has not been completed for all the pixels, the processing returns to step S101, a new pixel of interest is set, and the processing from step S102 onward is performed on the newly set pixel of interest.
 On the other hand, when it is determined in step S105 that such processing has been completed for all the pixels in the image 131, the recognition processing ends.
 By repeating the processing of steps S101 to S105, the probability value of the subject candidate is obtained for all the pixels in the distance image 131. Then, a local maximum of the probability values is taken as the center position of the detected subject, the pixel at that center position is set as the pixel of interest 132, and the image within the detection frame 133 is cut out.
 Note that since the pixel of interest 132 is set and the detection frame 133 is set in this way, the pixel of interest 132 does not have to be set for every pixel in the image 131.
 For example, if the pixel located at the upper left corner of the image 131 were taken as the pixel of interest 132, the detection frame 133 could not be set; even if it were set, part of the detection frame 133 (in this case, 3/4 of it) would be missing, so pixels in such a region may be excluded from being set as the pixel of interest 132. Similarly, the regions near the edges of the image 131 are regions where the detection frame 133 cannot be set, so pixels in such regions may also be excluded from being set as the pixel of interest 132.
 The pixel of interest 132 may be set sequentially pixel by pixel, or may be set at a predetermined interval, for example, every five pixels.
 In addition, pixels in a region determined to be far away, in other words, a region that can be determined to be background, may be excluded from being set as the pixel of interest 132. In this way, the number of pixels of interest 132 to be processed can be reduced, and the processing can be lightened.
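 The loop of steps S101 to S105, together with the shortcuts described above (stride scanning, skipping border pixels, and skipping background pixels), could be sketched as follows; the stride and the background threshold are illustrative assumptions, and the helper functions from the earlier sketches are reused.

```python
import numpy as np

def scan_distance_image(depth_image, step=5, background_cm=300.0):
    """Scan loop of steps S101 to S105 with the shortcuts described above.

    Pixels are visited with a stride of `step`, pixels whose full detection
    frame would not fit in the image are skipped, and pixels judged to be
    background (farther than background_cm) are not used as pixels of interest.
    Local maxima of the returned probability map give the centres of detected
    subjects. Reuses frame_size_for_distance() and candidate_probability()
    from the sketches above.
    """
    h, w = depth_image.shape
    prob_map = np.zeros((h, w), dtype=np.float32)
    for py in range(0, h, step):
        for px in range(0, w, step):
            d = depth_image[py, px]
            if d <= 0 or d > background_cm:     # invalid measurement or background
                continue
            half = frame_size_for_distance(d) // 2
            if px - half < 0 or py - half < 0 or px + half >= w or py + half >= h:
                continue                        # full detection frame 133 would not fit
            frame = (px - half, py - half, px + half, py + half)
            prob_map[py, px] = candidate_probability(depth_image, frame)
    return prob_map
```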
 By performing such detection, for example, a detection result as shown in FIG. 5 is obtained. The upper part of FIG. 5 shows an example of the image 131, in which detection frames 133-1 to 133-4 are set as regions where the detection target (a human face) may exist.
 For the detection frame 133-1, which the subject feature extraction unit 112 (FIG. 1) set according to the distance of the subject, the subject candidate area detection unit 113 determines that there is a face within it, so the image within the detection frame 133-1 is cut out and output.
 For the detection frame 133-2, which the subject feature extraction unit 112 (FIG. 1) set according to the distance of the subject, the subject candidate area detection unit 113 determines that there is no face within it, so the image within the detection frame 133-2 is not cut out.
 For the detection frame 133-3, which the subject feature extraction unit 112 (FIG. 1) set according to the distance of the subject, the subject candidate area detection unit 113 determines that there is a face within it, so the image within the detection frame 133-3 is cut out and output.
 For the detection frame 133-4, even if there is a face within the detection frame 133-4 set by the subject feature extraction unit 112 (FIG. 1) according to the distance of the subject, if that face appears in a photograph or the like, the subject candidate area detection unit 113 determines that there is no face, so the image within the detection frame 133-4 is not cut out.
 As described above, in the present technology, an object is detected using a distance and the size of the detection object at that distance. By performing detection in this way, objects other than those of the size of the detection object at a given distance are excluded from the detection target, so the possibility of false detection can be reduced.
 Also, for example, when the detection target is a human face, an object with no depth variation, such as a human face in a photograph, is not erroneously detected, which also reduces the possibility of false detection.
 Furthermore, detection according to the present technology requires less processing than, for example, detection by pattern matching or the like.
 <Second Embodiment>
 Next, a second embodiment will be described. FIG. 6 is a diagram showing a configuration example of a detection device 200 according to the second embodiment. In the detection device 200 shown in FIG. 6, the same parts as those of the detection device 100 shown in FIG. 1 are denoted by the same reference numerals, and their description is omitted.
 The detection device 200 of the second embodiment has a configuration in which an imaging unit 211 and a subject detail recognition unit 212 are added to the detection device 100 of the first embodiment.
 撮像部211は、CCDやCMOSイメージセンサなどの撮像素子を含む構成とされ、環境光による画像(通常画像と記述する)を撮像し、被写体詳細認識部212に供給する。被写体詳細認識部212には、被写体候補領域検出部113からの検出結果も供給される。 The imaging unit 211 includes an imaging element such as a CCD or a CMOS image sensor, captures an image of ambient light (described as a normal image), and supplies the image to the subject detail recognition unit 212. The detection result from the subject candidate area detection unit 113 is also supplied to the subject detail recognition unit 212.
 被写体候補領域検出部113は、第1の実施の形態として説明したように、距離画像131を用いて検出対象、例えば人の顔が存在すると判定した領域を切り出し、出力する。被写体詳細認識部212は、被写体候補領域検出部113から供給された領域内の被写体に対して、さらに詳細な認識を、通常画像を用いて行う。例えば、性別や年齢といった個人を特定するような認識処理が行われる。 As described in the first embodiment, the subject candidate region detection unit 113 cuts out and outputs a region determined to have a detection target, for example, a human face, using the distance image 131. The subject detail recognition unit 212 performs more detailed recognition on the subject in the region supplied from the subject candidate region detection unit 113 using the normal image. For example, a recognition process for specifying an individual such as gender and age is performed.
The processing of the detection device 200 shown in FIG. 6 will be described with reference to the flowchart shown in FIG. 7.

Steps S201 to S205 are processing performed by the distance information acquisition unit 111 through the subject candidate area detection unit 113, and are performed in the same way as steps S101 to S105 of the flowchart shown in FIG. 2, so their description is omitted.

In step S206, the subject detail recognition unit 212 performs detailed recognition using the detection frame of the subject candidate. For example, the subject detail recognition unit 212 maps the detection frame 133 supplied from the subject candidate area detection unit 113 onto the corresponding region of the normal image from the imaging unit 211 and cuts out the image within the detection frame 133 so set. Then, using the cut-out normal image, it executes preset recognition processing, such as recognition that identifies an individual by, for example, the gender or age of the subject.
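As a rough illustration of step S206, the following sketch crops the candidate regions out of the normal image and hands each crop to a separate recogniser. The recogniser is a placeholder callable and the frame format is an assumption; the patent does not prescribe a particular recognition algorithm.

```python
from typing import Callable, List, Tuple
import numpy as np

Frame = Tuple[int, int, int, int]  # (x, y, width, height) of a detection frame in image coordinates

def recognize_details(normal_image: np.ndarray,
                      frames: List[Frame],
                      recognizer: Callable[[np.ndarray], dict]) -> List[dict]:
    """Run the heavy recogniser only on the candidate regions supplied by the distance-based stage."""
    results = []
    for x, y, w, h in frames:
        crop = normal_image[y:y + h, x:x + w]   # map the detection frame onto the normal image and cut it out
        results.append(recognizer(crop))        # e.g. estimate gender / age of the cropped face
    return results
```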
By performing the processing in this way, the subject can be detected in more detail.

The information supplied from the subject candidate area detection unit 113 to the subject detail recognition unit 212 can include the size on the image based on the real size of the subject (the detection frame 133), a representative point (for example, the pixel of interest 132), and a distribution map (for example, a heat map or filter response values). The subject detail recognition unit 212 performs detailed recognition using the information supplied from the subject candidate area detection unit 113.

By performing such detection (recognition) processing, a detection result such as that shown in FIG. 8 is obtained. The upper and middle parts of FIG. 8 are the same as FIG. 5. That is, through the detection processing using the distance image 131, detection frame 133-1 and detection frame 133-3 are supplied to the subject detail recognition unit 212 as information on the regions in which the subject was detected.

The subject detail recognition unit 212 executes recognition processing, using a method such as a DNN (deep learning), on the images cut out from the normal image when detection frame 133-1 and detection frame 133-3 are applied to the normal image.

Thus, also in the second embodiment, the detection object is detected using the distance and the size of the detection object at that distance, so the detection accuracy can be improved and the processing load related to detection can be reduced. Furthermore, in the second embodiment, detailed recognition processing is executed using the normal image (an image other than the distance image), so the subject can be detected in more detail and recognized.
<Third Embodiment>

Next, a third embodiment will be described. FIG. 9 is a diagram illustrating a configuration example of a detection device 300 according to the third embodiment. In the detection device 300 illustrated in FIG. 9 and the detection device 100 illustrated in FIG. 1, the same parts are denoted by the same reference numerals, and their description is omitted.

The detection device 300 of the third embodiment differs from the detection device 100 of the first embodiment in that a subject direction detection unit 311 is added to the detection device 100 of the first embodiment.

The subject direction detection unit 311 detects the direction in which the detected subject is facing. The detection device 300 of the third embodiment detects the position, size, and direction of the subject.
The processing of the detection device 300 shown in FIG. 9 will be described with reference to the flowchart shown in FIG. 10.

Steps S301 to S306 (excluding step S305) are processing performed by the distance information acquisition unit 111 through the subject candidate area detection unit 113, and are performed in the same way as steps S101 to S105 of the flowchart shown in FIG. 2, so their description is omitted.

In step S305, the region determined by the subject candidate area detection unit 113 to contain the subject to be detected (the region set by the detection frame 133) and the image cut out from that region are supplied to the subject direction detection unit 311. The subject direction detection unit 311 detects the direction of the detected subject.

For example, direction detection will be described for the case where an image such as that shown in FIG. 11 is acquired. In the example shown in FIG. 11, the detection target is assumed to be a hand. When the subject feature extraction unit 112 and the subject candidate area detection unit 113 execute the processing of steps S302 to S304, a detection frame 133 is set in the distance image 131, and the hand that is the detection target is detected within that detection frame 133.

The subject direction detection unit 311 divides the inside of the detection frame 133 into regions of a predetermined size, treats each divided region as a surface of the subject, and obtains the normal direction of that surface. In the image shown in FIG. 11, the palm faces to the right in the figure. When the palm faces to the right, the distance information over the palm becomes gradually larger from the near side toward the far side.

When normals are set for the palm surfaces from which such distance information is obtained, normals pointing to the right in the figure are obtained, as shown in FIG. 11. From these normals, it is determined that the palm is facing to the right in the figure.
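The normal-based direction estimation can be sketched as follows. This is only an illustrative outline: the pixel pitch is treated as the depth unit, so the metric scaling of the gradients is ignored, and the coordinate conventions are assumptions rather than definitions taken from this description.

```python
import numpy as np

def dominant_normal(depth_patch: np.ndarray) -> np.ndarray:
    """Mean unit normal of a depth patch.

    Coordinates: x to the right, y downward, z (depth) increasing away from the camera.
    The returned normal points out of the surface toward the camera; a positive x
    component corresponds to a surface facing to the right in the image, as with the
    palm in FIG. 11.
    """
    dz_dy, dz_dx = np.gradient(depth_patch.astype(np.float64))
    # For the surface z = f(x, y), a normal pointing toward the camera is (dz/dx, dz/dy, -1).
    normals = np.stack([dz_dx, dz_dy, -np.ones_like(dz_dx)], axis=-1)
    normals /= np.linalg.norm(normals, axis=-1, keepdims=True)
    mean = normals.reshape(-1, 3).mean(axis=0)
    return mean / np.linalg.norm(mean)
```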
Thus, by using the distance information, the direction in which the subject is facing can also be determined. Therefore, according to the third embodiment, as in the first and second embodiments, the detection object is detected using the distance and the size of the detection object at that distance, so the detection accuracy can be improved. In addition, the direction of the detected subject can be determined.
<Fourth Embodiment>

Next, a fourth embodiment will be described. FIG. 12 is a diagram illustrating a configuration example of a detection device 400 according to the fourth embodiment. In the detection device 400 illustrated in FIG. 12 and the detection device 300 illustrated in FIG. 9, the same parts are denoted by the same reference numerals, and their description is omitted.

The detection device 400 of the fourth embodiment is configured by adding an imaging unit 411 and a subject detail recognition unit 412 to the detection device 300 of the third embodiment. The added imaging unit 411 and subject detail recognition unit 412 perform basically the same processing as the imaging unit 211 and the subject detail recognition unit 212 of the detection device 200 of the second embodiment (both in FIG. 6).

The imaging unit 411 captures a normal image and supplies it to the subject detail recognition unit 412. The detection result from the subject direction detection unit 311 is also supplied to the subject detail recognition unit 412. The subject direction detection unit 311 outputs, for the detection target, for example a human face, its position (the position at which the detection frame 133 is set), its size (the size of the detection frame 133), and its direction.

The subject detail recognition unit 412 performs more detailed recognition, using the normal image, on the subject in the region supplied from the subject direction detection unit 311. For example, recognition processing that identifies an individual, such as estimating gender or age, is performed.
The processing of the detection device 400 shown in FIG. 12 will be described with reference to the flowchart shown in FIG. 13.

Steps S401 to S406 are processing performed by the distance information acquisition unit 111, the subject feature extraction unit 112, the subject candidate area detection unit 113, and the subject direction detection unit 311, and are performed in the same way as steps S301 to S306 of the flowchart shown in FIG. 10, so their description is omitted.

In step S407, the subject detail recognition unit 412 performs detailed recognition using the detection frame of the subject candidate and the direction of the subject. For example, the subject detail recognition unit 412 maps the detection frame 133 supplied from the subject direction detection unit 311 onto the corresponding region of the normal image from the imaging unit 411 and cuts out the image within the detection frame 133 so set. Then, using the cut-out normal image, it executes preset recognition processing, such as recognition that identifies an individual by, for example, the gender or age of the subject. This recognition processing is performed taking the direction of the subject into account.
FIG. 14 compares the recognition method performed by the detection device 400 of the fourth embodiment with another recognition method. The left part of FIG. 14 shows an example of the other recognition method. For example, when a face is to be detected from a normal image, the detected object is first assumed to be a face, and a front-back/left-right determination dictionary 431 is consulted to determine whether the face is facing in the front-back direction or in the left-right direction.

When it is determined that the face is facing in the front-back direction (a direction other than left-right), a front/back determination dictionary 432 is consulted to determine whether the face is facing forward or backward. When it is determined to be facing forward, a forward-facing dictionary 434 is consulted to determine whether the object is a human face and, if so, whether it is a forward-facing face. In this processing, when data identifying individuals is registered in the forward-facing dictionary 434, the person is identified by matching against that data.

On the other hand, when the front/back determination dictionary 432 is consulted and the face is determined to be facing backward, a backward-facing dictionary 435 is consulted to determine whether the object is a human face and, if so, whether it is a backward-facing face. When data identifying individuals is registered in the backward-facing dictionary 435, the person is identified by matching against that data.

On the other hand, when the front-back/left-right determination dictionary 431 is consulted and the face is determined to be facing in the left-right direction (a direction other than front-back), a left/right determination dictionary 433 is consulted to determine whether the face is facing left or right. When it is determined to be facing left, a left-facing dictionary 436 is consulted to determine whether the object is a human face and, if so, whether it is a left-facing face. When data identifying individuals is registered in the left-facing dictionary 436, the person is identified by matching against that data.

On the other hand, when the left/right determination dictionary 433 is consulted and the face is determined to be facing right, a right-facing dictionary 437 is consulted to determine whether the object is a human face and, if so, whether it is a right-facing face. When data identifying individuals is registered in the right-facing dictionary 437, the person is identified by matching against that data.

In this way, the conventional recognition processing is performed by consulting a plurality of dictionaries and making a series of determinations.
In the detection device 400 of the fourth embodiment, the region containing the subject, its size, and its direction are detected from the distance image 131, and the subject detail recognition unit 412 (FIG. 12) performs the recognition processing using that information. Therefore, as shown in the right part of FIG. 14, the recognition processing can be performed by preparing an X-direction dictionary 451 and consulting that X-direction dictionary 451.

The X-direction dictionary 451 is a dictionary containing the forward-facing dictionary 434, the backward-facing dictionary 435, the left-facing dictionary 436, and the right-facing dictionary 437. Since the direction of the subject is also supplied to the subject detail recognition unit 412 (FIG. 12), the configuration can be such that only the dictionary for the supplied direction is consulted when the recognition processing is performed.
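The effect of supplying the direction to the recogniser can be sketched as follows; the dictionary contents and the matching routine are placeholders, and only the selection logic is shown, as an illustrative assumption rather than the actual structure of the X-direction dictionary 451.

```python
from typing import Callable, Dict, List, Optional

# One entry per direction, standing in for the forward/backward/left/right dictionaries
# 434 to 437 that together make up the X-direction dictionary 451.
DIRECTION_DICTIONARIES: Dict[str, List] = {
    "front": [],
    "back": [],
    "left": [],
    "right": [],
}

def recognize_with_known_direction(face_crop,
                                   direction: str,
                                   match: Callable[[object, List], Optional[str]]) -> Optional[str]:
    """Skip the front/back and left/right judgement stages and consult only one dictionary."""
    dictionary = DIRECTION_DICTIONARIES[direction]   # the direction is already known from the distance image
    return match(face_crop, dictionary)              # e.g. nearest-neighbour matching against that dictionary
```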
According to the detection device 400 of the fourth embodiment, the number of dictionaries (the amount of data) can be reduced, and the determination processing that would otherwise be performed multiple times while consulting the dictionaries can be omitted. Therefore, according to the detection device 400 of the fourth embodiment, the processing related to recognition can be reduced. In addition, since only images that are likely to contain the detection target, for example a face (the cut-out images), are subjected to detailed recognition, the region of the image to be processed is narrowed, which also reduces the processing related to recognition.

Thus, also in the fourth embodiment, the detection object is detected using the distance and the size of the detection object at that distance, so the detection accuracy can be improved. Moreover, in the fourth embodiment, detailed recognition processing is executed using the normal image (an image other than the distance image), so the subject can be detected in more detail and recognized. Furthermore, the recognition processing can be performed with the direction of the subject obtained in advance, which reduces the processing.
<Fifth Embodiment>

Next, a fifth embodiment will be described. In the fifth embodiment and the sixth to eighth embodiments described below, the detection target is detected by estimating the size of the subject and estimating the category to which the subject belongs.

FIG. 15 is a diagram illustrating a configuration example of a detection device 500 according to the fifth embodiment. The detection device 500 shown in FIG. 15 includes a distance information acquisition unit 111, a subject size estimation unit 511, and a subject category estimation unit 512.

The distance information acquisition unit 111 has, for example, the same configuration as the distance information acquisition unit 111 included in the detection device 100, and has the function of acquiring distance information for generating the distance image 131.

The subject size estimation unit 511 estimates the size of the subject and supplies the estimated size information to the subject category estimation unit 512. The subject category estimation unit 512 estimates the category to which the subject belongs from the estimated size of the subject and the distance at which the subject is located.
For example, as described above, for a human face it is known how large a face located a given distance away should appear. Conversely, when an object of a size consistent with a human face is present at a position a given distance away, it can be estimated that a human face is present there.

Making use of this, the detection device 500 estimates the size of the subject and, from that size and the distance, determines the category to which the subject belongs, for example the human-face category or the car category.
The processing of the detection device 500 shown in FIG. 15 will be described with reference to the flowchart shown in FIG. 16.

In step S501, the subject size estimation unit 511 sets a pixel of interest. This processing can be performed, for example, in the same way as step S101 of the flowchart shown in FIG. 2.

In step S502, the subject size estimation unit 511 acquires the distances around the set pixel of interest. Then, in step S503, the subject size estimation unit 511 estimates the subject size based on the surrounding distance distribution. For example, since the distance differs greatly between an object and the background, the region in which the object exists (its extent up to the edge) can be estimated by detecting the portions where the distance changes greatly (that is, the edges) with reference to the surrounding distance distribution.

In step S504, the subject category estimation unit 512 estimates the subject category based on the distance and the subject size. As described above, the category of the object at that position can be estimated from the distance and the size, and such estimation is executed in step S504.
For example, assume that a distance image 131 such as that shown in FIG. 17 is acquired. The distance image 131 shown in FIG. 17 is an image in which a hand is captured. For example, when a predetermined position on the hand is set as the pixel of interest 132, the distance distribution around this pixel of interest 132 is referenced.

The distance differs greatly between the hand and the background; that is, in this case the hand is at a short distance while the background is at a long distance. When the distance distribution is followed in directions moving gradually away from the pixel of interest 132, there are points at which the distance changes greatly.

In this case, the pixel of interest 132 is set at roughly the center of the palm, so when the search proceeds from the palm toward a fingertip, the distance information changes sharply at the point where the fingertip gives way to the background. In FIG. 17, arrows indicate the paths from the pixel of interest 132 to the positions where the distance information changes sharply. The position where the distance information changes sharply may be defined as the position of the pixel being searched at which the difference between the distance of the pixel of interest 132 and the distance of that pixel becomes equal to or greater than a predetermined threshold.
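The search described above can be sketched as follows. The number of search directions and the jump threshold are illustrative assumptions; only the idea of walking outwards from the pixel of interest until the depth jumps is shown.

```python
import numpy as np

def estimate_radius_px(depth: np.ndarray, cx: int, cy: int,
                       jump_threshold_m: float = 0.15, n_directions: int = 8) -> int:
    """Longest run from (cx, cy) before the depth differs from the centre depth by more than the threshold."""
    h, w = depth.shape
    center = depth[cy, cx]
    radius = 0
    for k in range(n_directions):
        angle = 2 * np.pi * k / n_directions
        dx, dy = np.cos(angle), np.sin(angle)
        r = 1
        while True:
            x, y = int(round(cx + r * dx)), int(round(cy + r * dy))
            if not (0 <= x < w and 0 <= y < h):
                break
            if abs(depth[y, x] - center) > jump_threshold_m:   # background reached: large depth jump (edge)
                break
            r += 1
        radius = max(radius, r)
    return radius
```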
In this way, the range in which the object may exist is estimated from the pixel of interest 132. In the example shown in FIG. 17, for example, a circle or rectangle (not shown) whose radius extends from the pixel of interest 132 to the tip of the longest arrow is set, and the size of that circle or rectangle is taken as the subject size. This subject size corresponds to the detection frame 133 in the first to fourth embodiments. In other words, the detection frame 133 is set by this processing.

Then, the category of the detected subject is estimated from the distance of the pixel of interest 132 and the subject size (the detection frame 133). In the example shown in FIG. 17, it is estimated that a subject of the detected size at the distance of the pixel of interest 132 belongs to the category "hand".
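The step from size and distance to a category can be sketched as follows. The table of typical real sizes, the tolerance, and the pinhole conversion are illustrative stand-ins for whatever size data the category estimation actually relies on.

```python
from typing import Optional

# Illustrative table of typical real-world sizes per category.
TYPICAL_REAL_SIZE_M = {"hand": 0.20, "face": 0.25, "person": 1.7, "car": 4.5}

def estimate_category(size_px: float, distance_m: float, focal_length_px: float,
                      tolerance: float = 0.3) -> Optional[str]:
    """Map an on-image size and its distance back to a real size and pick the closest category."""
    real_size_m = size_px * distance_m / focal_length_px     # inverse of the pinhole projection
    best, best_err = None, tolerance
    for category, typical in TYPICAL_REAL_SIZE_M.items():
        err = abs(real_size_m - typical) / typical           # relative deviation from the typical size
        if err < best_err:
            best, best_err = category, err
    return best
```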
In step S505, it is determined whether such processing has been executed for all pixels in the distance image 131. As in step S105 of the flowchart shown in FIG. 2, the pixels set as the pixel of interest 132 need not be all the pixels in the distance image 131; some pixels may be excluded.

Thus, according to the detection device 500 of the fifth embodiment, the size of the subject can be estimated from the distance image and its category can be estimated. In addition, according to the detection device 500, a plurality of subjects (categories) can be estimated; for example, different objects such as a person and a car can be detected.
<Sixth Embodiment>

Next, a sixth embodiment will be described. FIG. 18 is a diagram illustrating a configuration example of a detection device 600 according to the sixth embodiment. In the detection device 600 illustrated in FIG. 18 and the detection device 500 illustrated in FIG. 15, the same parts are denoted by the same reference numerals, and their description is omitted.

The detection device 600 of the sixth embodiment is configured by adding an imaging unit 611 and a subject detail recognition unit 612 to the detection device 500 of the fifth embodiment. The added imaging unit 611 performs basically the same processing as the imaging unit 211 (FIG. 6) of the detection device 200 of the second embodiment.

The imaging unit 611 captures a normal image and supplies it to the subject detail recognition unit 612. The distance, the subject size, and the subject category are also supplied to the subject detail recognition unit 612 from the subject category estimation unit 512. The subject detail recognition unit 612 performs more detailed recognition, using the normal image, based on the distance, subject size, and subject category supplied from the subject category estimation unit 512.
The processing of the detection device 600 shown in FIG. 18 will be described with reference to the flowchart shown in FIG. 19.

Steps S601 to S605 are processing performed by the distance information acquisition unit 111, the subject size estimation unit 511, and the subject category estimation unit 512, and are performed in the same way as steps S501 to S505 of the flowchart shown in FIG. 16, so their description is omitted.

In step S606, the subject detail recognition unit 612 performs detailed recognition using the subject candidate region and the subject category. For example, the subject detail recognition unit 612 sets a frame corresponding to the subject size supplied from the subject category estimation unit 512 in the corresponding region of the normal image from the imaging unit 611 and cuts out the image within the set frame.

Then, using the cut-out normal image and the subject category supplied from the subject category estimation unit 512, the category is narrowed down, after which preset recognition processing is executed, such as recognition that identifies the object belonging to that category. For example, when the category is determined to be a person, matching against images belonging to persons is performed to identify the individual; when the category is determined to be a car, matching against images belonging to cars is performed to identify the car model, as detailed recognition.

Thus, according to the detection device 600 of the sixth embodiment, the size of the subject can be estimated from the distance image and the category to which the subject belongs can be estimated. In addition, according to the detection device 600, a plurality of subjects (categories) can be estimated; for example, different objects such as a person and a car can be detected. Furthermore, the detected object can be recognized in detail.
<Seventh Embodiment>

Next, a seventh embodiment will be described. FIG. 20 is a diagram illustrating a configuration example of a detection device 700 according to the seventh embodiment. In the detection device 700 illustrated in FIG. 20 and the detection device 500 illustrated in FIG. 15, the same parts are denoted by the same reference numerals, and their description is omitted.

The detection device 700 of the seventh embodiment differs from the detection device 500 of the fifth embodiment in that a subject shape estimation unit 711 is added to the detection device 500 of the fifth embodiment and in that the subject category estimation unit 712 receives the output of the subject shape estimation unit 711 as its input.

The subject shape estimation unit 711 estimates the shape of the subject. Referring again to FIG. 17: when a distance image 131 in which a hand is captured is acquired, the shape of the hand is obtained by searching from the pixel of interest 132 to the points where the distance information changes greatly, that is, to the edges.

The subject category estimation unit 712 performs basically the same processing as the subject category estimation unit 512 of the detection device 500 shown in FIG. 15, but the subject category estimation unit 712 shown in FIG. 20 also uses the subject shape estimated by the subject shape estimation unit 711 to estimate the category. The category can therefore be estimated with higher accuracy.
The processing of the detection device 700 shown in FIG. 20 will be described with reference to the flowchart shown in FIG. 21.

Steps S701 to S703 are processing performed by the distance information acquisition unit 111 and the subject size estimation unit 511, and are performed in the same way as steps S501 to S503 of the flowchart shown in FIG. 16, so their description is omitted.

In step S704, the subject shape estimation unit 711 estimates the shape of the subject based on the distance distribution around the pixel of interest 132. As described with reference to FIG. 17, the shape is estimated by searching, using the distance information, for the portions where the distance changes greatly (the edges). In other words, regions in which the distance changes only gently are regarded as part of the object being detected, and the shape of the object is obtained while determining whether the distance is varying gently in this way.

In step S705, the subject category estimation unit 712 estimates the category to which the subject belongs based on the distance, the subject size, and the shape. In this case, the category is estimated using not only the distance and subject size but also the shape information, so the category can be estimated with higher accuracy.
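The shape estimation of step S704 can be sketched as a region growing over gently varying depth, as follows. The four-neighbour growth and the smoothness threshold are illustrative assumptions, not the actual processing of the subject shape estimation unit 711.

```python
from collections import deque
import numpy as np

def grow_shape_mask(depth: np.ndarray, cx: int, cy: int, max_step_m: float = 0.03) -> np.ndarray:
    """Boolean silhouette grown from (cx, cy) over pixels whose depth changes only gradually."""
    h, w = depth.shape
    mask = np.zeros((h, w), dtype=bool)
    queue = deque([(cx, cy)])
    mask[cy, cx] = True
    while queue:
        x, y = queue.popleft()
        for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if 0 <= nx < w and 0 <= ny < h and not mask[ny, nx]:
                if abs(depth[ny, nx] - depth[y, x]) < max_step_m:   # gently varying depth: same object
                    mask[ny, nx] = True
                    queue.append((nx, ny))
    return mask
```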
According to the detection device 700 of the seventh embodiment, the size of the subject, its category, and its shape can be estimated from the distance image. In addition, according to the detection device 700, a plurality of subjects (categories) can be estimated; for example, different objects such as a person and a car can be detected.

The category estimation by the subject category estimation unit 712 may also be omitted, and the subject shape estimation result of the subject shape estimation unit 711 may be output to a subsequent processing unit (not shown).
<Eighth Embodiment>

Next, an eighth embodiment will be described. FIG. 22 is a diagram illustrating a configuration example of a detection device 800 according to the eighth embodiment. In the detection device 800 illustrated in FIG. 22 and the detection device 700 illustrated in FIG. 20, the same parts are denoted by the same reference numerals, and their description is omitted.

The detection device 800 of the eighth embodiment is configured by adding an imaging unit 811 and a subject detail recognition unit 812 to the detection device 700 of the seventh embodiment. The added imaging unit 811 performs basically the same processing as the imaging unit 211 (FIG. 6) of the detection device 200 of the second embodiment.

The imaging unit 811 captures a normal image and supplies it to the subject detail recognition unit 812. The distance, the subject size, the subject category, and the subject shape are also supplied to the subject detail recognition unit 812 from the subject category estimation unit 712. The subject detail recognition unit 812 performs more detailed recognition, using the normal image, based on the distance, subject size, subject category, and subject shape supplied from the subject category estimation unit 712.
The processing of the detection device 800 shown in FIG. 22 will be described with reference to the flowchart shown in FIG. 23.

Steps S801 to S806 are processing performed by the distance information acquisition unit 111, the subject size estimation unit 511, the subject shape estimation unit 711, and the subject category estimation unit 712, and are performed in the same way as steps S701 to S706 of the flowchart shown in FIG. 21, so their description is omitted.

In step S807, the subject detail recognition unit 812 performs detailed recognition using the subject candidate region, the subject category, and the subject shape. For example, the subject detail recognition unit 812 sets a frame corresponding to the subject size supplied from the subject category estimation unit 712 in the corresponding region of the normal image from the imaging unit 811 and cuts out the image within the set frame.

Then, using the cut-out normal image and the subject category supplied from the subject category estimation unit 712, the category is narrowed down, after which preset recognition processing is executed, such as recognition that identifies, among the objects belonging to that category, an object matching the subject shape. For example, when the category is determined to be a person, matching against images belonging to persons is performed; during that matching, the subject shape is referenced and the recognition is narrowed down to images close to that shape, for example to human faces when the shape is that of a human face, after which the individual is identified.
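The narrowing by category and shape in step S807 can be sketched as follows. The silhouette comparison by intersection-over-union and the per-category recognisers are illustrative stand-ins; this description leaves the concrete matching method open, and the templates are assumed to be boolean silhouettes of the same size as the mask.

```python
from typing import Callable, Dict, List, Optional
import numpy as np

def shape_similarity(mask: np.ndarray, template: np.ndarray) -> float:
    """Intersection-over-union of two boolean silhouettes of the same size."""
    inter = np.logical_and(mask, template).sum()
    union = np.logical_or(mask, template).sum()
    return float(inter) / union if union else 0.0

def detailed_recognition(crop, mask: np.ndarray, category: str,
                         recognizers: Dict[str, Callable[[object], Optional[str]]],
                         shape_templates: Dict[str, List[np.ndarray]],
                         min_similarity: float = 0.5) -> Optional[str]:
    """Run only the recogniser for the estimated category, and only if the silhouette fits."""
    if not any(shape_similarity(mask, t) >= min_similarity
               for t in shape_templates.get(category, [])):
        return None                        # silhouette does not match the category: skip heavy recognition
    return recognizers[category](crop)     # e.g. identify the person, or the car model
```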
Thus, according to the detection device 800 of the eighth embodiment, the size of the subject, its category, and its shape can be estimated from the distance image. In addition, according to the detection device 800, a plurality of subjects (categories) can be estimated; for example, different objects such as a person and a car can be detected.

Furthermore, the detected object can be recognized in detail. Since that detailed recognition can use information such as the estimated size, category, and shape of the subject, the processing required for detailed recognition can be reduced.

According to the detection devices of the first to eighth embodiments, an object can be detected from a distance image. For example, when a person is detected as the object, the present technology can be applied to a surveillance camera or the like. The present technology can also be applied, for example, to a game machine, as a device that detects the person playing a game and detects that person's gestures (for example, detecting a hand and the direction it is facing).

The detection devices of the first to eighth embodiments can also be mounted on an automobile and applied as part of a device that detects persons, bicycles, and automobiles other than the host vehicle, notifies the user of information on the detected objects, and performs control to avoid collisions.
<Application Example of the Detection Device>
The detection devices of the first to eighth embodiments can be implemented as a stacked image sensor in which a plurality of substrates (dies) are stacked. Here, taking the detection device 200 of the second embodiment (FIG. 6) as an example, the case where the detection device 200 is configured as a stacked image sensor will be described.

FIG. 24 is a diagram illustrating a first configuration example of a stacked image sensor incorporating the entire detection device 200 of FIG. 6. The stacked image sensor of FIG. 24 has a two-layer structure in which a pixel substrate 901 and a signal processing substrate 902 are stacked.

On the pixel substrate 901, (part of) the distance information acquisition unit 111 and (part of) the imaging unit 211 are formed. When the distance information acquisition unit 111 obtains distance information by the TOF method, it includes an irradiation unit that irradiates the subject with predetermined light and an image sensor that receives the irradiated light. The image sensor portion of the distance information acquisition unit 111, and also portions such as the irradiation unit, can be formed on the pixel substrate 901.

The imaging unit 211 also includes an image sensor for capturing the normal image. The image sensor portion of the imaging unit 211 can be formed on the pixel substrate 901.

On the signal processing substrate 902, the subject feature extraction unit 112, the subject candidate area detection unit 113, the real size database 114, and the subject detail recognition unit 212 are formed.

In the stacked image sensor of FIG. 24 configured as described above, the distance information acquisition unit 111 of the pixel substrate 901 performs imaging by receiving the light incident on it, and the object set as the detection target is detected from the image (distance image) obtained by that imaging.

Also, in the stacked image sensor of FIG. 24, the imaging unit 211 of the pixel substrate 901 performs imaging by receiving the light incident on it, and an image of the subject set as the detection target, and the like, is cut out from the image (normal image) obtained by that imaging and output.
FIG. 25 is a diagram illustrating a second configuration example of a stacked image sensor incorporating the entire detection device 200 of FIG. 6.

In the figure, parts corresponding to those in FIG. 24 are denoted by the same reference numerals, and their description is omitted below as appropriate.

The stacked image sensor of FIG. 25 has a three-layer structure in which a pixel substrate 901, a signal processing substrate 902, and a memory substrate 903 are stacked.

On the pixel substrate 901, the distance information acquisition unit 111 and the imaging unit 211 are formed, and on the signal processing substrate 902, the subject feature extraction unit 112, the subject candidate area detection unit 113, and the subject detail recognition unit 212 are formed.

On the memory substrate 903, the real size database 114 and an image storage unit 911 are formed.

In FIG. 25, the image storage unit 911 is formed on the memory substrate 903 as a storage region that stores the detection results of the subject candidate area detection unit 113, for example images cut out from the distance image in which the subject to be detected is captured. The real size database 114 that stores the table 151 (FIG. 4) is also formed on the memory substrate 903.

In FIG. 25, the pixel substrate 901, the signal processing substrate 902, and the memory substrate 903 are stacked in that order from the top; alternatively, for example, the order of the signal processing substrate 902 and the memory substrate 903 may be swapped so that the pixel substrate 901, the memory substrate 903, and the signal processing substrate 902 are stacked in that order.

The stacked image sensor can also be configured by stacking four or more substrates instead of two or three.
<Description of a Computer to Which the Present Technology Is Applied>

Next, the series of processes performed by each of the detection devices 100 to 800 can be performed by hardware or by software. When the series of processes is performed by software, the programs constituting the software are installed on a general-purpose computer or the like.

FIG. 26 is a block diagram illustrating a configuration example of an embodiment of a computer on which the programs that execute the series of processes described above are installed.
The programs can be recorded in advance on a hard disk 1005 or a ROM 1003 serving as a recording medium built into the computer.

Alternatively, the programs can be stored (recorded) on a removable recording medium 1011. Such a removable recording medium 1011 can be provided as so-called package software. Examples of the removable recording medium 1011 include a flexible disk, a CD-ROM (Compact Disc Read Only Memory), an MO (Magneto Optical) disc, a DVD (Digital Versatile Disc), a magnetic disk, and a semiconductor memory.

The programs can be installed on the computer from the removable recording medium 1011 as described above, or they can be downloaded to the computer via a communication network or a broadcast network and installed on the built-in hard disk 1005. That is, the programs can be transferred to the computer wirelessly from a download site via an artificial satellite for digital satellite broadcasting, or transferred to the computer by wire via a network such as a LAN (Local Area Network) or the Internet.

The computer incorporates a CPU (Central Processing Unit) 1002, and an input/output interface 1010 is connected to the CPU 1002 via a bus 1001.

When a command is input by the user operating an input unit 1007 via the input/output interface 1010, the CPU 1002 executes a program stored in a ROM (Read Only Memory) 1003 accordingly. Alternatively, the CPU 1002 loads a program stored on the hard disk 1005 into a RAM (Random Access Memory) 1004 and executes it.

The CPU 1002 thereby performs the processing according to the flowcharts described above or the processing performed by the configurations of the block diagrams described above. Then, as necessary, the CPU 1002, for example, outputs the processing result from an output unit 1006 via the input/output interface 1010, transmits it from a communication unit 1008, or records it on the hard disk 1005.

The input unit 1007 includes a keyboard, a mouse, a microphone, and the like. The output unit 1006 includes an LCD (Liquid Crystal Display), a speaker, and the like.
In this specification, the processing that the computer performs according to the programs does not necessarily have to be performed chronologically in the order described in the flowcharts. That is, the processing that the computer performs according to the programs includes processing executed in parallel or individually (for example, parallel processing or processing by objects).

The programs may be processed by a single computer (processor) or may be processed in a distributed manner by a plurality of computers. Furthermore, the programs may be transferred to and executed by a remote computer.

Furthermore, in this specification, a system means a collection of a plurality of constituent elements (devices, modules (parts), and the like), and it does not matter whether all the constituent elements are in the same housing. Therefore, a plurality of devices housed in separate housings and connected via a network, and a single device in which a plurality of modules are housed in one housing, are both systems.

The embodiments of the present technology are not limited to the embodiments described above, and various modifications are possible without departing from the gist of the present technology.

For example, the configuration examples of the detection devices 100 to 800 described above can be combined to the extent possible.

The present technology can take a cloud computing configuration in which one function is shared and jointly processed by a plurality of devices via a network.

Each step described in the flowcharts above can be executed by a single device or shared and executed by a plurality of devices.

Furthermore, when a single step includes a plurality of processes, the plurality of processes included in that step can be executed by a single device or shared and executed by a plurality of devices.
<Application Examples>

The technology according to the present disclosure can be applied to various products. For example, the technology according to the present disclosure may be realized as a device mounted on any type of moving body such as an automobile, an electric vehicle, a hybrid electric vehicle, a motorcycle, a bicycle, a personal mobility device, an airplane, a drone, a ship, a robot, a construction machine, or an agricultural machine (tractor).

FIG. 27 is a block diagram illustrating a schematic configuration example of a vehicle control system 7000, which is an example of a moving body control system to which the technology according to the present disclosure can be applied. The vehicle control system 7000 includes a plurality of electronic control units connected via a communication network 7010. In the example shown in FIG. 27, the vehicle control system 7000 includes a drive system control unit 7100, a body system control unit 7200, a battery control unit 7300, a vehicle exterior information detection unit 7400, a vehicle interior information detection unit 7500, and an integrated control unit 7600. The communication network 7010 connecting these control units may be an in-vehicle communication network conforming to an arbitrary standard such as CAN (Controller Area Network), LIN (Local Interconnect Network), LAN (Local Area Network), or FlexRay (registered trademark).

Each control unit includes a microcomputer that performs arithmetic processing according to various programs, a storage unit that stores the programs executed by the microcomputer or the parameters used for the various calculations, and a drive circuit that drives the devices to be controlled. Each control unit includes a network I/F for communicating with the other control units via the communication network 7010, and a communication I/F for communicating, by wired or wireless communication, with devices or sensors inside and outside the vehicle. FIG. 27 illustrates, as the functional configuration of the integrated control unit 7600, a microcomputer 7610, a general-purpose communication I/F 7620, a dedicated communication I/F 7630, a positioning unit 7640, a beacon receiving unit 7650, an in-vehicle device I/F 7660, an audio/image output unit 7670, an in-vehicle network I/F 7680, and a storage unit 7690. The other control units similarly include a microcomputer, a communication I/F, a storage unit, and the like.
 The drive system control unit 7100 controls the operation of devices related to the drive system of the vehicle in accordance with various programs. For example, the drive system control unit 7100 functions as a control device for a driving force generation device, such as an internal combustion engine or a driving motor, for generating the driving force of the vehicle, a driving force transmission mechanism for transmitting the driving force to the wheels, a steering mechanism that adjusts the steering angle of the vehicle, and a braking device that generates the braking force of the vehicle. The drive system control unit 7100 may also function as a control device such as an ABS (Antilock Brake System) or ESC (Electronic Stability Control).
 A vehicle state detection unit 7110 is connected to the drive system control unit 7100. The vehicle state detection unit 7110 includes, for example, at least one of a gyro sensor that detects the angular velocity of the axial rotational motion of the vehicle body, an acceleration sensor that detects the acceleration of the vehicle, and sensors for detecting the operation amount of the accelerator pedal, the operation amount of the brake pedal, the steering angle of the steering wheel, the engine speed, the rotational speed of the wheels, and the like. The drive system control unit 7100 performs arithmetic processing using signals input from the vehicle state detection unit 7110, and controls the internal combustion engine, the driving motor, an electric power steering device, a brake device, and the like.
 The body system control unit 7200 controls the operation of various devices mounted on the vehicle body in accordance with various programs. For example, the body system control unit 7200 functions as a control device for a keyless entry system, a smart key system, a power window device, or various lamps such as headlamps, back lamps, brake lamps, blinkers, and fog lamps. In this case, radio waves transmitted from a portable device that substitutes for a key, or signals from various switches, can be input to the body system control unit 7200. The body system control unit 7200 receives the input of these radio waves or signals, and controls the door lock device, the power window device, the lamps, and the like of the vehicle.
 The battery control unit 7300 controls a secondary battery 7310, which is a power supply source of the driving motor, in accordance with various programs. For example, information such as the battery temperature, the battery output voltage, or the remaining capacity of the battery is input to the battery control unit 7300 from a battery device including the secondary battery 7310. The battery control unit 7300 performs arithmetic processing using these signals, and performs temperature adjustment control of the secondary battery 7310 or control of a cooling device or the like provided in the battery device.
 The vehicle exterior information detection unit 7400 detects information outside the vehicle on which the vehicle control system 7000 is mounted. For example, at least one of an imaging unit 7410 and a vehicle exterior information detection section 7420 is connected to the vehicle exterior information detection unit 7400. The imaging unit 7410 includes at least one of a ToF (Time Of Flight) camera, a stereo camera, a monocular camera, an infrared camera, and other cameras. The vehicle exterior information detection section 7420 includes, for example, at least one of an environment sensor for detecting the current weather or meteorological conditions, and a surrounding information detection sensor for detecting other vehicles, obstacles, pedestrians, and the like around the vehicle on which the vehicle control system 7000 is mounted.
 The environment sensor may be, for example, at least one of a raindrop sensor that detects rainy weather, a fog sensor that detects fog, a sunshine sensor that detects the degree of sunshine, and a snow sensor that detects snowfall. The surrounding information detection sensor may be at least one of an ultrasonic sensor, a radar device, and a LIDAR (Light Detection and Ranging, Laser Imaging Detection and Ranging) device. The imaging unit 7410 and the vehicle exterior information detection section 7420 may each be provided as an independent sensor or device, or may be provided as a device in which a plurality of sensors or devices are integrated.
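 For reference, the distance information provided by the sensors mentioned above can be illustrated with the standard time-of-flight and stereo-triangulation relations. The short Python sketch below is only an illustration and is not part of the embodiment; the constant, the function names, the focal length, and the baseline are assumptions introduced here.

C = 299_792_458.0  # speed of light in a vacuum [m/s]

def tof_distance(round_trip_time_s):
    # Direct time-of-flight: the emitted light travels to the subject and
    # back, so the one-way distance is c * t / 2.
    return C * round_trip_time_s / 2.0

def stereo_distance(focal_px, baseline_m, disparity_px):
    # Stereo triangulation: Z = f * B / d, with the focal length f in
    # pixels, the baseline B in metres, and the disparity d in pixels.
    if disparity_px <= 0.0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px

print(tof_distance(2.0e-7))                # 200 ns round trip -> about 30 m
print(stereo_distance(1000.0, 1.0, 20.0))  # f = 1000 px, B = 1 m, d = 20 px -> 50 m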
 Here, FIG. 28 shows an example of the installation positions of the imaging unit 7410 and the vehicle exterior information detection section 7420. Imaging units 7910, 7912, 7914, 7916, and 7918 are provided at, for example, at least one of the positions of the front nose, the side mirrors, the rear bumper, the back door, and the upper part of the windshield in the vehicle interior of a vehicle 7900. The imaging unit 7910 provided on the front nose and the imaging unit 7918 provided on the upper part of the windshield in the vehicle interior mainly acquire images in front of the vehicle 7900. The imaging units 7912 and 7914 provided on the side mirrors mainly acquire images of the sides of the vehicle 7900. The imaging unit 7916 provided on the rear bumper or the back door mainly acquires images behind the vehicle 7900. The imaging unit 7918 provided on the upper part of the windshield in the vehicle interior is mainly used to detect a preceding vehicle, a pedestrian, an obstacle, a traffic light, a traffic sign, a lane, and the like.
 Note that FIG. 28 shows an example of the imaging ranges of the imaging units 7910, 7912, 7914, and 7916. The imaging range a indicates the imaging range of the imaging unit 7910 provided on the front nose, the imaging ranges b and c indicate the imaging ranges of the imaging units 7912 and 7914 provided on the side mirrors, respectively, and the imaging range d indicates the imaging range of the imaging unit 7916 provided on the rear bumper or the back door. For example, by superimposing the image data captured by the imaging units 7910, 7912, 7914, and 7916, an overhead image of the vehicle 7900 as viewed from above is obtained.
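 One common way to obtain such an overhead image is to warp each camera image onto the ground plane with a pre-calibrated homography and blend the overlapping regions. The following Python sketch shows only that general idea; the homographies and the output size are assumed to come from an offline extrinsic calibration and are not specified by the present disclosure.

import cv2
import numpy as np

def birds_eye_view(images, homographies, out_size=(800, 800)):
    # Warp each camera image onto a common ground-plane coordinate system
    # and average the overlapping regions into a single overhead image.
    acc = np.zeros((out_size[1], out_size[0], 3), np.float32)
    weight = np.zeros((out_size[1], out_size[0], 1), np.float32)
    for img, H in zip(images, homographies):
        warped = cv2.warpPerspective(img, H, out_size).astype(np.float32)
        mask = (warped.sum(axis=2, keepdims=True) > 0).astype(np.float32)
        acc += warped
        weight += mask
    return (acc / np.maximum(weight, 1.0)).astype(np.uint8)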
 The vehicle exterior information detection sections 7920, 7922, 7924, 7926, 7928, and 7930 provided on the front, rear, sides, and corners of the vehicle 7900 and on the upper part of the windshield in the vehicle interior may be, for example, ultrasonic sensors or radar devices. The vehicle exterior information detection sections 7920, 7926, and 7930 provided on the front nose, the rear bumper, the back door, and the upper part of the windshield in the vehicle interior of the vehicle 7900 may be, for example, LIDAR devices. These vehicle exterior information detection sections 7920 to 7930 are mainly used to detect a preceding vehicle, a pedestrian, an obstacle, and the like.
 Returning to FIG. 27, the description will be continued. The vehicle exterior information detection unit 7400 causes the imaging unit 7410 to capture an image outside the vehicle and receives the captured image data. The vehicle exterior information detection unit 7400 also receives detection information from the connected vehicle exterior information detection section 7420. When the vehicle exterior information detection section 7420 is an ultrasonic sensor, a radar device, or a LIDAR device, the vehicle exterior information detection unit 7400 transmits ultrasonic waves, electromagnetic waves, or the like, and receives information on the received reflected waves. Based on the received information, the vehicle exterior information detection unit 7400 may perform object detection processing or distance detection processing for a person, a car, an obstacle, a sign, characters on a road surface, or the like. Based on the received information, the vehicle exterior information detection unit 7400 may perform environment recognition processing for recognizing rainfall, fog, road surface conditions, or the like. The vehicle exterior information detection unit 7400 may also calculate the distance to an object outside the vehicle based on the received information.
 Further, based on the received image data, the vehicle exterior information detection unit 7400 may perform image recognition processing or distance detection processing for recognizing a person, a car, an obstacle, a sign, characters on a road surface, or the like. The vehicle exterior information detection unit 7400 may perform processing such as distortion correction or alignment on the received image data, and may combine image data captured by different imaging units 7410 to generate an overhead image or a panoramic image. The vehicle exterior information detection unit 7400 may also perform viewpoint conversion processing using image data captured by different imaging units 7410.
 The vehicle interior information detection unit 7500 detects information inside the vehicle. For example, a driver state detection unit 7510 that detects the state of the driver is connected to the vehicle interior information detection unit 7500. The driver state detection unit 7510 may include a camera that images the driver, a biometric sensor that detects biometric information of the driver, a microphone that collects sound in the vehicle interior, and the like. The biometric sensor is provided, for example, on a seat surface, the steering wheel, or the like, and detects biometric information of an occupant sitting on a seat or of the driver holding the steering wheel. Based on the detection information input from the driver state detection unit 7510, the vehicle interior information detection unit 7500 may calculate the degree of fatigue or the degree of concentration of the driver, or may determine whether the driver is dozing off. The vehicle interior information detection unit 7500 may also perform processing such as noise canceling on the collected audio signal.
 The integrated control unit 7600 controls the overall operation within the vehicle control system 7000 in accordance with various programs. An input unit 7800 is connected to the integrated control unit 7600. The input unit 7800 is realized by a device that can be operated by an occupant for input, such as a touch panel, buttons, a microphone, switches, or levers. Data obtained by performing voice recognition on speech input through the microphone may be input to the integrated control unit 7600. The input unit 7800 may be, for example, a remote control device using infrared rays or other radio waves, or an externally connected device such as a mobile phone or a PDA (Personal Digital Assistant) compatible with the operation of the vehicle control system 7000. The input unit 7800 may also be, for example, a camera, in which case the occupant can input information by gesture. Alternatively, data obtained by detecting the movement of a wearable device worn by the occupant may be input. Furthermore, the input unit 7800 may include, for example, an input control circuit that generates an input signal based on the information input by the occupant or the like using the above-described input unit 7800 and outputs the input signal to the integrated control unit 7600. By operating the input unit 7800, the occupant or the like inputs various data to the vehicle control system 7000 and instructs it to perform processing operations.
 The storage unit 7690 may include a ROM (Read Only Memory) that stores various programs executed by the microcomputer, and a RAM (Random Access Memory) that stores various parameters, calculation results, sensor values, and the like. The storage unit 7690 may also be realized by a magnetic storage device such as an HDD (Hard Disc Drive), a semiconductor storage device, an optical storage device, a magneto-optical storage device, or the like.
 The general-purpose communication I/F 7620 is a general-purpose communication I/F that mediates communication with various devices existing in an external environment 7750. The general-purpose communication I/F 7620 may implement a cellular communication protocol such as GSM (Global System of Mobile communications), WiMAX, LTE (Long Term Evolution), or LTE-A (LTE-Advanced), or another wireless communication protocol such as a wireless LAN (also referred to as Wi-Fi (registered trademark)) or Bluetooth (registered trademark). The general-purpose communication I/F 7620 may connect to a device (for example, an application server or a control server) existing on an external network (for example, the Internet, a cloud network, or an operator-specific network) via, for example, a base station or an access point. The general-purpose communication I/F 7620 may also connect to a terminal existing in the vicinity of the vehicle (for example, a terminal of the driver, a pedestrian, or a store, or an MTC (Machine Type Communication) terminal) using, for example, P2P (Peer To Peer) technology.
 The dedicated communication I/F 7630 is a communication I/F that supports a communication protocol formulated for use in vehicles. The dedicated communication I/F 7630 may implement a standard protocol such as WAVE (Wireless Access in Vehicle Environment), which is a combination of the lower-layer IEEE 802.11p and the upper-layer IEEE 1609, DSRC (Dedicated Short Range Communications), or a cellular communication protocol. The dedicated communication I/F 7630 typically performs V2X communication, a concept that includes one or more of vehicle-to-vehicle communication, vehicle-to-infrastructure communication, vehicle-to-home communication, and vehicle-to-pedestrian communication.
 The positioning unit 7640 receives, for example, a GNSS signal from a GNSS (Global Navigation Satellite System) satellite (for example, a GPS signal from a GPS (Global Positioning System) satellite), performs positioning, and generates position information including the latitude, longitude, and altitude of the vehicle. Note that the positioning unit 7640 may specify the current position by exchanging signals with a wireless access point, or may acquire the position information from a terminal having a positioning function, such as a mobile phone, a PHS, or a smartphone.
 The beacon receiving unit 7650 receives, for example, radio waves or electromagnetic waves transmitted from a radio station or the like installed on the road, and acquires information such as the current position, traffic congestion, road closures, or required travel time. Note that the function of the beacon receiving unit 7650 may be included in the dedicated communication I/F 7630 described above.
 The in-vehicle device I/F 7660 is a communication interface that mediates the connection between the microcomputer 7610 and various in-vehicle devices 7760 existing in the vehicle. The in-vehicle device I/F 7660 may establish a wireless connection using a wireless communication protocol such as a wireless LAN, Bluetooth (registered trademark), NFC (Near Field Communication), or WUSB (Wireless USB). The in-vehicle device I/F 7660 may also establish a wired connection such as USB (Universal Serial Bus), HDMI (High-Definition Multimedia Interface), or MHL (Mobile High-definition Link) via a connection terminal (and, if necessary, a cable) not shown. The in-vehicle devices 7760 may include, for example, at least one of a mobile device or wearable device possessed by an occupant, and an information device carried into or attached to the vehicle. The in-vehicle devices 7760 may also include a navigation device that searches for a route to an arbitrary destination. The in-vehicle device I/F 7660 exchanges control signals or data signals with these in-vehicle devices 7760.
 The in-vehicle network I/F 7680 is an interface that mediates communication between the microcomputer 7610 and the communication network 7010. The in-vehicle network I/F 7680 transmits and receives signals and the like in accordance with a predetermined protocol supported by the communication network 7010.
 The microcomputer 7610 of the integrated control unit 7600 controls the vehicle control system 7000 in accordance with various programs based on information acquired via at least one of the general-purpose communication I/F 7620, the dedicated communication I/F 7630, the positioning unit 7640, the beacon receiving unit 7650, the in-vehicle device I/F 7660, and the in-vehicle network I/F 7680. For example, the microcomputer 7610 may calculate a control target value of the driving force generation device, the steering mechanism, or the braking device based on the acquired information inside and outside the vehicle, and output a control command to the drive system control unit 7100. For example, the microcomputer 7610 may perform cooperative control for the purpose of realizing ADAS (Advanced Driver Assistance System) functions, including collision avoidance or impact mitigation of the vehicle, following traveling based on the inter-vehicle distance, vehicle-speed-maintaining traveling, vehicle collision warning, and vehicle lane departure warning. The microcomputer 7610 may also perform cooperative control for the purpose of automated driving or the like, in which the vehicle travels autonomously without depending on the driver's operation, by controlling the driving force generation device, the steering mechanism, the braking device, or the like based on the acquired information on the surroundings of the vehicle.
 The microcomputer 7610 may generate three-dimensional distance information between the vehicle and objects such as surrounding structures and persons based on information acquired via at least one of the general-purpose communication I/F 7620, the dedicated communication I/F 7630, the positioning unit 7640, the beacon receiving unit 7650, the in-vehicle device I/F 7660, and the in-vehicle network I/F 7680, and may create local map information including peripheral information on the current position of the vehicle. The microcomputer 7610 may also predict dangers such as a collision of the vehicle, the approach of a pedestrian or the like, or entry onto a closed road based on the acquired information, and generate a warning signal. The warning signal may be, for example, a signal for generating a warning sound or lighting a warning lamp.
 The audio image output unit 7670 transmits an output signal of at least one of audio and image to an output device capable of visually or audibly notifying an occupant of the vehicle or the outside of the vehicle of information. In the example of FIG. 27, an audio speaker 7710, a display unit 7720, and an instrument panel 7730 are illustrated as the output devices. The display unit 7720 may include, for example, at least one of an on-board display and a head-up display. The display unit 7720 may have an AR (Augmented Reality) display function. The output device may also be a device other than these, such as headphones, a wearable device such as a glasses-type display worn by an occupant, a projector, or a lamp. When the output device is a display device, the display device visually displays the results obtained by the various processes performed by the microcomputer 7610, or the information received from other control units, in various formats such as text, images, tables, and graphs. When the output device is an audio output device, the audio output device converts an audio signal composed of reproduced audio data, acoustic data, or the like into an analog signal and outputs it audibly.
 Note that, in the example shown in FIG. 27, at least two control units connected via the communication network 7010 may be integrated into one control unit. Alternatively, an individual control unit may be composed of a plurality of control units. Furthermore, the vehicle control system 7000 may include another control unit not shown. In the above description, some or all of the functions performed by any of the control units may be given to another control unit. That is, as long as information is transmitted and received via the communication network 7010, predetermined arithmetic processing may be performed by any of the control units. Similarly, a sensor or device connected to one of the control units may be connected to another control unit, and a plurality of control units may mutually transmit and receive detection information via the communication network 7010.
 A computer program for realizing the functions of the detection devices 100 to 800 according to the present embodiment described with reference to FIGS. 1, 6, 9, 12, 15, 18, 20, and 22 can be implemented in any of the control units or the like. A computer-readable recording medium storing such a computer program can also be provided. The recording medium is, for example, a magnetic disk, an optical disk, a magneto-optical disk, a flash memory, or the like. The above computer program may also be distributed via, for example, a network without using a recording medium.
 In the vehicle control system 7000 described above, the detection devices 100 to 800 according to the present embodiment described with reference to FIGS. 1, 6, 9, 12, 15, 18, 20, and 22 can be applied to the integrated control unit 7600 of the application example shown in FIG. 27.
 At least some of the components of the detection devices 100 to 800 according to the present embodiment described with reference to FIGS. 1, 6, 9, 12, 15, 18, 20, and 22 may be realized in a module for the integrated control unit 7600 shown in FIG. 27 (for example, an integrated circuit module composed of one die). Alternatively, the detection devices 100 to 800 according to the present embodiment described with reference to FIGS. 1, 6, 9, 12, 15, 18, 20, and 22 may be realized by a plurality of control units of the vehicle control system 7000 shown in FIG. 27.
 The effects described in the present specification are merely examples and are not limiting, and other effects may be obtained.
 In addition, the present technology can also take the following configurations.
(1)
 A detection device comprising:
 an acquisition unit that acquires distance information regarding a distance to a subject;
 a setting unit that sets, from the distance information and a feature amount of an object to be detected, a region where the object may be captured; and
 a determination unit that determines whether or not an image in the region is the object.
(2)
 The detection device according to (1), wherein
 the setting unit uses, as the feature amount of the object, the size of the object at a predetermined distance, and sets a frame corresponding to the size of the object according to the distance at a pixel set as a processing target, and
 the determination unit determines whether or not an image in the frame is the object.
(3)
 The detection device according to (1) or (2), further comprising a direction detection unit that detects, from the distance information, a direction in which the object is facing.
(4)
 The detection device according to (3), further comprising:
 an imaging unit that captures an image using ambient light; and
 a recognition unit that performs detailed recognition of the object using the image captured by the imaging unit and at least one of the size of the object set by the setting unit, the image in the region determined to be the object by the determination unit, and the direction of the object detected by the direction detection unit.
(5)
 A detection device comprising:
 an acquisition unit that acquires distance information regarding a distance to a subject;
 a setting unit that sets, using the distance information, a region where a predetermined object may be captured; and
 an estimation unit that estimates a category to which the object belongs from the size of the region and the distance information.
(6)
 The detection device according to (5), wherein
 the setting unit sets, as the region where the object may be captured, the area extending up to a portion where the distance information changes, and
 the estimation unit estimates that the category to which an object that would have the size of the region at the distance represented by the distance information in the region belongs is the category to which the object belongs.
(7)
 The detection device according to (5) or (6), further comprising a shape estimation unit that estimates, using the distance information, the shape of the object in the region set by the setting unit.
(8)
 The detection device according to (7), wherein the estimation unit estimates the category using at least one of the distance information, the size of the region, and the shape.
(9)
 The detection device according to (7), further comprising:
 an imaging unit that captures an image using ambient light; and
 a recognition unit that performs detailed recognition of the object using the image captured by the imaging unit and at least one of the size of the region set by the setting unit, the category estimated by the estimation unit, and the shape estimated by the shape estimation unit.
(10)
 The detection device according to any one of (1) to (9), wherein the acquisition unit acquires the distance information using a TOF sensor, a stereo camera, an ultrasonic sensor, or a millimeter wave radar.
(11)
 A detection method comprising the steps of:
 acquiring distance information regarding a distance to a subject;
 setting, from the distance information and a feature amount of an object to be detected, a region where the object may be captured; and
 determining whether or not an image in the region is the object.
(12)
 A detection method comprising the steps of:
 acquiring distance information regarding a distance to a subject;
 setting, using the distance information, a region where a predetermined object may be captured; and
 estimating a category to which the object belongs from the size of the region and the distance information.
(13)
 A program for causing a computer to execute processing comprising the steps of:
 acquiring distance information regarding a distance to a subject;
 setting, from the distance information and a feature amount of an object to be detected, a region where the object may be captured; and
 determining whether or not an image in the region is the object.
(14)
 A program for causing a computer to execute processing comprising the steps of:
 acquiring distance information regarding a distance to a subject;
 setting, using the distance information, a region where a predetermined object may be captured; and
 estimating a category to which the object belongs from the size of the region and the distance information.
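 As an illustration of configurations (1), (2), (5), and (6) above, the following Python sketch sets a candidate frame from per-pixel distance and a real-size database, judges the frame contents, and estimates a category from the region size and its distance. The real-size table, the pinhole focal length, and the classifier are assumptions introduced only for this sketch and are not taken from the embodiments.

import numpy as np

REAL_SIZE_DB = {"person": (0.5, 1.7)}                         # assumed physical size (w, h) in metres
CATEGORY_SIZE_DB = {"person": (0.5, 1.7), "car": (1.8, 1.5)}  # assumed category sizes in metres
FOCAL_LENGTH_PX = 1000.0                                      # assumed pinhole focal length in pixels

def frame_for_pixel(depth_map, x, y, target="person"):
    # Configuration (2): set a frame whose pixel size corresponds to the
    # size the target object would have at the distance of pixel (x, y).
    depth_map = np.asarray(depth_map)
    z = float(depth_map[y, x])                                # distance in metres
    w_m, h_m = REAL_SIZE_DB[target]
    w_px = int(FOCAL_LENGTH_PX * w_m / z)                     # pinhole projection: pixels = f * size / Z
    h_px = int(FOCAL_LENGTH_PX * h_m / z)
    return (x - w_px // 2, y - h_px // 2, w_px, h_px)

def is_target(image, frame, classifier):
    # Configuration (1): judge whether the image inside the frame is the
    # object; `classifier` stands in for any feature-based judgment.
    x, y, w, h = frame
    patch = np.asarray(image)[max(y, 0):y + h, max(x, 0):x + w]
    return classifier(patch)

def estimate_category(region_w_px, region_h_px, z):
    # Configurations (5) and (6): convert the region size back to a
    # physical size using its distance and pick the closest known category.
    w_m = region_w_px * z / FOCAL_LENGTH_PX
    h_m = region_h_px * z / FOCAL_LENGTH_PX
    return min(CATEGORY_SIZE_DB,
               key=lambda c: abs(CATEGORY_SIZE_DB[c][0] - w_m)
                             + abs(CATEGORY_SIZE_DB[c][1] - h_m))

 Under these assumptions, a region of about 40 x 130 pixels observed at 13 m maps back to roughly 0.5 m x 1.7 m and is therefore assigned the "person" category.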
 100 detection device, 111 distance information acquisition unit, 112 subject feature extraction unit, 113 subject candidate area detection unit, 114 actual size database, 200 detection device, 211 imaging unit, 212 subject detail recognition unit, 300 detection device, 311 subject direction detection unit, 400 detection device, 411 imaging unit, 412 subject detail recognition unit, 500 detection device, 511 subject size estimation unit, 512 subject category estimation unit, 600 detection device, 611 imaging unit, 612 subject detail recognition unit, 700 detection device, 711 subject shape estimation unit, 712 subject category estimation unit, 800 detection device, 811 imaging unit, 812 subject detail recognition unit

Claims (14)

  1.  A detection device comprising:
      an acquisition unit that acquires distance information regarding a distance to a subject;
      a setting unit that sets, from the distance information and a feature amount of an object to be detected, a region where the object may be captured; and
      a determination unit that determines whether or not an image in the region is the object.
  2.  The detection device according to claim 1, wherein
      the setting unit uses, as the feature amount of the object, the size of the object at a predetermined distance, and sets a frame corresponding to the size of the object according to the distance at a pixel set as a processing target, and
      the determination unit determines whether or not an image in the frame is the object.
  3.  The detection device according to claim 2, further comprising a direction detection unit that detects, from the distance information, a direction in which the object is facing.
  4.  The detection device according to claim 3, further comprising:
      an imaging unit that captures an image using ambient light; and
      a recognition unit that performs detailed recognition of the object using the image captured by the imaging unit and at least one of the size of the object set by the setting unit, the image in the region determined to be the object by the determination unit, and the direction of the object detected by the direction detection unit.
  5.  A detection device comprising:
      an acquisition unit that acquires distance information regarding a distance to a subject;
      a setting unit that sets, using the distance information, a region where a predetermined object may be captured; and
      an estimation unit that estimates a category to which the object belongs from the size of the region and the distance information.
  6.  The detection device according to claim 5, wherein
      the setting unit sets, as the region where the object may be captured, the area extending up to a portion where the distance information changes, and
      the estimation unit estimates that the category to which an object that would have the size of the region at the distance represented by the distance information in the region belongs is the category to which the object belongs.
  7.  The detection device according to claim 5, further comprising a shape estimation unit that estimates, using the distance information, the shape of the object in the region set by the setting unit.
  8.  The detection device according to claim 7, wherein the estimation unit estimates the category using at least one of the distance information, the size of the region, and the shape.
  9.  The detection device according to claim 7, further comprising:
      an imaging unit that captures an image using ambient light; and
      a recognition unit that performs detailed recognition of the object using the image captured by the imaging unit and at least one of the size of the region set by the setting unit, the category estimated by the estimation unit, and the shape estimated by the shape estimation unit.
  10.  The detection device according to claim 1, wherein the acquisition unit acquires the distance information using a TOF sensor, a stereo camera, an ultrasonic sensor, or a millimeter wave radar.
  11.  A detection method comprising the steps of:
      acquiring distance information regarding a distance to a subject;
      setting, from the distance information and a feature amount of an object to be detected, a region where the object may be captured; and
      determining whether or not an image in the region is the object.
  12.  A detection method comprising the steps of:
      acquiring distance information regarding a distance to a subject;
      setting, using the distance information, a region where a predetermined object may be captured; and
      estimating a category to which the object belongs from the size of the region and the distance information.
  13.  A program for causing a computer to execute processing comprising the steps of:
      acquiring distance information regarding a distance to a subject;
      setting, from the distance information and a feature amount of an object to be detected, a region where the object may be captured; and
      determining whether or not an image in the region is the object.
  14.  A program for causing a computer to execute processing comprising the steps of:
      acquiring distance information regarding a distance to a subject;
      setting, using the distance information, a region where a predetermined object may be captured; and
      estimating a category to which the object belongs from the size of the region and the distance information.
PCT/JP2017/015212 2016-04-28 2017-04-14 Detection device, detection method, and program WO2017188017A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2016-091357 2016-04-28
JP2016091357A JP2017199278A (en) 2016-04-28 2016-04-28 Detection device, detection method, and program

Publications (1)

Publication Number Publication Date
WO2017188017A1

Family ID: 60161592

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2017/015212 WO2017188017A1 (en) 2016-04-28 2017-04-14 Detection device, detection method, and program

Country Status (2)

Country Link
JP (1) JP2017199278A (en)
WO (1) WO2017188017A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112470231A (en) * 2018-07-26 2021-03-09 Sony Corporation Information processing apparatus, information processing method, and program

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2019202670A1 (en) * 2018-04-17 2019-10-24 Socionext Inc. Gesture recognition method and gesture recognition device
JP7046786B2 (en) * 2018-12-11 2022-04-04 Hitachi, Ltd. Machine learning systems, domain converters, and machine learning methods
JP7489225B2 (en) 2020-04-20 2024-05-23 Metawater Co., Ltd. Image processing system, information processing apparatus, program, and image processing method
CN112633218B (en) * 2020-12-30 2023-10-13 Shenzhen Ubtech Technology Co., Ltd. Face detection method, face detection device, terminal equipment and computer readable storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2006145352A (en) * 2004-11-18 2006-06-08 Matsushita Electric Works Ltd Image processor
JP2010020404A (en) * 2008-07-08 2010-01-28 Toshiba Corp Image processor and method thereof
JP2010165183A (en) * 2009-01-15 2010-07-29 Panasonic Electric Works Co Ltd Human body detection device
JP2012243050A (en) * 2011-05-19 2012-12-10 Fuji Heavy Ind Ltd Environment recognition device and environment recognition method
JP2014106732A (en) * 2012-11-27 2014-06-09 Sony Computer Entertainment Inc Information processor and information processing method


Also Published As

Publication number Publication date
JP2017199278A (en) 2017-11-02


Legal Events

Date Code Title Description
NENP Non-entry into the national phase

Ref country code: DE

121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 17789303

Country of ref document: EP

Kind code of ref document: A1

122 Ep: pct application non-entry in european phase

Ref document number: 17789303

Country of ref document: EP

Kind code of ref document: A1