CN112232272B - Pedestrian recognition method by fusing laser and visual image sensor - Google Patents

Pedestrian recognition method by fusing laser and visual image sensor

Info

Publication number
CN112232272B
CN112232272B (application CN202011203273.8A)
Authority
CN
China
Prior art keywords
image
laser
leg
human
human leg
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011203273.8A
Other languages
Chinese (zh)
Other versions
CN112232272A (en)
Inventor
李承政
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Yogo Robot Co Ltd
Original Assignee
Shanghai Yogo Robot Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Yogo Robot Co Ltd filed Critical Shanghai Yogo Robot Co Ltd
Priority to CN202011203273.8A priority Critical patent/CN112232272B/en
Publication of CN112232272A publication Critical patent/CN112232272A/en
Application granted granted Critical
Publication of CN112232272B publication Critical patent/CN112232272B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/20Movements or behaviour, e.g. gesture recognition
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/23Clustering techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/25Fusion techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/25Determination of region of interest [ROI] or a volume of interest [VOI]
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Multimedia (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Psychiatry (AREA)
  • Social Psychology (AREA)
  • Human Computer Interaction (AREA)
  • Traffic Control Systems (AREA)
  • Optical Radar Systems And Details Thereof (AREA)

Abstract

The application provides a pedestrian recognition method by fusing laser and a visual image sensor, which comprises the following steps: acquiring laser image point cloud data and information of captured visual images in the same preset reference direction; analyzing a first position of a target feature in the laser image point cloud data and correcting the position by using a tracker, wherein the position comprises a first distance and a first direction angle in the laser image; analyzing a second position of the target feature from information of the visual image, wherein the position comprises a second distance and a second direction angle in the visual image; calculating a first deviation value between the first distance and the second distance and/or a second deviation value between the first direction angle and the second direction angle; and judging whether the target feature is a false detection according to the first deviation value and the second deviation value.

Description

Pedestrian recognition method by fusing laser and visual image sensor
Technical Field
The application relates to the field of intelligent robots, in particular to a pedestrian recognition method by fusing laser and a visual image sensor.
Background
Laser tracking is a positioning method that can capture objects in the dark and track specific targets; it generally requires a large amount of model input to make the model that extracts a specific target sufficiently accurate. Visual image recognition, however, is demanding about lighting: the image captured by the visual model requires adequate light.
A typical laser human-leg recognition method either recognizes leg positions from laser data using conventional machine-learning algorithms, or converts the laser data into a grayscale image and then detects the leg positions directly with a deep-learning algorithm.
Disclosure of Invention
One of the purposes of the present application is to fuse laser data with a visual image sensor to identify pedestrians.
In order to achieve the above object, the present application provides a pedestrian recognition method by fusing laser and a visual image sensor, comprising:
acquiring laser image point cloud data and information of captured visual images in the same preset reference direction;
analyzing a first position of a target feature in the laser image point cloud data and correcting the position by using a tracker, wherein the position comprises a first distance and a first direction angle in the laser image;
analyzing a second position of the target feature from information of the visual image, wherein the position comprises a second distance and a second direction angle in the visual image;
calculating a first deviation value between the first distance and the second distance and/or a second deviation value between the first direction angle and the second direction angle;
and judging whether the target feature is false detection or not according to the first deviation value and the second deviation value.
Further, the specific step of judging the false detection includes:
presetting a first threshold and a second threshold, wherein the first threshold is related to the first distance and the second distance, and the second threshold is related to the first direction angle and the second direction angle;
if the first deviation value falls within the first threshold range and the second deviation value falls within the second threshold range, the target feature detection is judged to be correct;
and if the first deviation value does not fall within the first threshold range and/or the second deviation value does not fall within the second threshold range, the target feature is judged to be a false detection.
Further, the method further comprises the specific step of determining the number of target features contained in the laser image point cloud data:
carrying out gradient analysis on an image area where each target feature in the foreground image is located;
obtaining gradient information of each pixel in an image area where each target feature in the foreground image is located;
determining an image area of a target feature, wherein gradient information of pixels in the area meets a preset gradient requirement, as an interest area;
labeling and numbering all the interest areas, clustering all the labeled interest areas, and summarizing and determining a target characteristic area from all the labeled interest areas according to a clustering result;
and determining the number of the image areas as the number of target features contained in the image to be identified.
Further, clustering the interest areas according to the colors carried by the interest areas to obtain a plurality of area clusters, wherein each area cluster comprises at least one interest area;
and determining each region of interest in the region cluster meeting the requirements of the human leg colors as a target characteristic region, wherein the colors carried by each region of interest in the region cluster meeting the requirements of the human leg colors are all in a preset human leg color range.
Further, using the tracker to track target features that appear continuously across multiple frames of laser point cloud data;
the target features that appear continuously are regarded as positive examples, which are extracted and saved.
Further, analyzing the pedestrian detection result detected from the visual image and the result tracked by the tracker to obtain a direction-angle deviation;
checking whether a target feature exists within the second threshold range and, if so, marking that target feature as assigned;
if the direction angles of all target features fall outside the angle interval, no leg is allocated to the pedestrian.
Further, the specific steps of the tracker training include:
inputting the human-leg recognition result predicted by the laser human-leg recognition algorithm into the existing tracker, which outputs a group of human-leg tracking results;
comparing the position deviation between each human-leg recognition result and its tracking result; if the deviation is small for a group of recognition results, the recognition result is taken directly as output; if the deviation for a group of legs is too large, the recognition result of that frame is treated as a false detection and the tracking result is taken as the final output;
after the leg results are output, the tracker is updated with the corresponding results.
Further, stitching the images to be synthesized in the same matching image group to obtain the image to be identified includes:
and splicing the plurality of successfully matched images to be synthesized, and performing image fusion processing on the spliced area to obtain the images to be identified.
Further, after the number of human legs included in the image to be identified is determined according to the number of image areas where the human legs are located, the method further includes:
and displaying the leg number information according to the leg number contained in the image to be identified.
Further, the method is carried out by: a first obtaining unit, a second obtaining unit, a third obtaining unit, a fourth obtaining unit and a first determining unit, wherein the first obtaining unit is used for obtaining an image to be identified;
the second obtaining unit is used for inputting the image to be identified into a preset human leg identification model to obtain an identification result output by the preset human leg identification model;
the third obtaining unit is used for obtaining a foreground image from the image to be identified when the identification result is that the image to be identified contains a human leg image;
the fourth obtaining unit is configured to perform leg detection on the foreground image, and obtain an image area where each leg in the foreground image is located;
the first determining unit is configured to determine, according to the number of image areas where the legs of the person are located, the number of the legs of the person included in the image to be identified.
Compared with the prior art, the application has the following technical effects:
in the application, whether the number of box marks in the right-hand image is correct is judged mainly by checking whether the distance and angle of the human legs in the visual image match the leg positions identified by the laser recognition technique; legs missed by the laser recognition technique are identified and completed by adding a new leg target to the laser recognition diagram according to the distance and angle of the leg in the visual image; at most one leg is bound to each pedestrian in the laser recognition diagram, and all unbound legs are discarded.
Drawings
In order to more clearly illustrate the embodiments of the present disclosure or the prior art, the drawings required in the detailed description are briefly introduced below. It is apparent that the drawings in the following description show only some embodiments of the present disclosure, and that a person of ordinary skill in the art may derive other drawings from them without inventive effort.
FIG. 1 is a flowchart of a pedestrian recognition method by fusing laser with a visual image sensor according to an embodiment of the present application;
FIG. 2 is a flowchart of a pedestrian recognition method by fusing laser with a visual image sensor according to an embodiment of the present application;
FIG. 3 is a schematic diagram of a pedestrian recognition method by fusing laser and a visual image sensor according to an embodiment of the present application;
fig. 4 is a schematic diagram of a pedestrian recognition method by fusing laser and a visual image sensor according to an embodiment of the present application;
fig. 5 is a schematic diagram of a pedestrian recognition method by fusing laser and a visual image sensor according to an embodiment of the present application;
FIG. 6 is a schematic diagram of a pedestrian recognition method by fusing laser with a visual image sensor according to an embodiment of the present application;
fig. 7 is a flowchart of a pedestrian recognition method by fusing laser and a visual image sensor according to an embodiment of the present application.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the present application more apparent, the embodiments of the present application are described in further detail below with reference to the accompanying drawings. Exemplary embodiments are described in detail herein, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, the same numbers in different drawings refer to the same or similar elements unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all implementations consistent with the application; rather, they are merely examples of apparatus and methods consistent with aspects of the application as set forth in the following claims.
In some embodiments, the present application provides a pedestrian recognition method by fusing laser with a visual image sensor, specifically, as shown in fig. 1, comprising:
101. acquiring laser image point cloud data and information of captured visual images in the same preset reference direction;
201. analyzing a first position of a target feature in the laser image point cloud data and correcting the position by using a tracker, wherein the position comprises a first distance and a first direction angle in the laser image;
301. analyzing a second position of the target feature from information of the visual image, wherein the position comprises a second distance and a second direction angle in the visual image;
401. calculating a first deviation value between the first distance and the second distance and/or a second deviation value between the first direction angle and the second direction angle;
501. and judging whether the target feature is false detection or not according to the first deviation value and the second deviation value.
In some embodiments, as shown in fig. 2, the specific step of determining the false detection includes:
511. presetting a first threshold and a second threshold, wherein the first threshold is related to the first distance and the second distance, and the second threshold is related to the first direction angle and the second direction angle;
512. if the first deviation value falls within the first threshold range and the second deviation value falls within the second threshold range, the target feature detection is judged to be correct;
513. if the first deviation value does not fall within the first threshold range and/or the second deviation value does not fall within the second threshold range, the target feature is judged to be a false detection.
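The threshold check in steps 511–513 can be sketched as a small function (a minimal illustration; the function name, argument names and threshold semantics are assumptions, not taken from the patent):

```python
def is_false_detection(d_laser, d_visual, theta_laser, theta_visual,
                       dist_threshold, angle_threshold):
    """Return True when the laser-detected target should be judged a
    false detection: the distance deviation and/or the direction-angle
    deviation falls outside its preset threshold range."""
    first_deviation = abs(d_laser - d_visual)           # first vs. second distance
    second_deviation = abs(theta_laser - theta_visual)  # first vs. second direction angle
    return first_deviation > dist_threshold or second_deviation > angle_threshold
```

A detection is kept only when both deviation values fall inside their threshold ranges, matching step 512.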
In the ideal case, the angles and distances in the visual recognition image and the laser recognition image are identical. Combining the laser with the visual recognition image to identify human legs therefore locates the legs in the laser data very accurately; in other words, the visual image recognition corrects the errors of the laser recognition technique.
In some embodiments, the laser recognition technique identifies the number of human legs in the laser image point cloud data; the specific steps of determining the number of target features contained in the point cloud data are:
carrying out gradient analysis on an image area where each target feature in the foreground image is located;
obtaining gradient information of each pixel in an image area where each target feature in the foreground image is located;
determining an image area of a target feature, wherein gradient information of pixels in the area meets a preset gradient requirement, as an interest area;
labeling and numbering all the interest areas, clustering all the labeled interest areas, and summarizing and determining a target characteristic area from all the labeled interest areas according to a clustering result;
and determining the number of the image areas as the number of target features contained in the image to be identified.
In some embodiments, clustering the regions of interest according to the colors carried by the regions of interest to obtain a plurality of region clusters, wherein each region cluster comprises at least one region of interest;
and determining each region of interest in the region cluster meeting the requirements of the human leg colors as a target characteristic region, wherein the colors carried by each region of interest in the region cluster meeting the requirements of the human leg colors are all in a preset human leg color range.
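The gradient filtering and color clustering described above might be sketched as follows, assuming pre-segmented candidate regions with a scalar mean gradient and a representative color value (both are simplifications of the per-pixel analysis in the patent):

```python
from collections import defaultdict

def count_leg_regions(regions, grad_min, leg_color_range):
    """regions: list of dicts, each a candidate image area with a mean
    'gradient' and a representative 'color'. Regions whose gradient meets
    the preset requirement become regions of interest; ROIs are clustered
    by color, and only clusters whose color lies in the preset human-leg
    color range contribute to the target-feature count."""
    rois = [r for r in regions if r["gradient"] >= grad_min]
    clusters = defaultdict(list)
    for roi in rois:
        clusters[roi["color"]].append(roi)  # one cluster per color value
    lo, hi = leg_color_range
    targets = [roi for color, group in clusters.items()
               if lo <= color <= hi for roi in group]
    return len(targets)
```

Regions failing the gradient requirement never become regions of interest, and only ROIs whose cluster colors lie in the preset human-leg color range are counted as target feature regions.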
In fig. 3, the left is the visual recognition diagram and the right is the laser recognition diagram; in fig. 4, the left is the laser recognition diagram and the right is the completed diagram of the number of human legs recognized by laser image recognition. The application judges whether the count inside the box marks in the right-hand image is correct, mainly by checking whether the distance and angle of the human legs in the visual image match the leg positions identified by the laser recognition technique.
In fig. 5, the left is the visual recognition diagram and the right is the laser recognition diagram; in fig. 6, the left is the laser recognition diagram and the right is the diagram with the missing legs completed. The application detects legs missed by the laser recognition technique and completes them by adding a new leg target to the laser recognition diagram, mainly according to the distance and angle of the leg in the visual image.
In some embodiments, the tracker is used to track target features that appear continuously across multiple frames of laser point cloud data;
the target features that appear continuously are regarded as positive examples, which are extracted and saved.
In some embodiments, the pedestrian detection result from the visual image and the result tracked by the tracker are analyzed to obtain a direction-angle deviation;
whether a target feature exists within the second threshold range is checked and, if so, that target feature is marked as assigned;
if the direction angles of all target features fall outside the angle interval, no leg is allocated to the pedestrian.
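The direction-angle assignment could look like the following sketch (scalar angles and a symmetric threshold interval are assumptions made for illustration):

```python
def assign_leg(pedestrian_angle, leg_angles, angle_threshold):
    """Return the index of the leg whose direction angle deviates least
    from the pedestrian's, provided the deviation lies inside the
    threshold interval; return None when no leg qualifies, i.e. no leg
    is allocated to this pedestrian."""
    best, best_dev = None, None
    for i, angle in enumerate(leg_angles):
        dev = abs(angle - pedestrian_angle)
        if dev <= angle_threshold and (best_dev is None or dev < best_dev):
            best, best_dev = i, dev
    return best
```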
In some embodiments, the specific steps of tracker training include:
the human-leg recognition result predicted by the laser human-leg recognition algorithm is input into the existing tracker, which outputs a group of human-leg tracking results;
the position deviation between each human-leg recognition result and its tracking result is compared; if the deviation is small for a group of recognition results, the recognition result is taken directly as output; if the deviation for a group of legs is too large, the recognition result of that frame is treated as a false detection and the tracking result is taken as the final output;
after the leg results are output, the tracker is updated with the corresponding results.
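A minimal sketch of this recognition/tracking arbitration, treating leg positions as scalars for brevity (names and the deviation metric are illustrative):

```python
def fuse_with_tracker(detections, tracks, max_dev):
    """For each (detection, track) pair: output the detection when the
    position deviation is small; otherwise treat the detection as a
    false detection and output the tracked position instead. The tracker
    state is then updated with whichever result was output."""
    outputs = []
    for det, trk in zip(detections, tracks):
        out = det if abs(det - trk) <= max_dev else trk
        outputs.append(out)
    new_tracks = list(outputs)  # tracker update with the output results
    return outputs, new_tracks
```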
In fig. 7, a method of training a tracker and a method of human leg identification are provided:
in step 601, laser data is input;
in step 602, a laser human leg identification algorithm;
in step 603, a human leg tracking module;
in step 604, a human leg recognition result;
fused human leg results in step 605;
in step 606, the module is fused;
in step 607, pedestrian detection results;
in step 608, an image pedestrian detection algorithm;
in step 609, visual image data is input.
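The data flow of fig. 7 can be expressed as a small orchestration function; the callables stand in for the modules 602 (laser leg recognition), 603 (leg tracking), 608 (image pedestrian detection) and 606 (fusion), and are supplied by the caller — a structural sketch, not the patent's implementation:

```python
def pedestrian_pipeline(laser_data, image_data, recognize, track, detect, fuse):
    """Laser data feeds the leg-recognition algorithm and the tracking
    module to give leg results; visual image data feeds the pedestrian
    detector to give pedestrian results; both branches are combined by
    the fusion module into the fused leg result."""
    legs = track(recognize(laser_data))   # steps 601 -> 602 -> 603 -> 604
    pedestrians = detect(image_data)      # steps 609 -> 608 -> 607
    return fuse(legs, pedestrians)        # steps 606 -> 605
```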
The method mainly updates the recognizer, improving the accuracy with which the laser recognition technique identifies human legs on its own.
In some embodiments, stitching each image to be synthesized in the same matching image group to obtain the image to be identified includes:
and splicing the plurality of successfully matched images to be synthesized, and performing image fusion processing on the spliced area to obtain the images to be identified.
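The weighted fusion over the stitched area might be realized with a linear weight ramp across the seam — a common choice, since the patent only states that the weights are preset — shown here on 1-D pixel strips for brevity:

```python
def blend_seam(left_strip, right_strip):
    """Weighted fusion across a stitching seam: per-pixel weights ramp
    linearly from the left image to the right image so the transition
    between the spliced images looks natural."""
    n = len(left_strip)
    blended = []
    for i, (a, b) in enumerate(zip(left_strip, right_strip)):
        w = i / (n - 1) if n > 1 else 0.5  # 0.0 at left edge, 1.0 at right edge
        blended.append((1 - w) * a + w * b)
    return blended
```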
In some embodiments, after the determining the number of human legs included in the image to be identified according to the number of image areas where the human legs are located, the method further includes:
and displaying the leg number information according to the leg number contained in the image to be identified.
In some embodiments, comprising: a first obtaining unit, a second obtaining unit, a third obtaining unit, a fourth obtaining unit, and a first determining unit,
the first obtaining unit is used for obtaining an image to be identified;
the second obtaining unit is configured to input the image to be identified into a preset human leg identification model to obtain a recognition result output by the preset human leg identification model;
the third obtaining unit is used for obtaining a foreground image from the image to be identified when the identification result is that the image to be identified contains a human leg image;
the fourth obtaining unit is configured to perform leg detection on the foreground image, and obtain an image area where each leg in the foreground image is located;
the first determining unit is configured to determine, according to the number of image areas where the legs of the person are located, the number of the legs of the person included in the image to be identified.
The method first obtains the positions of the human legs on the laser image and then fuses the pedestrian detection results on the matched visible-light image, thereby assisting the laser human-leg recognition. Before fusion, the method further combines the leg positions of the historical frames with those of the current laser frame by means of a tracking algorithm, which effectively completes missed leg detections to a certain extent and removes false detections accordingly. Specifically, the method first obtains the position and direction angle of each human leg on the laser frame and revises the obtained leg positions with the tracker, and then obtains the pedestrian detection results and their direction angles on the visible-light image corresponding to the laser frame. Next, the deviation between the leg direction angle on the laser frame and the pedestrian direction angle on the visible-light image is calculated, and legs on the laser frame can be bound to pedestrians on the visible-light image according to this deviation. If the direction-angle deviation is smaller than the threshold, the leg is allocated to the pedestrian and the pedestrian is marked as allocated; if the deviation exceeds the threshold, the leg on that laser frame is considered a false detection and is removed. If the direction-angle interval of a pedestrian contains several leg results, only the leg closest to the pedestrian is bound as the target and marked as allocated, while the other legs satisfying the direction-angle condition are marked as unallocated and wait for the next round of allocation.
In the final result, every pedestrian will get at most one leg bound to it, and all unbound legs will be discarded. The application can reduce the instability of the human leg identification only by the laser data to a certain extent so as to provide high-quality human leg identification results.
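The binding rule — at most one leg per pedestrian, closest leg wins, unbound legs discarded — can be sketched as follows (scalar direction angles and a symmetric threshold are assumed for illustration):

```python
def bind_legs(pedestrian_angles, leg_angles, angle_threshold):
    """Bind at most one leg to each pedestrian: among the legs inside the
    direction-angle threshold, the closest unassigned leg is bound and
    marked allocated. Legs left unbound at the end are discarded."""
    assigned = set()
    binding = {}
    for p_idx, p_angle in enumerate(pedestrian_angles):
        candidates = [(abs(l - p_angle), i) for i, l in enumerate(leg_angles)
                      if i not in assigned and abs(l - p_angle) <= angle_threshold]
        if candidates:
            _, leg_idx = min(candidates)   # smallest deviation wins
            binding[p_idx] = leg_idx
            assigned.add(leg_idx)
    return binding  # legs absent from the binding are discarded
```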
As is readily understood, the fusion process uses image fusion technology to extract, through image processing and computer techniques, as much of the useful information as possible from the image data of a target acquired via multi-source channels, and finally combines it into a high-quality image. This improves the utilization of the image information, improves the interpretation accuracy and reliability of the computer, and raises the spatial and spectral resolution of the original images, thereby facilitating detection. The application uses the weighted-fusion method of image fusion technology to sum the pixel values of the stitching area according to preset weights, so that the stitched area looks more natural and is easier to observe. Specifically, the preset human-leg recognition model is obtained by machine learning on preset training images, which mainly comprise images with human-leg features and images without them. The model builds an image pyramid from representations of the same training image at different scales and aspect ratios, and maps the leg regions found in the leg-detection response maps at those scales and aspect ratios back onto the original-resolution training image, thereby identifying the leg regions of the input training image. By training the preset leg model on a large number of preset training images, a binary classifier is generated inside the model and is used to judge whether a human leg is present in the image to be identified.
Here, an image pyramid is one of the multi-scale representations of an image: an effective but conceptually simple structure for interpreting an image at multiple resolutions. A pyramid of an image is a series of progressively lower-resolution images derived from the same original image, arranged in a pyramid shape and obtained by repeated downsampling, which stops only when a termination condition is reached. Likening the stacked images to a pyramid: the higher the level, the smaller the image and the lower the resolution.
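A toy illustration of pyramid construction by repeated downsampling until a termination condition is reached (1-D pair averaging stands in for the usual Gaussian filtering over 2-D images):

```python
def build_pyramid(image, min_size=1):
    """Repeatedly downsample a 1-D 'image' by averaging adjacent pairs
    until the next level would be smaller than min_size; each level is
    smaller and of lower resolution than the one below it."""
    levels = [image]
    while len(levels[-1]) // 2 >= min_size:
        prev = levels[-1]
        nxt = [(prev[i] + prev[i + 1]) / 2 for i in range(0, len(prev) - 1, 2)]
        levels.append(nxt)
    return levels
```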
A computer-readable medium having a computer program stored thereon, which, when executed by a processor, implements the method described above.
The computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, either in baseband or as part of a carrier wave. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination of the foregoing. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
While preferred embodiments of the present application have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. It is therefore intended that the following claims be interpreted as including the preferred embodiment and all such alterations and modifications as fall within the scope of the embodiments of the application.
Finally, it is further noted that relational terms such as first and second, and the like are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or terminal that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or terminal. Without further limitation, an element defined by the phrase "comprising one … …" does not exclude the presence of other like elements in a process, method, article or terminal device comprising the element.
The pedestrian recognition method by fusing laser and a visual image sensor has been described in detail above. Specific examples are used herein to illustrate the principle and implementation of the present application, and the above description of the examples is intended only to help understand the method and core idea of the present application. Meanwhile, those skilled in the art may make variations in the specific embodiments and the scope of application in accordance with the ideas of the present application; in view of the above, the contents of this description should not be construed as limiting the present application.

Claims (5)

1. A pedestrian recognition method by fusing laser and a visual image sensor, comprising the steps of:
acquiring laser image point cloud data and information of captured visual images in the same preset reference direction;
analyzing a first position of a target feature in the laser image point cloud data and correcting the position by using a tracker, wherein the first position comprises a first distance and a first direction angle in the laser image;
analyzing a second position of the target feature from the information of the visual image, wherein the second position comprises a second distance and a second direction angle in the visual image;
calculating a first deviation value of the first distance and the second distance and a second deviation value of the first direction angle and the second direction angle;
judging whether the target feature is false detection or not according to the first deviation value and the second deviation value;
the specific steps for judging the false detection comprise:
presetting a first threshold and a second threshold, wherein the first threshold is related to the first distance and the second distance, and the second threshold is related to the first direction angle and the second direction angle;
if the first deviation value falls within the first threshold range and the second deviation value falls within the second threshold range, the target feature detection is judged to be correct;
if the first deviation value does not fall within the first threshold range and/or the second deviation value does not fall within the second threshold range, the target feature is judged to be a false detection;
the method further comprises the specific step of determining the number of target features contained in the laser image point cloud data:
carrying out gradient analysis on an image area where each target feature in the foreground image is located;
obtaining gradient information of each pixel in an image area where each target feature in the foreground image is located;
determining an image area of a target feature, wherein gradient information of pixels in the area meets a preset gradient requirement, as an interest area;
labeling and numbering all the interest areas, clustering all the labeled interest areas, and summarizing and determining a target characteristic area from all the labeled interest areas according to a clustering result;
determining the number of the image areas as the number of target features contained in the image to be identified;
clustering the interest areas according to the colors carried by the interest areas to obtain a plurality of area clusters, wherein each area cluster comprises at least one interest area; determining each region of interest in the region cluster meeting the requirements of the human leg colors as a target characteristic region, wherein the colors carried by each region of interest in the region cluster meeting the requirements of the human leg colors are all in a preset human leg color range;
the specific steps of the tracker training include:
inputting a human leg recognition result predicted by a laser human leg recognition algorithm into an existing tracker, wherein the tracker outputs a group of human leg tracking results;
comparing the position deviation between the human leg recognition result and the human leg tracking result; for each group of human leg recognition results, if the corresponding position deviation is small, the recognition result is taken directly as the output; if the position deviation of a group is excessive, the human leg recognition result of that group is regarded as a false detection and the tracking result is taken as the final output;
after the human leg result is output, updating the tracker with the corresponding result;
the step of stitching the images to be synthesized in the same matching image group to obtain the images to be identified includes:
splicing the plurality of successfully matched images to be synthesized, and performing image fusion processing on the spliced area to obtain the images to be identified.
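As an illustrative, non-limiting sketch of the false-detection check recited in claim 1 above: the deviation between the laser-derived and vision-derived positions is computed component-wise and compared against the two thresholds. The threshold values and the use of absolute deviations below are assumptions for illustration; the claim only requires that the thresholds relate to the measured distances and direction angles.

```python
# Sketch of the false-detection judgment (claim 1), with hypothetical
# threshold values; not part of the claimed subject matter.

def is_false_detection(laser_pos, visual_pos,
                       dist_threshold=0.3, angle_threshold=5.0):
    """laser_pos / visual_pos: (distance, direction_angle) tuples.

    dist_threshold (meters) and angle_threshold (degrees) are
    hypothetical example values.
    """
    d1, a1 = laser_pos
    d2, a2 = visual_pos
    dist_deviation = abs(d1 - d2)    # first deviation value
    angle_deviation = abs(a1 - a2)   # second deviation value
    # Detection is judged correct only when BOTH deviations fall
    # within their threshold ranges; otherwise it is a false detection.
    return not (dist_deviation <= dist_threshold
                and angle_deviation <= angle_threshold)
```

For example, a laser reading of (1.0 m, 10°) against a visual reading of (2.0 m, 10.5°) exceeds the distance threshold and would be flagged as a false detection.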
2. The pedestrian recognition method by fusing laser and a visual image sensor according to claim 1, wherein target features that appear continuously over a plurality of frames of the laser point cloud data are tracked by the tracker;
and the continuously-appearing target features are taken as positive examples, which are extracted and stored.
3. The pedestrian recognition method by fusing laser and a visual image sensor according to claim 1, wherein the direction angle deviation is obtained by analyzing the pedestrian detection result of the visual image detection and the result tracked by the tracker;
checking whether a target feature exists within the second threshold range and, if so, marking that target feature as assigned;
if the direction angles of all the target features are not within the preset angle interval, no leg is assigned to the pedestrian.
4. The method according to claim 1, wherein after determining the number of human legs contained in the image to be recognized based on the number of image areas where the human legs are located, the method further comprises:
displaying leg number information according to the number of human legs contained in the image to be identified.
5. A system for recognizing the number of human legs contained in an image by using the pedestrian recognition method by fusing laser and a visual image sensor according to any one of claims 1 to 4, comprising: a first obtaining unit, a second obtaining unit, a third obtaining unit, a fourth obtaining unit, and a first determining unit,
the first obtaining unit is used for obtaining an image to be identified;
the second obtaining unit is used for inputting the image to be identified into a preset human leg identification model to obtain an identification result output by the preset human leg identification model;
the third obtaining unit is used for obtaining a foreground image from the image to be identified when the identification result is that the image to be identified contains a human leg image;
the fourth obtaining unit is configured to perform leg detection on the foreground image, and obtain an image area where each leg in the foreground image is located;
the first determining unit is configured to determine, according to the number of image areas where the legs of the person are located, the number of the legs of the person included in the image to be identified.
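The tracker-based correction described in the claims above can be sketched as follows. The Euclidean deviation metric, the index alignment of detections to tracks, and the `max_deviation` value are all assumptions for illustration; the claims do not prescribe a particular metric or data association scheme.

```python
# Sketch of the tracker correction step: detections whose position
# deviates too far from the corresponding track are treated as false
# detections and replaced by the tracked position. Hypothetical
# deviation metric and threshold.
import math

def correct_with_tracker(detections, tracks, max_deviation=0.5):
    """detections / tracks: index-aligned lists of (x, y) positions."""
    outputs = []
    for det, trk in zip(detections, tracks):
        deviation = math.dist(det, trk)
        # Small deviation: output the recognition result directly.
        # Excessive deviation: regard it as a false detection and
        # output the tracking result instead.
        outputs.append(det if deviation <= max_deviation else trk)
    return outputs
```

After the corrected results are output, the tracker would then be updated with them, per the claimed training procedure.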
CN202011203273.8A 2020-11-02 2020-11-02 Pedestrian recognition method by fusing laser and visual image sensor Active CN112232272B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011203273.8A CN112232272B (en) 2020-11-02 2020-11-02 Pedestrian recognition method by fusing laser and visual image sensor


Publications (2)

Publication Number Publication Date
CN112232272A CN112232272A (en) 2021-01-15
CN112232272B true CN112232272B (en) 2023-09-08

Family

ID=74123115

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011203273.8A Active CN112232272B (en) 2020-11-02 2020-11-02 Pedestrian recognition method by fusing laser and visual image sensor

Country Status (1)

Country Link
CN (1) CN112232272B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2004081683A1 (en) * 2003-03-14 2004-09-23 Matsushita Electric Works, Ltd. Autonomously moving robot
CN110942449A (en) * 2019-10-30 2020-03-31 华南理工大学 Vehicle detection method based on laser and vision fusion
CN111089590A (en) * 2019-12-09 2020-05-01 泉州装备制造研究所 Method for tracking human leg by mobile robot through fusion of vision and laser
CN111563916A (en) * 2020-05-11 2020-08-21 中国科学院自动化研究所 Long-term unmanned aerial vehicle tracking and positioning method, system and device based on stereoscopic vision

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2618175A1 (en) * 2012-01-17 2013-07-24 Leica Geosystems AG Laser tracker with graphical targeting functionality
TWI640931B (en) * 2017-11-23 2018-11-11 財團法人資訊工業策進會 Image object tracking method and apparatus


Also Published As

Publication number Publication date
CN112232272A (en) 2021-01-15

Similar Documents

Publication Publication Date Title
CN111340797B (en) Laser radar and binocular camera data fusion detection method and system
US10445602B2 (en) Apparatus and method for recognizing traffic signs
CN111325769B (en) Target object detection method and device
TWI716012B (en) Sample labeling method, device, storage medium and computing equipment, damage category identification method and device
CN112949366B (en) Obstacle identification method and device
CN110516517B (en) Target identification method, device and equipment based on multi-frame image
CN105678322A (en) Sample labeling method and apparatus
Ji et al. RGB-D SLAM using vanishing point and door plate information in corridor environment
CN114049356B (en) Method, device and system for detecting structure apparent crack
CN113052295B (en) Training method of neural network, object detection method, device and equipment
CN111126393A (en) Vehicle appearance refitting judgment method and device, computer equipment and storage medium
CN112613668A (en) Scenic spot dangerous area management and control method based on artificial intelligence
CN111027526A (en) Method for improving vehicle target detection, identification and detection efficiency
CN112150448B (en) Image processing method, device and equipment and storage medium
CN112634369A (en) Space and or graph model generation method and device, electronic equipment and storage medium
CN113989604A (en) Tire DOT information identification method based on end-to-end deep learning
CN111626241A (en) Face detection method and device
CN113947714A (en) Multi-mode collaborative optimization method and system for video monitoring and remote sensing
CN113724293A (en) Vision-based intelligent internet public transport scene target tracking method and system
CN112232272B (en) Pedestrian recognition method by fusing laser and visual image sensor
CN115797397B (en) Method and system for all-weather autonomous following of robot by target personnel
CN109000634B (en) Navigation object traveling route reminding method and system
CN116385477A (en) Tower image registration method based on image segmentation
CN115953744A (en) Vehicle identification tracking method based on deep learning
CN114927236A (en) Detection method and system for multiple target images

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant