CN111626080A - Vehicle detection method and vehicle-mounted terminal

Info

Publication number
CN111626080A
Authority
CN
China
Prior art keywords
wheel
points
point
line
vehicle
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910147320.2A
Other languages
Chinese (zh)
Other versions
CN111626080B (en)
Inventor
年素磊
梁继
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Momenta Suzhou Technology Co Ltd
Original Assignee
Momenta Suzhou Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Momenta Suzhou Technology Co Ltd
Priority to CN201910147320.2A
Publication of CN111626080A
Application granted
Publication of CN111626080B
Active legal status
Anticipated expiration

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 20/00 - Scenes; Scene-specific elements
    • G06V 20/50 - Context or environment of the image
    • G06V 20/56 - Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
    • G06V 20/58 - Recognition of moving objects or obstacles, e.g. vehicles or pedestrians; Recognition of traffic objects, e.g. traffic signs, traffic lights or roads
    • G06V 20/584 - Recognition of moving objects or obstacles, e.g. vehicles or pedestrians; Recognition of traffic objects, e.g. traffic signs, traffic lights or roads, of vehicle lights or traffic lights
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 - Pattern recognition
    • G06F 18/20 - Analysing
    • G06F 18/24 - Classification techniques
    • G06F 18/241 - Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F 18/25 - Fusion techniques
    • G06F 18/259 - Fusion by voting

Abstract

The invention discloses a vehicle detection method and a vehicle-mounted terminal. The method comprises the following steps: performing feature extraction on an input picture, inputting the extracted features into a pre-trained wheel point identification model, and predicting whether the input picture contains wheel points and the coordinates of those wheel points; connecting the wheel points with their opposite wheel points, voting on the obtained connecting lines, and taking the connecting line with the highest vote count as a wheel line; and determining the position of the vehicle according to the position of the wheel line.

Description

Vehicle detection method and vehicle-mounted terminal
Technical Field
The invention relates to the field of automatic driving, and in particular to a vehicle detection method and a vehicle-mounted terminal.
Background
Vehicle detection is currently achieved mainly in the form of a detection box, as shown in fig. 1; that is, detection is performed on the whole outline of the vehicle.
However, this form of vehicle detection is unsuitable in many scenes. For example, in a fisheye image (192° field of view) the vehicle is heavily distorted, and detection with a bounding box is neither effective nor sufficiently accurate.
Disclosure of Invention
The invention provides a vehicle detection method and a vehicle-mounted terminal that overcome at least one of the problems in the prior art.
According to a first aspect of embodiments of the present invention, there is provided a vehicle detection method including the steps of:
performing feature extraction on an input picture, inputting the extracted features into a pre-trained wheel point identification model, and predicting whether the input picture contains wheel points and the coordinates of those wheel points, wherein the wheel point identification model associates the input picture with category information indicating whether each pixel in the picture belongs to a wheel point, and with the coordinate offsets of each wheel point and of its opposite wheel point;
connecting the wheel points with their opposite wheel points, voting on the obtained connecting lines, and taking the connecting line with the highest vote count as a wheel line;
and determining the position of the vehicle according to the position of the wheel line.
Optionally, the wheel point identification model is constructed by:
generating a training sample set, wherein the training sample set comprises a plurality of groups of sample data, and each group comprises a sample picture containing a wheel image, pre-labelled category information indicating whether each pixel in the sample picture belongs to a wheel point, and the offset of each wheel point from its opposite wheel point;
training a pre-built deep neural network model on the training sample set to obtain the wheel point identification model, wherein the model associates the sample picture in each group of sample data with the pre-labelled category information indicating whether each pixel belongs to a wheel point and with the offset of each wheel point from its opposite wheel point.
Optionally, the pre-trained wheel point identification model comprises 5 channels: 1 classification channel and 4 regression channels. The classification channel outputs a classification confidence map of the wheel points, represented as a circular area of a given radius centred on the wheel point coordinates; the regression channels output the two coordinate offsets of the current wheel point and the two coordinate offsets of the opposite wheel point, respectively.
Optionally, predicting whether the input picture contains wheel points and the coordinates of those wheel points includes:
adding the two coordinate offsets of the current wheel point to the coordinates of the current wheel point output by the wheel point identification model, to obtain the coordinates of a plurality of candidate wheel points;
voting on the obtained coordinates of the candidate wheel points, and eliminating candidate wheel points whose scores fall below a preset score threshold;
taking points whose scores are higher than those of their four neighbouring points as candidate wheel points;
sorting the resulting candidate wheel points by score and setting a non-maximum suppression radius, so that points within that radius of a higher-scoring point are suppressed, yielding the wheel points.
Optionally, connecting a wheel point with its opposite wheel point includes:
sorting the wheel points by score;
starting from the point with the highest score and matching against the other points in turn: searching the remaining points for one that matches the current point, outputting the connecting line of the two points if a match is found, and deleting the current point otherwise.
Optionally, voting on the obtained connecting lines and taking the connecting line with the highest vote count as a wheel line includes:
for each pair of candidate points, traversing every vote line and, if the two endpoints of the vote line each lie within a threshold distance of the two points, adding 1 to the pair's vote count; the pair of points with the highest vote count is matched into a line if that count exceeds a set threshold.
According to a second aspect of the embodiments of the present invention, there is provided a vehicle-mounted terminal including:
the wheel line identification module is used for performing feature extraction on an input picture, inputting the extracted features into a pre-trained wheel point identification model, and predicting whether the input picture contains wheel points and the coordinates of those wheel points, wherein the wheel point identification model associates the input picture with category information indicating whether each pixel in the picture belongs to a wheel point, and with the coordinate offsets of each wheel point and of its opposite wheel point;
the wheel line voting module is used for connecting the wheel points with their opposite wheel points, voting on the obtained connecting lines, and taking the connecting line with the highest vote count as a wheel line;
and the detection module is used for determining the position of the vehicle according to the position of the wheel line.
Optionally, the vehicle-mounted terminal further includes:
the system comprises a sample set generation module, a training sample set generation module and a training analysis module, wherein the training sample set comprises a plurality of groups of sample data, and each group of sample data comprises a sample picture containing a wheel image, class information of whether a pixel point labeled in advance in a corresponding sample picture belongs to a wheel point, coordinate offset of each wheel point and coordinate offset of an opposite wheel point;
and the model training module is used for training a pre-built deep neural network model based on the training sample set to obtain a wheel point identification model, and the wheel point identification model associates the sample picture in each group of sample data with the type information of whether a pre-labeled pixel point in the corresponding sample picture belongs to a wheel point, the coordinate offset of each wheel point and the coordinate offset of the opposite wheel point.
Optionally, the pre-trained wheel point identification model comprises 5 channels: 1 classification channel and 4 regression channels. The classification channel outputs a classification confidence map of the wheel points, represented as a circular area of a given radius centred on the wheel point coordinates; the regression channels output the two coordinate offsets of the current wheel point and the two coordinate offsets of the opposite wheel point, respectively.
Optionally, the wheel line identification module comprises:
the coordinate calculation unit is used for adding the two coordinate offsets of the current wheel point to the coordinates of the current wheel point output by the wheel point identification model, to obtain the coordinates of a plurality of candidate wheel points;
the voting unit is used for voting on the obtained coordinates of the candidate wheel points and eliminating candidate wheel points whose scores fall below a preset score threshold;
the selecting unit is used for selecting points whose scores are higher than those of their four neighbouring points as candidate wheel points;
and the suppressing unit is used for sorting the obtained candidate wheel points by score and setting a non-maximum suppression radius, so that points within that radius of a higher-scoring point are suppressed, yielding the wheel points.
Optionally, the wheel line voting module comprises:
the sorting unit is used for sorting the wheel points by score;
and the matching unit is used for matching against the other points in turn, starting from the point with the highest score: it searches the remaining points for one that matches the current point, outputs the connecting line of the two points if a match is found, and deletes the current point otherwise.
Optionally, the wheel line voting module comprises:
the traversing unit is used for traversing every vote line for each pair of candidate points and, if the two endpoints of the vote line each lie within a threshold distance of the two points, adding 1 to the pair's vote count; the pair of points with the highest vote count is matched into a line if that count exceeds a set threshold.
In the embodiments of the invention, features are extracted from an input picture and fed into a pre-trained wheel point identification model, which predicts whether the input picture contains wheel points and the coordinates of those wheel points; the wheel points are connected with their opposite wheel points, the obtained connecting lines are voted on, and the connecting line with the highest vote count is taken as the wheel line; the position of the vehicle is then determined from the position of the wheel line. Compared with the prior art, the embodiments of the invention detect the vehicle by detecting the wheel line, which effectively resolves the detection difficulties caused by large vehicle distortion, nearby vehicles appearing only partially in the camera image, and the like.
The innovation points of the embodiment of the invention comprise:
1. Detecting the vehicle by detecting the wheel line effectively resolves the detection difficulties caused by large vehicle distortion, nearby vehicles appearing only partially in the camera image, and the like; this is one of the innovation points of the embodiments of the invention.
2. In the course of implementing the invention, the inventors found that the wheel line and the vehicle are associated; based on this regularity, the relationship among the wheel points, the wheel line and the corresponding vehicle image is converted, by machine learning, into a wheel point identification model. This improves the efficiency of wheel line identification and allows wheel-line-based vehicle identification to be widely applied, which is another of the innovation points of the invention.
Drawings
In order to illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings required for describing the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present invention, and those skilled in the art can derive other drawings from them without creative effort.
FIG. 1 is a schematic diagram of a detection frame for detecting a vehicle in the prior art;
FIG. 2 is a flow chart of a vehicle detection method according to one embodiment of the invention;
FIG. 3 is a schematic view of a wheel point identification model according to one embodiment of the present invention;
FIG. 4 is a diagram illustrating the output of a classification branch according to an embodiment of the present invention;
FIG. 5 is a diagram illustrating the results of the wheel point connection process according to one embodiment of the present invention;
fig. 6 illustrates a block diagram of a vehicle-mounted terminal according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without any inventive step, are within the scope of the present invention.
It is to be noted that the terms "comprises" and "comprising" and any variations thereof in the embodiments and drawings of the present invention are intended to cover non-exclusive inclusions. For example, a process, method, system, article, or apparatus that comprises a list of steps or elements is not limited to only those steps or elements listed, but may alternatively include other steps or elements not listed, or inherent to such process, method, article, or apparatus.
The embodiments of the invention disclose a vehicle detection method and a vehicle-mounted terminal, which are described in detail below.
FIG. 2 is a flow chart of a vehicle detection method according to one embodiment of the invention. The method is applied to a vehicle-mounted terminal such as an in-vehicle computer or an in-vehicle industrial personal computer (IPC); the embodiments of the invention are not limited in this respect. The vehicle-mounted terminal is connected with the sensors of the vehicle, and receives and processes the data they acquire. As shown in fig. 2, the vehicle detection method includes the following steps:
s201, performing feature extraction on an input picture, inputting the input picture into a wheel point identification model obtained through pre-training, and predicting to obtain whether the input picture comprises wheel points and coordinates of the wheel points, wherein the wheel point identification model enables the input picture and corresponding type information whether pixel points in the input picture belong to the wheel points, and coordinate offset of each wheel point and coordinate offset of the opposite wheel point to be associated.
A wheel line is the line connecting the ground contact points of two wheels on the same side of the vehicle. The wheel line of the front and rear wheels may be represented by one type of line (e.g., by colour, thickness, etc.), and the wheel line of the left and right wheels by another type of line.
In implementation, wheel line detection can detect both the frontal wheel lines (covering the front and rear faces) and the side wheel lines of a vehicle. The frontal case is processed separately, so the side case is taken as the example below; the same principle applies to the frontal case.
In particular implementation, the wheel point identification model may be constructed by:
generating a training sample set, wherein the training sample set comprises a plurality of groups of sample data, and each group comprises a sample picture containing a wheel image, pre-labelled category information indicating whether each pixel in the sample picture belongs to a wheel point, and the coordinate offsets of each wheel point and of its opposite wheel point;
training a pre-built deep neural network model on the training sample set to obtain the wheel point identification model, wherein the model associates the sample picture in each group of sample data with the pre-labelled category information indicating whether each pixel belongs to a wheel point and with the coordinate offsets of each wheel point and of its opposite wheel point.
Optionally, the pre-trained wheel point identification model comprises 5 channels: 1 classification channel and 4 regression channels. The classification channel outputs a classification confidence map of the wheel points, represented as a circular area of a given radius centred on the wheel point coordinates; the regression channels output the two coordinate offsets of the current wheel point and the two coordinate offsets of the opposite wheel point, respectively.
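Purely as an illustrative sketch (not part of the original disclosure), such a five-channel output could be produced by a small fully convolutional head like the one below, written in Python with PyTorch; the class name, the backbone channel width and the use of a sigmoid confidence map (instead of the softmax variant mentioned later) are assumptions:

import torch
import torch.nn as nn

class WheelPointHead(nn.Module):
    """Sketch of a 5-channel output head: 1 classification channel
    (wheel-point confidence) plus 4 regression channels
    (dx1, dy1 towards the current wheel point; dx2, dy2 towards
    the opposite wheel point)."""

    def __init__(self, in_channels: int = 64):  # backbone width assumed
        super().__init__()
        self.cls_head = nn.Conv2d(in_channels, 1, kernel_size=1)
        self.reg_head = nn.Conv2d(in_channels, 4, kernel_size=1)

    def forward(self, feats: torch.Tensor):
        # B x 1 x H x W confidence map (two logits with a softmax, as in
        # the segmentation-style loss described below, would also work)
        cls_map = torch.sigmoid(self.cls_head(feats))
        reg_map = self.reg_head(feats)  # B x 4 x H x W offsets
        return cls_map, reg_map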
In implementation, when the data are labelled, two wheel points can be connected by a straight line. Each wheel point carries associated category information; there are currently two categories: visible wheel points and invisible cut-off points (invisible cut-off points being wheel points that are occluded).
The design of the training target is described below with reference to fig. 3.
As in a segmentation task, the generated label is a heatmap of the same size as the original image and comprises 5 channels: 1 classification channel and 4 regression channels. As shown in fig. 3:
1. Classification channel (the classification confidence map of wheel points).
A circular area of a given radius is formed around each wheel point; the gray area (gray circle) is the wheel point class (assigned 1), and the black area is the background class (assigned 0).
2. Regression channels.
These regress the offsets of the current wheel point and of the opposite wheel point, respectively. Each point has two coordinates, x and y, so there are 4 channels in total.
The bottom row of fig. 3 shows the regression towards the current wheel point on the left and towards the opposite wheel point on the right.
The classification branch may use the softmax loss commonly used for segmentation tasks, and the regression branch may use the smooth L1 loss.
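A minimal sketch of how such a 5-channel training target might be generated is given below in Python/NumPy; the function name make_wheel_labels, the default radius and the sign convention (offset = target coordinate minus pixel coordinate, so that x + dx recovers the target) are assumptions chosen to match the description above:

import numpy as np

def make_wheel_labels(h, w, wheel_pairs, radius=5):
    """Build the 5-channel heatmap label described above.
    wheel_pairs: list of ((x1, y1), (x2, y2)) ground-contact points of
    two same-side wheels. Channel 0: classification (1 inside a circle
    of the given radius around each wheel point, 0 = background).
    Channels 1-4: dx1, dy1 (offset to the pixel's own wheel point) and
    dx2, dy2 (offset to the opposite wheel point)."""
    labels = np.zeros((5, h, w), dtype=np.float32)
    ys, xs = np.mgrid[0:h, 0:w]
    for p, q in wheel_pairs:
        # each wheel point regresses to itself and to its opposite point
        for (cx, cy), (ox, oy) in ((p, q), (q, p)):
            mask = (xs - cx) ** 2 + (ys - cy) ** 2 <= radius ** 2
            labels[0][mask] = 1.0            # wheel-point class
            labels[1][mask] = cx - xs[mask]  # dx1: x + dx1 = wheel x
            labels[2][mask] = cy - ys[mask]  # dy1
            labels[3][mask] = ox - xs[mask]  # dx2: x + dx2 = opposite x
            labels[4][mask] = oy - ys[mask]  # dy2
    return labels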
S202, connecting the wheel points with their opposite wheel points, voting on the obtained connecting lines, and taking the connecting line with the highest vote count as a wheel line.
Optionally, predicting whether the input picture contains wheel points and their coordinates includes: adding the two coordinate offsets of the current wheel point to the coordinates of the current wheel point output by the wheel point identification model, to obtain the coordinates of a plurality of candidate wheel points; voting on the obtained coordinates of the candidate wheel points and eliminating candidate wheel points whose scores fall below a preset score threshold; taking points whose scores are higher than those of their four neighbouring points as candidate wheel points; and sorting the resulting candidate wheel points by score and setting a non-maximum suppression radius, so that points within that radius of a higher-scoring point are suppressed, yielding the wheel points.
Optionally, connecting a wheel point with its opposite wheel point includes: sorting the wheel points by score; starting from the point with the highest score, searching the remaining points for one that matches the current point, outputting the connecting line of the two points if a match is found, and deleting the current point otherwise.
Optionally, voting on the obtained connecting lines and taking the connecting line with the highest vote count as a wheel line includes: for each pair of candidate points, traversing every vote line and, if the two endpoints of the vote line each lie within a threshold distance of the two points, adding 1 to the pair's vote count; the pair of points with the highest vote count is matched into a line if that count exceeds a set threshold.
In one embodiment, this step may be implemented as follows:
the input picture is subjected to backsbone network extraction characteristics, and then the input picture is respectively output in classification and regression branches. Then a process of Line nms (linear non-maximum suppression) is needed to form the final wheel Line result.
The Line nms flow may be as follows: cls/reg map- > vote map- > selecting a local maximum value point- > candidate wheel point- > wheel point nms- > final wheel point- > point to be connected into a wheel line.
The vote map step may be as follows: after the classification branch passes through argmax, a non-background region is obtained that ideally covers the wheel points, as shown in fig. 4. The regression branch, however, is also needed to determine the pixel positions of the wheel points accurately. The regression output has 4 channels, namely dx1, dy1, dx2 and dy2; the first two predict the offset of the current wheel point, so (x + dx1, y + dy1) gives a predicted wheel point, which then votes on an output map of the same size as the original image. After all votes are cast, the scores are compared with a score threshold, and points above the threshold are passed to the next stage.
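A rough sketch of this voting step might look as follows (Python/NumPy); the 0.5 background cut-off, the default score threshold and the choice of weighting each vote by the classification confidence are assumptions, not details from the original disclosure:

import numpy as np

def build_vote_map(cls_map, reg_map, score_thresh=0.3):
    """cls_map: H x W wheel-point confidence; reg_map: 4 x H x W with
    channels dx1, dy1, dx2, dy2. Every non-background pixel votes at its
    predicted wheel point (x + dx1, y + dy1); low-score points are
    discarded before the next stage."""
    h, w = cls_map.shape
    vote_map = np.zeros((h, w), dtype=np.float32)
    ys, xs = np.nonzero(cls_map > 0.5)  # non-background region
    px = np.clip(np.round(xs + reg_map[0, ys, xs]), 0, w - 1).astype(int)
    py = np.clip(np.round(ys + reg_map[1, ys, xs]), 0, h - 1).astype(int)
    np.add.at(vote_map, (py, px), cls_map[ys, xs])  # accumulate votes
    vote_map[vote_map < score_thresh] = 0.0         # apply score threshold
    return vote_map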
The step of selecting local maximum points may be as follows: from the points obtained in the previous step, those whose scores are higher than those of their four neighbouring points are selected as candidate wheel points.
The wheel point NMS step may be as follows: perform NMS (non-maximum suppression) on the candidate wheel points. As with bounding-box NMS, sort the candidate points by score, set an NMS radius, suppress points whose distance to a kept point is smaller than that radius, and take the points remaining after NMS as the final wheel points.
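The two steps above could be combined into a sketch like the following (Python/NumPy; the default NMS radius is an assumed tuning parameter):

import numpy as np

def wheel_point_nms(vote_map, nms_radius=10):
    """Select local maxima (points scoring above their four neighbours)
    as candidates, then suppress any candidate that lies within
    nms_radius of an already kept, higher-scoring point."""
    h, w = vote_map.shape
    candidates = []
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            s = vote_map[y, x]
            if s > 0 and s > vote_map[y - 1, x] and s > vote_map[y + 1, x] \
                    and s > vote_map[y, x - 1] and s > vote_map[y, x + 1]:
                candidates.append((s, x, y))
    candidates.sort(reverse=True)  # highest score first
    kept = []
    for s, x, y in candidates:
        if all((x - kx) ** 2 + (y - ky) ** 2 >= nms_radius ** 2
               for _, kx, ky in kept):
            kept.append((s, x, y))
    return kept  # final wheel points as (score, x, y)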
The wheel point connection step may be as follows: after the wheel points are obtained, they still need to be connected into lines. The connection is based on the predictions of the last two regression channels, dx2 and dy2: the regression from each pixel to the opposite wheel point constitutes one connection vote, (x, y) → (x + dx2, y + dy2).
The connection algorithm may run as follows: sort the wheel points by score and start from the point with the highest score; greedily search the remaining points for one that matches it, output the connecting line of the two points if a match is found, and delete the point otherwise. The connection results are shown in fig. 5. A sketch of this greedy matching is given after the next paragraph.
The point-matching strategy may be as follows: for each pair of candidate points, traverse every vote line; if the two endpoints of a vote line each lie within a threshold distance of the two points, add 1 to the pair's vote count; finally, the pair with the highest vote count is matched into a line if that count exceeds a set threshold. The distance threshold may be based on the distance between the centre points of two same-side wheels: the distance between the ground contact points of the two wheels is generally equal to the distance between their centre points, but because of the shooting angle of the vehicle picture, the distance between the two wheel points actually observed in the picture will be smaller than the distance between the two wheel centres.
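The greedy connection with vote-based matching might be sketched like this (Python/NumPy); the function name, dist_thresh and min_votes are illustrative assumptions rather than values from the original disclosure:

import numpy as np

def connect_wheel_points(points, vote_lines, dist_thresh=8.0, min_votes=5):
    """points: list of (score, x, y) wheel points from the NMS step.
    vote_lines: N x 4 array of connection votes (x, y, x+dx2, y+dy2).
    Greedily pairs each point with the partner supported by the most
    vote lines whose endpoints fall near both points."""
    points = sorted(points, reverse=True)  # highest score first
    lines, used = [], set()
    for i, (_, xi, yi) in enumerate(points):
        if i in used:
            continue
        best_j, best_votes = None, 0
        for j, (_, xj, yj) in enumerate(points):
            if j == i or j in used:
                continue
            # vote lines starting near point i and ending near point j
            d1 = np.hypot(vote_lines[:, 0] - xi, vote_lines[:, 1] - yi)
            d2 = np.hypot(vote_lines[:, 2] - xj, vote_lines[:, 3] - yj)
            votes = int(np.sum((d1 < dist_thresh) & (d2 < dist_thresh)))
            if votes > best_votes:
                best_votes, best_j = votes, j
        if best_j is not None and best_votes > min_votes:
            _, xj, yj = points[best_j]
            lines.append(((xi, yi), (xj, yj)))  # one wheel line
            used.update({i, best_j})
        # otherwise the point finds no match and is dropped
    return lines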
S203, determining the position of the vehicle according to the position of the wheel line.
In the embodiments of the invention, features are extracted from an input picture and fed into a pre-trained wheel point identification model, which predicts whether the input picture contains wheel points and the coordinates of those wheel points; the wheel points are connected with their opposite wheel points, the obtained connecting lines are voted on, and the connecting line with the highest vote count is taken as the wheel line; the position of the vehicle is then determined from the position of the wheel line. Compared with the prior art, the embodiments of the invention detect the vehicle by detecting the wheel line, which effectively resolves the detection difficulties caused by large vehicle distortion, nearby vehicles appearing only partially in the camera image, and the like.
Fig. 6 illustrates a block diagram of a vehicle-mounted terminal according to an embodiment of the present invention. As shown in fig. 6, the vehicle-mounted terminal 600 includes: a wheel line identification module 601, configured to perform feature extraction on an input picture, input the extracted features into a pre-trained wheel point identification model, and predict whether the input picture contains wheel points and the coordinates of those wheel points, wherein the wheel point identification model associates the input picture with category information indicating whether each pixel in the picture belongs to a wheel point, and with the coordinate offsets of each wheel point and of its opposite wheel point; a wheel line voting module 602, configured to connect the wheel points with their opposite wheel points, vote on the obtained connecting lines, and take the connecting line with the highest vote count as a wheel line; and a detection module 603, configured to determine the position of the vehicle according to the position of the wheel line.
Optionally, the vehicle-mounted terminal further includes:
the system comprises a sample set generation module, a training sample set generation module and a training analysis module, wherein the training sample set comprises a plurality of groups of sample data, and each group of sample data comprises a sample picture containing a wheel image, class information of whether a pixel point labeled in advance in a corresponding sample picture belongs to a wheel point, coordinate offset of each wheel point and coordinate offset of an opposite wheel point;
and the model training module is used for training a pre-built deep neural network model based on the training sample set to obtain a wheel point identification model, and the wheel point identification model associates the sample picture in each group of sample data with the type information of whether a pre-labeled pixel point in the corresponding sample picture belongs to a wheel point, the coordinate offset of each wheel point and the coordinate offset of the opposite wheel point.
Optionally, the pre-trained wheel point identification model comprises 5 channels: 1 classification channel and 4 regression channels. The classification channel outputs a classification confidence map of the wheel points, represented as a circular area of a given radius centred on the wheel point coordinates; the regression channels output the two coordinate offsets of the current wheel point and the two coordinate offsets of the opposite wheel point, respectively.
Optionally, the wheel line identification module includes: a coordinate calculation unit, configured to add the two coordinate offsets of the current wheel point to the coordinates of the current wheel point output by the wheel point identification model, to obtain the coordinates of a plurality of candidate wheel points; a voting unit, configured to vote on the obtained coordinates of the candidate wheel points and eliminate candidate wheel points whose scores fall below a preset score threshold; a selecting unit, configured to select points whose scores are higher than those of their four neighbouring points as candidate wheel points; and a suppressing unit, configured to sort the obtained candidate wheel points by score and set a non-maximum suppression radius, so that points within that radius of a higher-scoring point are suppressed, yielding the wheel points.
Optionally, the wheel line voting module includes: a sorting unit, configured to sort the wheel points by score, starting from the point with the highest score; and a matching unit, configured to search the remaining points for one that matches the current point, output the connecting line of the two points if a match is found, and delete the current point otherwise.
Optionally, the wheel line voting module includes: a traversing unit, configured to traverse every vote line for each pair of candidate points and, if the two endpoints of the vote line each lie within a threshold distance of the two points, add 1 to the pair's vote count; the pair of points with the highest vote count is matched into a line if that count exceeds a set threshold.
In the embodiments of the invention, features are extracted from an input picture and fed into a pre-trained wheel point identification model, which predicts whether the input picture contains wheel points and the coordinates of those wheel points; the wheel points are connected with their opposite wheel points, the obtained connecting lines are voted on, and the connecting line with the highest vote count is taken as the wheel line; the position of the vehicle is then determined from the position of the wheel line. Compared with the prior art, the embodiments of the invention detect the vehicle by detecting the wheel line, which effectively resolves the detection difficulties caused by large vehicle distortion, nearby vehicles appearing only partially in the camera image, and the like.
Those of ordinary skill in the art will understand that: the figures are merely schematic representations of one embodiment, and the blocks or flow diagrams in the figures are not necessarily required to practice the present invention.
Those of ordinary skill in the art will understand that: modules in the devices in the embodiments may be distributed in the devices in the embodiments according to the description of the embodiments, or may be located in one or more devices different from the embodiments with corresponding changes. The modules of the above embodiments may be combined into one module, or further split into multiple sub-modules.
Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (10)

1. A vehicle detection method, characterized by comprising the steps of:
performing feature extraction on an input picture, inputting the extracted features into a pre-trained wheel point identification model, and predicting whether the input picture contains wheel points and the coordinates of those wheel points, wherein the wheel point identification model associates the input picture with category information indicating whether each pixel in the picture belongs to a wheel point, and with the coordinate offsets of each wheel point and of its opposite wheel point;
connecting the wheel points with their opposite wheel points, voting on the obtained connecting lines, and taking the connecting line with the highest vote count as a wheel line;
and determining the position of the vehicle according to the position of the wheel line.
2. The vehicle detection method according to claim 1, wherein the wheel point identification model is constructed by:
generating a training sample set, wherein the training sample set comprises a plurality of groups of sample data, and each group comprises a sample picture containing a wheel image, pre-labelled category information indicating whether each pixel in the sample picture belongs to a wheel point, and the offset of each wheel point from its opposite wheel point;
training a pre-built deep neural network model on the training sample set to obtain the wheel point identification model, wherein the model associates the sample picture in each group of sample data with the pre-labelled category information indicating whether each pixel belongs to a wheel point and with the offset of each wheel point from its opposite wheel point.
3. The vehicle detection method according to claim 1 or 2, wherein the pre-trained wheel point identification model comprises 5 channels: 1 classification channel and 4 regression channels; the classification channel outputs a classification confidence map of the wheel points, represented as a circular area of a given radius centred on the wheel point coordinates; the regression channels output the two coordinate offsets of the current wheel point and the two coordinate offsets of the opposite wheel point, respectively.
4. The vehicle detection method according to any one of claims 1 to 3, wherein predicting whether the input picture contains wheel points and the coordinates of those wheel points comprises:
adding the two coordinate offsets of the current wheel point to the coordinates of the current wheel point output by the wheel point identification model, to obtain the coordinates of a plurality of candidate wheel points;
voting on the obtained coordinates of the candidate wheel points, and eliminating candidate wheel points whose scores fall below a preset score threshold;
taking points whose scores are higher than those of their four neighbouring points as candidate wheel points;
sorting the resulting candidate wheel points by score and setting a non-maximum suppression radius, so that points within that radius of a higher-scoring point are suppressed, yielding the wheel points.
5. The vehicle detection method according to any one of claims 1 to 4, wherein connecting a wheel point with its opposite wheel point comprises:
sorting the wheel points by score;
starting from the point with the highest score and matching against the other points in turn: searching the remaining points for one that matches the current point, outputting the connecting line of the two points if a match is found, and deleting the current point otherwise.
6. The vehicle detection method according to any one of claims 1 to 5, wherein voting on the obtained connecting lines and taking the connecting line with the highest vote count as a wheel line comprises:
for each pair of candidate points, traversing every vote line and, if the two endpoints of the vote line each lie within a threshold distance of the two points, adding 1 to the pair's vote count; the pair of points with the highest vote count is matched into a line if that count exceeds a set threshold.
7. A vehicle-mounted terminal characterized by comprising:
the wheel line identification module is used for performing feature extraction on an input picture, inputting the extracted features into a pre-trained wheel point identification model, and predicting whether the input picture contains wheel points and the coordinates of those wheel points, wherein the wheel point identification model associates the input picture with category information indicating whether each pixel in the picture belongs to a wheel point, and with the coordinate offsets of each wheel point and of its opposite wheel point;
the wheel line voting module is used for connecting the wheel points with their opposite wheel points, voting on the obtained connecting lines, and taking the connecting line with the highest vote count as a wheel line;
and the detection module is used for determining the position of the vehicle according to the position of the wheel line.
8. The vehicle-mounted terminal according to claim 7, characterized by further comprising:
the system comprises a sample set generation module, a training sample set generation module and a training analysis module, wherein the training sample set comprises a plurality of groups of sample data, and each group of sample data comprises a sample picture containing a wheel image, category information of whether a pixel point labeled in advance in a corresponding sample picture belongs to a wheel point or not and offset of each wheel point and an opposite wheel point;
and the model training module is used for training a pre-built deep neural network model based on the training sample set to obtain a wheel point identification model, and the wheel point identification model associates the sample pictures in each group of sample data with the class information of whether the pre-labeled pixel points in the corresponding sample pictures belong to wheel points or not and the offset of each wheel point and the opposite wheel point.
9. The vehicle-mounted terminal according to claim 7, wherein the pre-trained wheel point identification model comprises 5 channels: 1 classification channel and 4 regression channels; the classification channel outputs a classification confidence map of the wheel points, represented as a circular area of a given radius centred on the wheel point coordinates; the regression channels output the two coordinate offsets of the current wheel point and the two coordinate offsets of the opposite wheel point, respectively.
10. The vehicle-mounted terminal according to claim 8 or 9, wherein the wheel line identification module comprises:
the coordinate calculation unit is used for adding the two coordinate offsets of the current wheel point to the coordinates of the current wheel point output by the wheel point identification model, to obtain the coordinates of a plurality of candidate wheel points;
the voting unit is used for voting on the obtained coordinates of the candidate wheel points and eliminating candidate wheel points whose scores fall below a preset score threshold;
the selecting unit is used for selecting points whose scores are higher than those of their four neighbouring points as candidate wheel points;
and the suppressing unit is used for sorting the obtained candidate wheel points by score and setting a non-maximum suppression radius, so that points within that radius of a higher-scoring point are suppressed, yielding the wheel points.
CN201910147320.2A 2019-02-27 2019-02-27 Vehicle detection method and vehicle-mounted terminal Active CN111626080B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910147320.2A CN111626080B (en) 2019-02-27 2019-02-27 Vehicle detection method and vehicle-mounted terminal

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910147320.2A CN111626080B (en) 2019-02-27 2019-02-27 Vehicle detection method and vehicle-mounted terminal

Publications (2)

Publication Number Publication Date
CN111626080A 2020-09-04
CN111626080B 2022-06-24

Family

Family ID: 72272470

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910147320.2A Active CN111626080B (en) 2019-02-27 2019-02-27 Vehicle detection method and vehicle-mounted terminal

Country Status (1)

Country Link
CN (1) CN111626080B (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101878122A (en) * 2007-11-30 2010-11-03 沃尔沃拉斯特瓦格纳公司 Method of identifying positions of wheel modules
CN105184301A (en) * 2015-09-07 2015-12-23 复旦大学 Method for distinguishing vehicle azimuth by utilizing quadcopter
CN107577988A (en) * 2017-08-03 2018-01-12 东软集团股份有限公司 Realize the method, apparatus and storage medium, program product of side vehicle location
CN107643049A (en) * 2017-09-26 2018-01-30 沈阳理工大学 Vehicle position detection system and method on weighbridge based on monocular structure light
CN108154146A (en) * 2017-12-25 2018-06-12 陈飞 A kind of car tracing method based on image identification
CN108319907A (en) * 2018-01-26 2018-07-24 腾讯科技(深圳)有限公司 A kind of vehicle identification method, device and storage medium
CN108681693A (en) * 2018-04-12 2018-10-19 南昌大学 Licence plate recognition method based on trusted area


Also Published As

Publication number Publication date
CN111626080B (en) 2022-06-24

Similar Documents

Publication Publication Date Title
CN109584248B (en) Infrared target instance segmentation method based on feature fusion and dense connection network
CN109447169B (en) Image processing method, training method and device of model thereof and electronic system
KR101403876B1 (en) Method and Apparatus for Vehicle License Plate Recognition
CN109583345B (en) Road recognition method, device, computer device and computer readable storage medium
EP3261017A1 (en) Image processing system to detect objects of interest
CN110678901A (en) Information processing apparatus, information processing method, and program
CN107665327B (en) Lane line detection method and device
US20190057260A1 (en) Image-based vehicle classification system
CN113435407B (en) Small target identification method and device for power transmission system
Reina et al. Adaptive traffic road sign panels text extraction
CN115170792B (en) Infrared image processing method, device and equipment and storage medium
CN112580643A (en) License plate recognition method and device based on deep learning and storage medium
CN115661777A (en) Semantic-combined foggy road target detection algorithm
CN111815528A (en) Bad weather image classification enhancement method based on convolution model and feature fusion
CN112784834A (en) Automatic license plate identification method in natural scene
CN108182691B (en) Method and device for identifying speed limit sign and vehicle
CN108345835B (en) Target identification method based on compound eye imitation perception
CN115937517A (en) Photovoltaic fault automatic detection method based on multispectral fusion
CN114708426A (en) Target detection method, model training method, device, equipment and storage medium
CN111626080B (en) Vehicle detection method and vehicle-mounted terminal
CN114998713B (en) Pavement disease identification method, device and system, electronic equipment and storage medium
JPH0756882A (en) Method and apparatus for collating of code of container
Saha et al. Road Rutting Detection using Deep Learning on Images
CN116309270A (en) Binocular image-based transmission line typical defect identification method
CN116228659A (en) Visual detection method for oil leakage of EMS trolley

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20211123

Address after: 215100 floor 23, Tiancheng Times Business Plaza, No. 58, qinglonggang Road, high speed rail new town, Xiangcheng District, Suzhou, Jiangsu Province

Applicant after: MOMENTA (SUZHOU) TECHNOLOGY Co.,Ltd.

Address before: Room 601-a32, Tiancheng information building, No. 88, South Tiancheng Road, high speed rail new town, Xiangcheng District, Suzhou City, Jiangsu Province

Applicant before: MOMENTA (SUZHOU) TECHNOLOGY Co.,Ltd.

GR01 Patent grant