CN111986080A - Logistics vehicle feature positioning method based on improved Faster R-CNN - Google Patents
Logistics vehicle feature positioning method based on improved Faster R-CNN
- Publication number
- CN111986080A (application number CN202010690178.9A)
- Authority
- CN
- China
- Prior art keywords
- stage
- logistics
- logistics vehicle
- image
- convolution
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T3/00—Geometric image transformations in the plane of the image
- G06T3/40—Scaling of whole images or parts thereof, e.g. expanding or contracting
- G06T3/4084—Scaling of whole images or parts thereof, e.g. expanding or contracting in the transform domain, e.g. fast Fourier transform [FFT] domain scaling
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T3/00—Geometric image transformations in the plane of the image
- G06T3/60—Rotation of whole images or parts thereof
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/60—Analysis of geometric attributes
- G06T7/62—Analysis of geometric attributes of area, perimeter, diameter or volume
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/70—Determining position or orientation of objects or cameras
- G06T7/73—Determining position or orientation of objects or cameras using feature-based methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/25—Determination of region of interest [ROI] or a volume of interest [VOI]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/50—Context or environment of the image
- G06V20/52—Surveillance or monitoring of activities, e.g. for recognising suspicious objects
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V2201/00—Indexing scheme relating to image or video recognition or understanding
- G06V2201/08—Detecting or categorising vehicles
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02T—CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
- Y02T10/00—Road transport of goods or passengers
- Y02T10/10—Internal combustion engine [ICE] based vehicles
- Y02T10/40—Engine management systems
Abstract
A logistics vehicle feature positioning method based on improved Faster R-CNN comprises the following steps. Step one: enhancement processing of logistics vehicle images, introducing data-enhancement means to process the images. Step two: construction of a basic network model; a VGGNet-16 network is adopted as the feature extraction network, and, to realize positioning of the logistics vehicles, an RPN target detection and positioning model is added behind the feature extraction module at the third convolution sublayer of the fifth convolutional block of VGGNet-16. Step three: screening of the logistics vehicle targets with a non-maximum suppression algorithm. Step four: unified normalization of the logistics vehicle target features; the resulting fixed-dimension feature map is fed into the seventh stage of the basic network model to obtain the accurate logistics vehicle positioning bounding box and the probability of the corresponding vehicle type. The invention positions logistics vehicle features well across different environments and scenes.
Description
Technical Field
The invention relates to a logistics vehicle feature positioning method based on improved Faster R-CNN.
Background Art
In recent years, with the development of transport logistics, more and more logistics vehicles serve people's work and life, but this also raises a problem: the excess of logistics and engineering vehicles increases the difficulty of parking management in industrial parks. Although operations such as tractor-trailer and drop-and-pull transport can improve cargo-loading efficiency, problems remain, such as logistics vehicles occupying parking spaces unreasonably and drop-and-pull operations that cannot be charged accurately; more seriously, some vehicle owners evade monitoring and detection through extremely dangerous behaviors such as using fake license plates.
To solve the management problems of logistics and engineering vehicles, there are already many examples of identifying logistics vehicles of different types by technical means such as computer vision. Most identification methods obtain vehicle images from a traffic-intersection camera or an image acquisition card; because the images captured by traffic video show vehicles passing a certain position in a natural environment, the accurate position of the vehicle in the image must first be found before the vehicle features are extracted and the vehicle type identified. However, current recognition methods face the following main difficulties: (1) vehicle-type recognition is strongly affected by illumination conditions, and the different visual appearance of the same vehicle in sunny, rainy or snowy environments causes recognition errors; (2) vehicle scenes are complex and changeable; for example, in scenes with complex backgrounds such as rural areas, foreground and background cannot be separated quickly and accurately; (3) vehicle appearances are diverse, and the appearance of different vehicle types involves many parameters, such as color, shape, brand and size, all of which affect the recognition of vehicle features. In short, identification of logistics vehicle features by computer vision is still affected by uncertain factors such as environment, scene and appearance, which makes such features difficult to identify.
Disclosure of Invention
To overcome the defects of the prior art, and aiming at the management problems of logistics and engineering vehicles and the difficulty that traditional identification methods face under uncertain factors such as environment, scene and appearance, the invention provides a logistics vehicle feature positioning method based on improved Faster R-CNN.
According to the method, the logistics vehicle image is first subjected to data enhancement to increase the scene diversity of the sample images; then a basic network model is constructed with the improved Faster R-CNN; next, a non-maximum suppression algorithm is introduced to screen the logistics vehicle target bounding boxes; finally, the logistics vehicle target features are uniformly normalized to realize accurate positioning.
In order to achieve the purpose, the invention adopts the following technical scheme:
A logistics vehicle feature positioning method based on improved Faster R-CNN comprises the following steps:
step one, enhancement processing of logistics vehicle images;
To address problems such as the fixed shooting angle, single background and low detection rate of logistics vehicle images, the invention introduces data-enhancement means and processes the images through operations such as multi-scale proportional scaling, image rotation and saturation enhancement, so as to increase the scene diversity of the logistics vehicle images and thus support their identification and positioning.
1.1) carrying out multi-scale scaling operation on the logistics vehicles;
On the principle that the proportions (length, width and height) of the logistics vehicle in the original image are not distorted, the logistics vehicle image is scaled at multiple scales, so that the positioning network can learn target features at fixed proportions.
Suppose the pixel coordinate of a point on a logistics vehicle before scaling is denoted A_0(x_0, y_0) and the scaled coordinate is denoted A_1(x_1, y_1); then A_0 and A_1 satisfy the relation:
(x1,y1)=(μx0,μy0) (1)
where μ denotes the scaling factor. In homogeneous coordinates the relation corresponds to the image scaling matrix:

$$\begin{bmatrix} x_1 \\ y_1 \\ 1 \end{bmatrix} = \begin{bmatrix} \mu & 0 & 0 \\ 0 & \mu & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x_0 \\ y_0 \\ 1 \end{bmatrix} \tag{2}$$

When μ > 1 the operation magnifies the image; when μ < 1 it reduces it.
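The multi-scale scaling step can be sketched in plain Python; the function names are illustrative, with the scaling factor μ = 0.84 and the 100-pixel stopping rule taken from the preferred embodiment described later.

```python
def scale_point(x0, y0, mu):
    """Apply the scaling relation (x1, y1) = (mu*x0, mu*y0) of equation (1)."""
    return mu * x0, mu * y0

def scale_pyramid(width, height, mu=0.84, min_short_edge=100):
    """Build the multi-scale image-size pyramid: shrink by mu each level and
    stop once the short edge would fall below min_short_edge pixels."""
    sizes = [(width, height)]
    w, h = float(width), float(height)
    while True:
        w, h = w * mu, h * mu
        if min(w, h) < min_short_edge:
            break
        sizes.append((round(w), round(h)))
    return sizes

print(scale_point(200, 100, 0.84))
print(scale_pyramid(640, 480))
```

Because each level keeps the same μ for both axes, every pyramid level preserves the original aspect ratio, which is the stated principle of step 1.1).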
1.2) carrying out rotation operation on the logistics vehicle image;
When the camera captures logistics vehicles travelling at speed, the captured images show large differences in angle. To adapt to recognition and positioning at different angles, the captured logistics vehicle images are subjected to rotation transformation, generating vehicle feature information at various angles.
Set the centre of the logistics vehicle image as the rotation centre O(0, 0) and denote the counter-clockwise rotation angle by θ. When any pixel point P(x, y) in the image undergoes the rotation transformation it becomes P_1(x_1, y_1), and the rotation is expressed as:

$$\begin{cases} x_1 = x\cos\theta - y\sin\theta \\ y_1 = x\sin\theta + y\cos\theta \end{cases} \tag{3}$$

This polar-coordinate transformation corresponds to the image rotation matrix:

$$\begin{bmatrix} x_1 \\ y_1 \end{bmatrix} = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} \tag{4}$$
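A minimal sketch of the rotation transformation of equations (3)–(4) in plain Python (the function name is illustrative):

```python
import math

def rotate_point(x, y, theta_deg):
    """Rotate pixel P(x, y) counter-clockwise about the rotation centre O(0, 0)
    by theta degrees, following equations (3)-(4)."""
    t = math.radians(theta_deg)
    x1 = x * math.cos(t) - y * math.sin(t)
    y1 = x * math.sin(t) + y * math.cos(t)
    return x1, y1

x1, y1 = rotate_point(1.0, 0.0, 90)
print(round(x1, 6), round(y1, 6))  # 0.0 1.0
```

Applying the same transform to every pixel (with coordinates taken relative to the image centre) yields the rotated sample images used for data enhancement.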
1.3) carrying out saturation enhancement operation on the logistics vehicle image;
in order to increase the diversity of data samples and enable the feature positioning network to be suitable for complex illumination environments, the method adjusts the saturation of the logistics vehicle images.
The specific flow of adjusting the image saturation is as follows:
(S1) Calculate the pixel extrema of the logistics vehicle image:

rgbMax = max(R, G, B), rgbMin = min(R, G, B)   (5)

where rgbMax denotes the pixel maximum value and rgbMin the pixel minimum value.
(S2) saturation calculation;
The saturation S is calculated as follows:

delta = (rgbMax − rgbMin)/255   (6)
value = (rgbMax + rgbMin)/255   (7)
L = value/2   (8)
S = delta/value, if L < 0.5; S = delta/(2 − value), if L ≥ 0.5   (9)

where the case split in equation (9) follows the standard HSL saturation definition.
(S3) adjusting the logistics vehicle image saturation;
Set a saturation parameter β for adjusting the illumination intensity; the calculation proceeds as follows:

1. If the parameter β ≥ 0, first compute the value of the intermediate variable α:

α = S, if β + S ≥ 1; otherwise α = 1 − β   (10)
α = 1/α − 1   (11)

and adjust the saturation:

RGB' = RGB + (RGB − L·255)·α   (12)

2. If the parameter β < 0, then α = β and:

RGB' = L·255 + (RGB − L·255)·(1 + α)   (13)
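The per-pixel flow of (S1)–(S3) can be sketched as follows; the β ≥ 0 branch assumes the standard HSL saturation-increment scheme for the intermediate variable α, since the original equations (9)–(11) are reconstructed, and the function name is illustrative.

```python
def adjust_saturation(r, g, b, beta):
    """Adjust one RGB pixel's saturation by parameter beta in [-1, 1],
    following steps (S1)-(S3) and equations (5)-(13)."""
    rgb_max, rgb_min = max(r, g, b), min(r, g, b)          # (S1), eq. (5)
    delta = (rgb_max - rgb_min) / 255.0                    # eq. (6)
    value = (rgb_max + rgb_min) / 255.0                    # eq. (7)
    L = value / 2.0                                        # eq. (8)
    if delta == 0:                                         # grey pixel: saturation is zero
        return r, g, b
    S = delta / value if L < 0.5 else delta / (2 - value)  # eq. (9), HSL saturation
    if beta >= 0:
        alpha = S if beta + S >= 1 else 1 - beta           # eq. (10)
        alpha = 1 / alpha - 1                              # eq. (11)
        new = [c + (c - L * 255) * alpha for c in (r, g, b)]              # eq. (12)
    else:
        alpha = beta
        new = [L * 255 + (c - L * 255) * (1 + alpha) for c in (r, g, b)]  # eq. (13)
    return tuple(min(255, max(0, round(c))) for c in new)

print(adjust_saturation(150, 100, 50, 0.5))   # (200, 100, 0)
print(adjust_saturation(150, 100, 50, -1.0))  # (100, 100, 100)
```

With β = −1 every pixel collapses to its lightness value (a grey image), while β > 0 pushes channels away from the lightness, increasing saturation.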
the logistics vehicle image subjected to the scaling processing, the rotation operation and the saturation enhancement is applied to the following steps so as to accurately locate the logistics vehicle characteristics.
Step two, constructing a basic network model;
Although Faster R-CNN provides three basic networks for feature extraction, to obtain a better feature-extraction effect the invention adopts the VGGNet-16 network as the feature extraction network for classifying logistics vehicles of different types. Meanwhile, to realize positioning of the logistics vehicles, an RPN target detection and positioning model is added behind the feature extraction module at the third convolution sublayer of the fifth convolutional block (conv5_3) of VGGNet-16.
The detailed design flow of the basic network model constructed by the invention is as follows:
(T1) First stage: first, the W × H × 3 image processed by step one is input; then a convolution operation is performed on the logistics vehicle image through two consecutive 64-channel convolution layers, each with kernel size 3 × 3 and stride 1; the convolved image is then reduced in dimension through a 64-channel max pooling layer with pooling kernel size 2 × 2 and stride 2. This stage outputs a feature map of size (W/2) × (H/2) × 64.
(T2) second stage: the flow is the same as the first stage, namely, the image obtained in the first stage is input into a second stage network, and then a new feature map is obtained through convolution and pooling operations. But unlike the first stage, the second stage convolution and pooling channels both become 128, with the other parameters being the same as in the first stage.
(T3) Third stage: first, the feature map output by the second stage is input to the third-stage network; then a convolution operation is performed through three consecutive 256-channel convolution layers, each with kernel size 3 × 3 and stride 1; the convolved feature map is then reduced in dimension through a 256-channel max pooling layer with pooling kernel size 2 × 2 and stride 2.
(T4) fourth stage: the process is the same as the third stage, namely, the image obtained in the third stage is input into a fourth stage network, and then a new characteristic diagram is obtained through convolution and pooling operations. But unlike the third stage, the convolution and pooling channels of the fourth stage both become 512, and the other parameters are the same as those of the third stage.
(T5) Fifth stage: this stage consists of three convolution layers, each with 512 channels, kernel size 3 × 3 and stride 1. The feature map output at this stage has size (W/16) × (H/16) × 512.
(T6) Sixth stage: first a convolution layer with kernel size 3 × 3, stride 2 and 512 channels is connected; then a two-class classification loss function and a bounding-box regression loss function are connected, which regress the bounding-box information and judge whether each box belongs to a logistics vehicle or to the background (together with the probability that it is most likely a given vehicle type);
(T7) seventh stage: firstly, connecting two 4096-channel full-connection layers; then connecting an overall loss function; and finally, outputting the accurate logistics vehicle positioning boundary frame and the probability of the corresponding vehicle type.
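Assuming the standard VGGNet-16 convention (3 × 3 convolutions that preserve spatial size, 2 × 2 stride-2 max pooling closing stages one to four, no pooling in stage five), the feature-map sizes through the five backbone stages can be traced with a short sketch; the function name is illustrative.

```python
def trace_backbone(width, height):
    """Trace (stage, W, H, C) through the five VGGNet-16 stages described
    above: each 3x3 convolution preserves spatial size, each 2x2/stride-2
    max pooling halves it; stage five has no pooling layer."""
    channels = [64, 128, 256, 512, 512]
    w, h = width, height
    shapes = []
    for stage, c in enumerate(channels, start=1):
        if stage <= 4:            # stages 1-4 end with a pooling layer
            w, h = w // 2, h // 2
        shapes.append((stage, w, h, c))
    return shapes

for s in trace_backbone(800, 608):
    print(s)
```

The trace confirms the sizes stated in the text: (W/2) × (H/2) × 64 after stage one and (W/16) × (H/16) × 512 after stage five.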
In the above basic network, the design of parameters such as the activation function and the loss functions is detailed below:
(P1) In the VGGNet-16 basic network, the activation function connected after every convolution is the ReLU function:

ReLU(x) = max(0, x)   (14)
(P2) In the sixth stage, the classification loss function and bounding-box regression loss function are designed as follows:

The classification loss L_rpn_cls is expressed as:

$$L_{rpn\_cls} = \frac{1}{N_{cls}} \sum_i \left[ -p_i^* \log p_i - (1 - p_i^*) \log(1 - p_i) \right] \tag{15}$$

where p_i denotes the predicted probability that box i is a logistics vehicle rather than background, and p_i^* denotes the label of the corresponding ground-truth box: 1 if the box is foreground, 0 otherwise.
The bounding-box regression loss L_rpn_box is expressed as:

$$L_{rpn\_box} = \frac{1}{N_{reg}} \sum_i p_i^* \, \text{smooth}_{L1}(t_i - t_i^*) \tag{16}$$

where t_i denotes the four-dimensional position information of predicted box i, written t_i = (x_i, y_i, w_i, h_i), and t_i^* denotes the four-dimensional position information of the ground-truth box, written t_i^* = (x_i^*, y_i^*, w_i^*, h_i^*). The smooth_L1 function is represented as follows:

$$\text{smooth}_{L1}(x) = \begin{cases} 0.5x^2, & |x| < 1 \\ |x| - 0.5, & \text{otherwise} \end{cases} \tag{17}$$
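Equations (16)–(17) can be sketched directly; `rpn_box_loss` is an illustrative helper (not named in the original) showing how only foreground anchors (p_i* = 1) contribute to the regression loss, applied per coordinate.

```python
def smooth_l1(x):
    """Smooth-L1 function of equation (17): quadratic near zero, linear
    elsewhere, applied to each coordinate of (t_i - t_i*)."""
    return 0.5 * x * x if abs(x) < 1 else abs(x) - 0.5

def rpn_box_loss(pred, gt, is_foreground):
    """Regression term of equation (16) for one anchor: background anchors
    (p_i* = 0) contribute nothing."""
    if not is_foreground:
        return 0.0
    return sum(smooth_l1(p - g) for p, g in zip(pred, gt))

print(smooth_l1(0.5))   # 0.125
print(smooth_l1(2.0))   # 1.5
print(rpn_box_loss((10, 10, 50, 40), (10.5, 10, 52, 40), True))  # 1.625
```

The quadratic region keeps gradients small for nearly correct boxes while the linear region limits the influence of outliers, which is why Faster R-CNN uses smooth L1 rather than plain L2.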
(P3) In the seventh stage, the overall loss function is designed as follows: the bounding-box regression loss adopts a logistic-regression mapping method and the classification loss adopts gradient descent; during loss reduction, the Adam optimizer is used with parameters α = 0.001, β₁ = 0.9, β₂ = 0.999 and ε = 10e−8.
(P4) during the training process, the learning rate adjustment strategy of the present invention employs a multi-stage decay approach.
Thirdly, screening the logistics vehicle target by using a non-maximum suppression algorithm;
and (4) the logistics vehicle images are processed by the basic network model in the second step to obtain more boundary frames on the same logistics vehicle, so that a method needs to be introduced to screen out redundant boundary frames. The specific operation flow is as follows:
(Q1) From the four-dimensional position information (x_i, y_i, w_i, h_i) of the predicted bounding boxes, the area S_i of every predicted box of each vehicle in the logistics vehicle image is obtained:

S_i = w_i · h_i   (18)
(Q2) In the sixth stage of the basic network model, the bounding-box information and the classification (logistics vehicle or background) are determined by regression. Each real logistics vehicle has several corresponding bounding boxes; these are sorted by probability from large to small, and the bounding box with the maximum probability is selected;
(Q3) The area intersection-over-union I between the selected bounding box and each remaining bounding box is computed in a loop; if I is larger than a preset threshold (0.7 by default in the invention), the box is considered heavily overlapped with the box selected in step (Q2) and is deleted. This continues until all bounding boxes of step (Q2) have been processed.
Denote the selected maximum-probability bounding box by the subscript max, with area S_max = w_max · h_max, and any other remaining bounding box by the subscript oti, with area S_oti = w_oti · h_oti, where (x_max, y_max, w_max, h_max) and (x_oti, y_oti, w_oti, h_oti) are the four-dimensional position information (centre coordinates, box width and box height). The overlap of the two boxes is

w_ovp = min(x_max + w_max/2, x_oti + w_oti/2) − max(x_max − w_max/2, x_oti − w_oti/2)   (19)
h_ovp = min(y_max + h_max/2, y_oti + h_oti/2) − max(y_max − h_max/2, y_oti − h_oti/2)   (20)

If and only if w_ovp > 0 and h_ovp > 0 do the two boxes intersect, with intersection area

S_ovp = w_ovp · h_ovp   (21)

and the intersection-over-union I is calculated as

I = S_ovp / (S_max + S_oti − S_ovp)   (22)

Otherwise I = 0, meaning the two bounding boxes do not intersect, and both are retained.
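Steps (Q1)–(Q3) with the overlap test of equations (19)–(22) can be sketched as a greedy NMS loop; boxes are given as (x, y, w, h) with centre coordinates, the 0.7 threshold is the invention's default, and the function names are illustrative.

```python
def iou_center(box_a, box_b):
    """Intersection-over-union for boxes given as (x, y, w, h) with centre
    coordinates, following the overlap test of equations (19)-(22)."""
    xa, ya, wa, ha = box_a
    xb, yb, wb, hb = box_b
    w_ovp = min(xa + wa / 2, xb + wb / 2) - max(xa - wa / 2, xb - wb / 2)
    h_ovp = min(ya + ha / 2, yb + hb / 2) - max(ya - ha / 2, yb - hb / 2)
    if w_ovp <= 0 or h_ovp <= 0:
        return 0.0                     # boxes do not intersect: both are kept
    s_ovp = w_ovp * h_ovp
    return s_ovp / (wa * ha + wb * hb - s_ovp)

def nms(boxes, scores, threshold=0.7):
    """Greedy non-maximum suppression (steps Q1-Q3): keep the highest-probability
    box, delete any remaining box whose IoU with it exceeds the threshold."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        order = [i for i in order if iou_center(boxes[best], boxes[i]) <= threshold]
    return keep

boxes = [(50, 50, 40, 40), (52, 51, 40, 40), (150, 150, 30, 30)]
scores = [0.95, 0.90, 0.80]
print(nms(boxes, scores))  # [0, 2]
```

The second box overlaps the first with IoU ≈ 0.86 > 0.7 and is deleted, while the distant third box survives, matching the behaviour illustrated by FIG. 4.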
Step four, uniformly normalizing the target characteristics of the logistics vehicles;
Because the bounding boxes retained after non-maximum suppression have different sizes, their features do not match the fixed input dimension of the subsequent fully connected layers. To solve this, a region-of-interest pooling layer is connected after the loss functions of the sixth stage, and the bounding boxes of different sizes are uniformly normalized. The specific operation flow is as follows:
(M1) quantizing the four-dimensional position information of the bounding box on the logistics vehicle image obtained in the third step into integer array coordinates;
(M2) Each quantized bounding box is evenly divided into 4 × 4, 2 × 2 and 1 × 1 grids that are max-pooled, forming a feature vector of fixed length.
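Step (M2) can be sketched in plain Python: one quantized region is pooled over 4 × 4, 2 × 2 and 1 × 1 grids into a fixed 16 + 4 + 1 = 21-value vector per channel, regardless of the region's size (the function name and the 8 × 8 toy region are illustrative).

```python
def roi_multilevel_pool(roi):
    """Step-four normalization sketch: divide one quantized ROI (a 2-D list of
    feature values) into 4x4, 2x2 and 1x1 grids and max-pool each cell,
    yielding a fixed 21-value vector whatever the ROI size."""
    h, w = len(roi), len(roi[0])
    out = []
    for grid in (4, 2, 1):
        for gy in range(grid):
            for gx in range(grid):
                y0, y1 = gy * h // grid, max((gy + 1) * h // grid, gy * h // grid + 1)
                x0, x1 = gx * w // grid, max((gx + 1) * w // grid, gx * w // grid + 1)
                out.append(max(roi[y][x] for y in range(y0, y1) for x in range(x0, x1)))
    return out

roi = [[y * 8 + x for x in range(8)] for y in range(8)]  # toy 8x8 feature map
vec = roi_multilevel_pool(roi)
print(len(vec))   # 21
print(vec[-1])    # 63 (global max from the 1x1 level)
```

Because the output length never depends on the box size, the pooled vectors can be fed directly into the fixed-width fully connected layers of the seventh stage.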
The obtained fixed-dimension feature map is introduced into the seventh stage of the basic network model to obtain the accurate logistics vehicle positioning bounding box and the probability of the corresponding vehicle type.
Preferably, in step 1.1), to reduce the running cost of the model, the invention sets the scaling factor μ to 0.84 and stops scaling when the short-edge pixel count is less than 100 px.
Preferably, the area intersection ratio I of step (Q3) has a threshold value of 0.7.
The invention has the advantages that:
the invention provides a method for positioning logistics vehicle characteristics based on improved false R-CNN, aiming at the management problem of logistics engineering vehicles and the problem that the traditional identification method is difficult to identify due to uncertain factors such as environment, scene and appearance. Firstly, data enhancement is carried out on a logistics vehicle image, so that scene diversity of a sample image is increased; then, constructing a basic network model by using the improved master R-CNN; then, introducing a non-maximum suppression algorithm to screen a logistics vehicle target boundary box; and finally, uniformly normalizing the target characteristics of the logistics vehicles to realize accurate positioning. Therefore, the characteristic positioning performance of the logistics vehicle under the conditions of different environments, scenes and the like is superior to that of the traditional vehicle detection method, the problems in the aspect of logistics engineering vehicle management in a park can be well solved, and the method has certain practical value and application prospect.
Drawings
FIG. 1 is a schematic diagram of the image scaling of a logistics vehicle according to the present invention;
FIG. 2 is a schematic diagram of the image rotation of a logistics vehicle according to the present invention;
FIG. 3 is a diagram of the basic network architecture of the present invention;
FIG. 4 is a comparison of the logistics vehicle image before and after processing by the non-maximum suppression algorithm designed by the invention; FIG. 4a is the image before non-maximum suppression and FIG. 4b the image after non-maximum suppression;
FIG. 5 is a flow chart of unified normalization of target features of the present invention;
FIG. 6 is a technical roadmap for the present invention.
Detailed Description
The technical scheme of the invention is further explained by combining the attached drawings.
In order to overcome the defects in the prior art, the invention provides a method for positioning logistics vehicle characteristics based on improved false R-CNN, aiming at the management problem in the aspect of logistics engineering vehicles and the problems that the traditional identification method is difficult to identify due to uncertain factors such as environment, scene, appearance and the like. Firstly, data enhancement is carried out on a logistics vehicle image, so that scene diversity of a sample image is increased; then, constructing a basic network model by using the improved faster-CNN; then, introducing a non-maximum suppression algorithm to screen a logistics vehicle target boundary box; and finally, uniformly normalizing the target characteristics of the logistics vehicles to realize accurate positioning.
In order to achieve the purpose, the invention adopts the following technical scheme:
a logistics vehicle feature positioning method based on improved faster R-CNN comprises the following steps:
step one, enhancement processing of logistics vehicle images;
for the problems of fixed shooting angle, single background, low detection rate and the like of the logistics vehicles, the invention introduces a data enhancement means, and processes the logistics vehicle images through operations such as multi-scale equal-scale scaling, image rotation, saturation enhancement and the like, so as to increase the scene diversity of the logistics vehicle images, and further identify and position the logistics vehicle images.
1.1) carrying out multi-scale scaling operation on the logistics vehicles;
according to the principle that the high proportion of the specific length, the width and the height of the logistics vehicles in the original images is not damaged, the logistics vehicle images are scaled in multiple scales, and therefore the positioning network can learn the target features in the specific proportion.
Suppose that the pixel coordinate of a certain logistics vehicle before zooming is marked as A0(x0,y0) And the scaled coordinates are denoted as A1(x1,y1) Then A is0And A1Satisfy the relation:
(x1,y1)=(μx0,μy0) (1)
where μ denotes a scaling factor. The above equation corresponds to the image scaling matrix, expressed as the following matrix:
wherein, when mu is more than 1, the image magnification operation is represented; when μ < 1, an image reduction operation is indicated. To reduce the cost of model operation, the present invention sets the scaling factor μ to 0.84 and stops scaling when the short edge pixels are less than 100 pix.
1.2) carrying out rotation operation on the logistics vehicle image;
when the camera shoots the logistics vehicles in fast running, the captured images have great angle difference, and in order to adapt to recognition and positioning of different angles, the logistics vehicle images obtained by capturing need to be subjected to rotation transformation, so that vehicle characteristic information of various angles is generated.
The center of the logistics vehicle image is set as a rotation center O (0,0), the anticlockwise rotation angle of the logistics vehicle image is marked as theta, and when any pixel point P (x, y) in the image is subjected to rotation transformation, the pixel point P is changed into P1(x1,y1) The rotation process is then represented by:
the above formula is a polar coordinate transformation formula, and corresponds to the image rotation matrix, and is expressed as the following matrix:
1.3) carrying out saturation enhancement operation on the logistics vehicle image;
in order to increase the diversity of data samples and enable the feature positioning network to be suitable for complex illumination environments, the method adjusts the saturation of the logistics vehicle images.
The specific flow of adjusting the image saturation is as follows:
(S1) calculating a pixel extremum on the logistics vehicle image;
where rgbMax denotes the pixel maximum value and rgbMin denotes the pixel minimum value.
(S2) saturation calculation;
the saturation S is calculated as follows:
delta=(rgbMax-rgbMin)/255 (6)
value=(rgbMax+rgbMin)/255 (7)
L=value/2 (8)
(S3) adjusting the logistics vehicle image saturation;
setting a saturation parameter β for adjusting the illumination intensity, the calculation process is as follows:
1. if the parameter beta is not less than 0, the intermediate variable is first calculatedThe value of (c):
and (3) adjusting the saturation:
RGB'=RGB+(RGB-L*255)*α (12)
2. if the parameter β <0, then:
RGB'=L*255+(RGB-L*255)*(1+α) (13)
the logistics vehicle image subjected to the scaling processing, the rotation operation and the saturation enhancement is applied to the following steps so as to accurately locate the logistics vehicle characteristics.
Step two, constructing a basic network model;
although the faster-CNN provides three basic networks for feature extraction, in order to obtain better feature extraction effect, the invention adopts the VGGNet-16 basic network as a feature extraction network for classifying logistics vehicles of different vehicle types. Meanwhile, in order to realize the positioning of the logistics vehicles, an RPN network target detection positioning model is added behind a feature extraction module in the third convolution layering of the fifth convolution layer of VGGNet-16.
The detailed design flow of the basic network model constructed by the invention is as follows:
(T1) first stage: firstly, the W × H × 3 image processed in step one is input; then, a convolution operation is carried out on the logistics vehicle image through two consecutive 64-channel convolutional layers, wherein the convolution kernel size is 3 × 3 and the convolution step size is 2; the convolved image is then reduced in dimension through a 64-channel maximum pooling layer with a pooling kernel size of 2 × 2 and a step size of 2. This stage outputs a feature map of correspondingly reduced size.
(T2) second stage: the flow is the same as the first stage, namely, the image obtained in the first stage is input into a second stage network, and then a new feature map is obtained through convolution and pooling operations. But unlike the first stage, the second stage convolution and pooling channels both become 128, with the other parameters being the same as in the first stage.
(T3) third stage: firstly, inputting the image output in the second stage into the network in the third stage; then, carrying out convolution operation on the image through three continuous convolution layers with 256 channels, wherein the convolution kernel size is 3 x 3, and the convolution step size is 2; the convolved image is then reduced in dimension through a 256-pass maximum pooling layer with a pooling kernel size of 2 x 2 and a step size of 2.
(T4) fourth stage: the process is the same as the third stage, namely, the image obtained in the third stage is input into a fourth stage network, and then a new characteristic diagram is obtained through convolution and pooling operations. But unlike the third stage, the convolution and pooling channels of the fourth stage both become 512, and the other parameters are the same as those of the third stage.
(T5) fifth stage: this stage consists of three convolutional layers, each with 512 channels, a convolution kernel size of 3 × 3, and a convolution step size of 2. The feature map output at this stage is passed to the sixth stage.
(T6) sixth stage: firstly, a convolutional layer with a convolution kernel size of 3 × 3, a convolution step size of 2, and 512 convolution channels is connected; then a two-class classification loss function and a bounding box regression loss function are connected, which regress and judge the box information and the classification information (i.e., the probability that the box most likely belongs to a certain vehicle type) of each box belonging to a logistics vehicle or the background;
(T7) seventh stage: firstly, connecting two 4096-channel full-connection layers; then connecting an overall loss function; and finally, outputting the accurate logistics vehicle positioning boundary frame and the probability of the corresponding vehicle type.
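Under the standard VGGNet-16 configuration (stride-1 3 × 3 convolutions with padding, and a 2 × 2 stride-2 max pooling closing each of the first four stages — an assumption here, since the stage output sizes above are illegible), the spatial size of the feature map entering the fifth stage can be sketched as:

```python
def vgg16_feature_size(w, h):
    # Each of stages one to four ends in a 2x2, stride-2 max pooling,
    # so the spatial resolution is halved four times in total.
    for _ in range(4):
        w, h = w // 2, h // 2
    return w, h
```

For a 224 × 224 input this gives the familiar 14 × 14 conv5 feature map of VGGNet-16.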
In the above basic network structure, parameters such as the activation function and the loss functions are designed as follows:
(P1) in the VGGNet-16 basic network, the activation function connected after every convolutional layer is the ReLU function used in the present invention:
ReLU(x) = max(0, x) (14)
(P2) in the sixth stage, the classification loss function and bounding box regression loss function used are designed as follows:
The classification loss L_rpn_cls is expressed as:
L_rpn_cls = −[p_i* · log(p_i) + (1 − p_i*) · log(1 − p_i)] (15)
where p_i represents the probability that box i is a logistics vehicle or the background; p_i* represents the label of the real box corresponding to box i: it is recorded as 1 if the real box is foreground, and 0 otherwise.
The bounding box regression loss L_rpn_box is expressed as:
L_rpn_box = Σ_i p_i* · smoothL1(t_i − t_i*) (16)
where t_i represents the four-dimensional position information of predicted box i, denoted t_i = (x_i, y_i, w_i, h_i); t_i* represents the four-dimensional position information of the corresponding real box, denoted t_i* = (x_i*, y_i*, w_i*, h_i*). The smoothL1 function is represented as follows:
smoothL1(x) = 0.5x², if |x| < 1; |x| − 0.5, otherwise (17)
(P3) in the seventh stage, the overall loss function design rule is: the bounding box regression loss adopts a logistic regression mapping method, and the classification loss adopts a gradient descent method. In the process of loss reduction, the Adam gradient descent method is used to optimize the loss function, with the corresponding parameters set to α = 0.001, β1 = 0.9, β2 = 0.999, and ε = 10E−8.
(P4) during the training process, the learning rate adjustment strategy of the present invention employs a multi-stage decay approach.
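The two stage-six losses can be sketched per anchor; the function names below are illustrative, and the binary cross-entropy and smooth L1 forms are assumptions consistent with the standard Faster R-CNN formulation:

```python
import math

def smooth_l1(x):
    # smoothL1 of Eq. (17): quadratic near zero, linear elsewhere.
    return 0.5 * x * x if abs(x) < 1 else abs(x) - 0.5

def rpn_cls_loss(p, p_star):
    # Binary cross-entropy of Eq. (15): p is the predicted foreground
    # probability, p_star is the 0/1 ground-truth label.
    # (p must lie strictly inside (0, 1) to avoid log(0).)
    return -(p_star * math.log(p) + (1 - p_star) * math.log(1 - p))
```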
Thirdly, screening the logistics vehicle target by using a non-maximum suppression algorithm;
After the logistics vehicle images are processed by the basic network model of step two, multiple bounding boxes are obtained on the same logistics vehicle, so a method needs to be introduced to screen out the redundant bounding boxes. The specific operation flow is as follows:
(Q1) based on the four-dimensional position information (x_i, y_i, w_i, h_i) of the predicted bounding boxes, the area S_i of all predicted boxes of each vehicle in the logistics vehicle image can be obtained:
S_i = w_i * h_i (18)
(Q2) in the sixth stage of the basic network model, the box information and classification information belonging to a logistics vehicle or the background are determined through regression. Each real logistics vehicle has multiple corresponding bounding boxes; these are sorted by probability from large to small, and the bounding box with the maximum probability is screened out;
(Q3) circularly calculating the area intersection ratio I of the screened boundary frame and the rest boundary frames, if I is larger than a preset threshold (the default of the invention is that the threshold is 0.7), the boundary frame is considered to be heavily overlapped with the boundary frame screened in the step (Q2), and then the boundary frame is deleted until all the boundary frames in the step (Q2) are processed.
The two bounding boxes overlap in the horizontal direction if and only if the condition
|x_max − x_oti| < (w_max + w_oti)/2 (19)
is satisfied, and overlap in the vertical direction if and only if the condition
|y_max − y_oti| < (h_max + h_oti)/2 (20)
is satisfied.
The constraints in the above formulas (19) to (20) may equivalently be written without the absolute value; taking the constraint in formula (19) as an example, it may be varied as follows:
−(w_max + w_oti)/2 < x_max − x_oti < (w_max + w_oti)/2 (21)
When the constraint conditions in formulas (19) and (20) are both satisfied, the intersection area is S_ovp = ((w_max + w_oti)/2 − |x_max − x_oti|) · ((h_max + h_oti)/2 − |y_max − y_oti|), and the calculation formula of the intersection-over-union ratio I is:
I = S_ovp/(S_max + S_oti − S_ovp) (22)
Otherwise, I = 0, meaning that the two bounding boxes do not intersect, and both are then retained.
In formulas (19) to (22), max denotes the screened maximum bounding box, whose area is denoted S_max; oti denotes any other bounding box, whose area is denoted S_oti; the intersection area between the two bounding boxes is denoted S_ovp; (x_max, y_max, w_max, h_max) denotes the four-dimensional position information of the screened maximum bounding box, namely the center coordinates, box width, and box height; (x_oti, y_oti, w_oti, h_oti) denotes the four-dimensional position information of any remaining bounding box.
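The suppression flow (Q1)–(Q3) can be sketched as follows, using center-format boxes (x, y, w, h) and the overlap test underlying formulas (19)–(22); the function name and box layout are illustrative:

```python
def nms(boxes, probs, thresh=0.7):
    # Greedy non-maximum suppression.  boxes are (x, y, w, h) with
    # (x, y) the box center; probs are the per-box confidence scores.
    order = sorted(range(len(boxes)), key=lambda i: probs[i], reverse=True)
    keep = []
    while order:
        m = order.pop(0)            # box with the current maximum probability
        keep.append(m)
        xm, ym, wm, hm = boxes[m]
        survivors = []
        for i in order:
            xi, yi, wi, hi = boxes[i]
            # Overlap extents; positive exactly when conditions (19)/(20) hold.
            ow = min(xm + wm / 2, xi + wi / 2) - max(xm - wm / 2, xi - wi / 2)
            oh = min(ym + hm / 2, yi + hi / 2) - max(ym - hm / 2, yi - hi / 2)
            if ow > 0 and oh > 0:
                s_ovp = ow * oh
                iou = s_ovp / (wm * hm + wi * hi - s_ovp)   # Eq. (22)
            else:
                iou = 0.0           # disjoint boxes: both are retained
            if iou <= thresh:
                survivors.append(i)
        order = survivors
    return keep
```

A duplicate detection with intersection-over-union above the 0.7 threshold is deleted, while distant boxes survive.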
Step four, uniformly normalizing the target characteristics of the logistics vehicles;
in order to solve the problem that the dimensions of the subsequent connection layers do not match because the bounding boxes retained after non-maximum suppression have different sizes, a region-of-interest pooling layer is connected after the loss functions of the sixth stage, and the bounding boxes of different sizes are uniformly normalized. The specific operation flow is as follows:
(M1) quantizing the four-dimensional position information of the bounding box on the logistics vehicle image obtained in the third step into integer array coordinates;
(M2) the quantized bounding box region is evenly divided and max-pooled over 4 × 4, 2 × 2, and 1 × 1 grids, forming a fixed-length data dimension.
The obtained feature map of fixed data dimension is introduced into the seventh stage of the basic network model to obtain the accurate logistics vehicle positioning bounding box and the probability of the corresponding vehicle type.
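The unified normalization of step four can be sketched as a fixed-length multi-scale max pooling over the quantized box region; the helper below is illustrative (a real region-of-interest pooling layer operates on multi-channel feature maps):

```python
def roi_multi_pool(feat, box):
    # Max-pool the integer box region (x0, y0, x1, y1) of a 2-D feature
    # map over 4x4, 2x2 and 1x1 grids, yielding a fixed 21-element vector
    # regardless of the box size, as in step (M2).
    x0, y0, x1, y1 = box
    region = [row[x0:x1] for row in feat[y0:y1]]
    h, w = len(region), len(region[0])
    out = []
    for g in (4, 2, 1):
        for gy in range(g):
            for gx in range(g):
                # Each grid cell covers at least one feature-map element.
                ys = range(gy * h // g, max(gy * h // g + 1, (gy + 1) * h // g))
                xs = range(gx * w // g, max(gx * w // g + 1, (gx + 1) * w // g))
                out.append(max(region[y][x] for y in ys for x in xs))
    return out
```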
The embodiments described in this specification merely illustrate implementations of the inventive concept; the scope of the present invention should not be considered limited to the specific forms set forth in the embodiments, but also covers equivalents that may occur to those skilled in the art upon consideration of the inventive concept.
Claims (3)
1. A logistics vehicle feature positioning method based on improved Faster R-CNN comprises the following steps:
step one, enhancement processing of logistics vehicle images;
introducing a data enhancement means, and processing the logistics vehicle images through multi-scale proportional scaling, image rotation, saturation enhancement and similar operations so as to increase the scene diversity of the logistics vehicle images for subsequent identification and positioning;
1.1) carrying out multi-scale scaling operation on the logistics vehicles;
scaling the logistics vehicle images at multiple scales without destroying the specific length, width, and height proportions of the logistics vehicles in the original images, so that the positioning network can learn the target features at their specific proportions;
suppose the pixel coordinate of a certain logistics vehicle before scaling is denoted A_0(x_0, y_0) and the scaled coordinate is denoted A_1(x_1, y_1); then A_0 and A_1 satisfy the relation:
(x_1, y_1) = (μx_0, μy_0) (1)
where μ represents a scaling factor; the above equation corresponds to the image scaling matrix, expressed as the following matrix:
[x_1, y_1]^T = [[μ, 0], [0, μ]] · [x_0, y_0]^T (2)
wherein, when μ > 1, an image magnification operation is represented; when μ < 1, an image reduction operation is represented, which serves to reduce the cost of model operation;
1.2) carrying out rotation operation on the logistics vehicle image;
when the camera shoots the logistics vehicles in rapid driving, the captured images have great angle difference, and in order to adapt to the identification and positioning of different angles, the captured logistics vehicle images need to be subjected to rotation transformation, so that vehicle characteristic information of various angles is generated;
setting the center of the logistics vehicle image as the rotation center O(0, 0) and denoting the counterclockwise rotation angle of the image as θ, any pixel point P(x, y) in the image becomes P_1(x_1, y_1) after the rotation transformation; the rotation process is then represented by:
(x_1, y_1) = (x cos θ − y sin θ, x sin θ + y cos θ) (3)
the above formula is a polar coordinate transformation formula; it corresponds to the image rotation matrix, expressed as the following matrix:
[x_1, y_1]^T = [[cos θ, −sin θ], [sin θ, cos θ]] · [x, y]^T (4)
1.3) carrying out saturation enhancement operation on the logistics vehicle image;
in order to increase the diversity of data samples and enable the feature positioning network to be suitable for complex illumination environments, the saturation of the logistics vehicle images is adjusted;
the specific flow of adjusting the image saturation is as follows:
(S1) calculating a pixel extremum on the logistics vehicle image;
rgbMax = max(R, G, B), rgbMin = min(R, G, B) (5)
wherein rgbMax denotes the pixel maximum value and rgbMin denotes the pixel minimum value;
(S2) saturation calculation;
the saturation S is calculated as follows:
delta=(rgbMax-rgbMin)/255 (6)
value=(rgbMax+rgbMin)/255 (7)
L=value/2 (8)
S = delta/value, if L < 0.5; S = delta/(2 − value), if L ≥ 0.5 (9)
(S3) adjusting the logistics vehicle image saturation;
setting a saturation parameter β for adjusting the illumination intensity, the calculation process is as follows:
1. if the parameter β ≥ 0, the intermediate variable α is first calculated:
α = S, if S + β ≥ 1; α = 1 − β, if S + β < 1 (10)
α = 1/α − 1 (11)
and the saturation is then adjusted:
RGB'=RGB+(RGB-L*255)*α (12)
2. if the parameter β <0, then:
RGB'=L*255+(RGB-L*255)*(1+α) (13)
the logistics vehicle image which is subjected to scaling processing, rotation operation and saturation enhancement is applied to the following steps so as to accurately position the characteristics of the logistics vehicle;
step two, constructing a basic network model;
the VGGNet-16 basic network is adopted as the feature extraction network for classifying logistics vehicles of different vehicle types; meanwhile, in order to realize the positioning of the logistics vehicles, an RPN target detection and positioning model is added behind the feature extraction module at the third convolutional sublayer of the fifth convolutional stage of VGGNet-16;
the steps of constructing the basic network model are as follows:
(T1) first stage: firstly, the W × H × 3 image processed in step one is input; then, a convolution operation is carried out on the logistics vehicle image through two consecutive 64-channel convolutional layers, wherein the convolution kernel size is 3 × 3 and the convolution step size is 2; subsequently, the convolved image is reduced in dimension through a 64-channel maximum pooling layer, wherein the pooling kernel size is 2 × 2 and the step size is 2; this stage outputs a feature map of correspondingly reduced size;
(T2) second stage: the flow is the same as the first stage, namely, the image obtained in the first stage is input into a second stage network, and then a new characteristic diagram is obtained through convolution and pooling operations; but the convolution and pooling channels of the second stage are changed to 128 unlike the first stage, and other parameters are the same as those of the first stage;
(T3) third stage: firstly, inputting the image output in the second stage into the network in the third stage; then, carrying out convolution operation on the image through three continuous convolution layers with 256 channels, wherein the convolution kernel size is 3 x 3, and the convolution step size is 2; then, reducing the dimension of the convolved image through a maximum pooling layer of 256 channels, wherein the size of a pooling kernel is 2 x 2, and the step size is 2;
(T4) fourth stage: the process is the same as the third stage, namely, the image obtained in the third stage is input into a fourth stage network, and then a new characteristic diagram is obtained through convolution and pooling operations; but the difference from the third stage is that the convolution and pooling channels of the fourth stage are both changed to 512, and other parameters are the same as those of the third stage;
(T5) fifth stage: this stage consists of three convolutional layers, each with 512 channels, a convolution kernel size of 3 × 3, and a convolution step size of 2; the feature map output at this stage is passed to the sixth stage;
(T6) sixth stage: firstly, a convolutional layer with a convolution kernel size of 3 × 3, a convolution step size of 2, and 512 convolution channels is connected; then a two-class classification loss function and a bounding box regression loss function are connected, which regress and judge the box information and the classification information (i.e., the probability that the box most likely belongs to a certain vehicle type) of each box belonging to a logistics vehicle or the background;
(T7) seventh stage: firstly, connecting two 4096-channel full-connection layers; then connecting an overall loss function; finally, outputting an accurate logistics vehicle positioning boundary frame and the probability of the corresponding vehicle type;
in the above infrastructure network structure, the design of parameters related to the activation function and the loss function specifically includes:
(P1) in the VGGNet-16 basic network, the ReLU activation function is used for all activation functions connected after the convolutional layers:
ReLU(x) = max(0, x) (14)
(P2) in the sixth stage, using the classification loss function and bounding box regression loss function:
the classification loss L_rpn_cls is expressed as:
L_rpn_cls = −[p_i* · log(p_i) + (1 − p_i*) · log(1 − p_i)] (15)
wherein p_i represents the probability that box i is a logistics vehicle or the background; p_i* represents the label of the real box corresponding to box i: it is recorded as 1 if the real box is foreground, and 0 otherwise;
the bounding box regression loss L_rpn_box is expressed as:
L_rpn_box = Σ_i p_i* · smoothL1(t_i − t_i*) (16)
wherein t_i represents the four-dimensional position information of predicted box i, denoted t_i = (x_i, y_i, w_i, h_i); t_i* represents the four-dimensional position information of the corresponding real box, denoted t_i* = (x_i*, y_i*, w_i*, h_i*); the smoothL1 function is represented as follows:
smoothL1(x) = 0.5x², if |x| < 1; |x| − 0.5, otherwise (17)
(P3) in the seventh stage, the overall loss function design rule is: the bounding box regression loss adopts a logistic regression mapping method, and the classification loss adopts a gradient descent method; in the process of loss reduction, the Adam gradient descent method is used to optimize the loss function, with the corresponding parameters set to α = 0.001, β1 = 0.9, β2 = 0.999, and ε = 10E−8;
(P4) during the training process, the learning rate adjustment strategy adopts a multi-stage attenuation method;
thirdly, screening the logistics vehicle target by using a non-maximum suppression algorithm;
the logistics vehicle image is processed by the basic network model in the second step to obtain more boundary frames on the same logistics vehicle, so that a method needs to be introduced to screen out the redundant boundary frames; the specific operation flow is as follows:
(Q1) based on the four-dimensional position information (x_i, y_i, w_i, h_i) of the predicted bounding boxes, the area S_i of all predicted boxes of each vehicle in the logistics vehicle image can be obtained:
S_i = w_i * h_i (18)
(Q2) in the sixth stage of the basic network model, the box information and classification information belonging to a logistics vehicle or the background are determined through regression; each real logistics vehicle has multiple corresponding bounding boxes; these are sorted by probability from large to small, and the bounding box with the maximum probability is screened out;
(Q3) circularly calculating the area intersection ratio I of the screened boundary frame and the rest boundary frames, if I is larger than a preset threshold value, determining that the boundary frame is heavily overlapped with the boundary frame screened in the step (Q2), and then deleting the boundary frame until all the boundary frames in the step (Q2) are processed;
the two bounding boxes overlap in the horizontal direction if and only if the condition
|x_max − x_oti| < (w_max + w_oti)/2 (19)
is satisfied, and overlap in the vertical direction if and only if the condition
|y_max − y_oti| < (h_max + h_oti)/2 (20)
is satisfied; the constraints in formulas (19) to (20) may equivalently be written without the absolute value, and taking the constraint in formula (19) as an example, it may be varied as follows:
−(w_max + w_oti)/2 < x_max − x_oti < (w_max + w_oti)/2 (21)
when the constraint conditions in formulas (19) and (20) are both satisfied, the intersection area is S_ovp = ((w_max + w_oti)/2 − |x_max − x_oti|) · ((h_max + h_oti)/2 − |y_max − y_oti|), and the calculation formula of the intersection-over-union ratio I is:
I = S_ovp/(S_max + S_oti − S_ovp) (22)
otherwise I = 0, meaning that the two bounding boxes do not intersect, and both are retained;
in formulas (19) to (22), max denotes the screened maximum bounding box, whose area is denoted S_max; oti denotes any other bounding box, whose area is denoted S_oti; the intersection area between the two bounding boxes is denoted S_ovp; (x_max, y_max, w_max, h_max) denotes the four-dimensional position information of the screened maximum bounding box, namely the center coordinates, box width, and box height; and (x_oti, y_oti, w_oti, h_oti) denotes the four-dimensional position information of any remaining bounding box;
step four, uniformly normalizing the target characteristics of the logistics vehicles;
in order to solve the problem that the dimensions of the subsequent connection layers do not match because the bounding boxes retained after non-maximum suppression have different sizes, a region-of-interest pooling layer is connected after the loss functions of the sixth stage, and the bounding boxes of different sizes are uniformly normalized; the specific operation flow is as follows:
(M1) quantizing the four-dimensional position information of the bounding box on the logistics vehicle image obtained in the third step into integer array coordinates;
(M2) the quantized bounding box region is evenly divided and max-pooled over 4 × 4, 2 × 2, and 1 × 1 grids, forming a fixed-length data dimension;
the obtained feature map of fixed data dimension is introduced into the seventh stage of the basic network model to obtain the accurate logistics vehicle positioning bounding box and the probability of the corresponding vehicle type.
2. The logistics vehicle feature positioning method based on improved Faster R-CNN as claimed in claim 1, wherein: the threshold value of the area intersection ratio I in step (Q3) is 0.7.
3. The logistics vehicle feature positioning method based on improved Faster R-CNN as claimed in claim 1, wherein: step 1.1) sets the scaling factor μ to 0.84, and scaling stops when the short-edge pixel count is less than 100 pix.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010690178.9A CN111986080B (en) | 2020-07-17 | 2020-07-17 | Logistics vehicle feature positioning method based on improved master R-CNN |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111986080A true CN111986080A (en) | 2020-11-24 |
CN111986080B CN111986080B (en) | 2024-01-16 |
Family
ID=73438739
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010690178.9A Active CN111986080B (en) | 2020-07-17 | 2020-07-17 | Logistics vehicle feature positioning method based on improved master R-CNN |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111986080B (en) |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107832794A (en) * | 2017-11-09 | 2018-03-23 | 车智互联(北京)科技有限公司 | A kind of convolutional neural networks generation method, the recognition methods of car system and computing device |
CN109034024A (en) * | 2018-07-16 | 2018-12-18 | 浙江工业大学 | Logistics vehicles vehicle classification recognition methods based on image object detection |
CN110175524A (en) * | 2019-04-26 | 2019-08-27 | 南京航空航天大学 | A kind of quick vehicle checking method of accurately taking photo by plane based on lightweight depth convolutional network |
Also Published As
Publication number | Publication date |
---|---|
CN111986080B (en) | 2024-01-16 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN106599773B (en) | Deep learning image identification method and system for intelligent driving and terminal equipment | |
CN108615226B (en) | Image defogging method based on generation type countermeasure network | |
CN109684922B (en) | Multi-model finished dish identification method based on convolutional neural network | |
CN113052210B (en) | Rapid low-light target detection method based on convolutional neural network | |
CN112101175A (en) | Expressway vehicle detection and multi-attribute feature extraction method based on local images | |
CN109345547B (en) | Traffic lane line detection method and device based on deep learning multitask network | |
CN111639564B (en) | Video pedestrian re-identification method based on multi-attention heterogeneous network | |
CN110796009A (en) | Method and system for detecting marine vessel based on multi-scale convolution neural network model | |
CN109886161B (en) | Road traffic identification recognition method based on likelihood clustering and convolutional neural network | |
CN109753878B (en) | Imaging identification method and system under severe weather | |
CN111914698B (en) | Human body segmentation method, segmentation system, electronic equipment and storage medium in image | |
CN110929593A (en) | Real-time significance pedestrian detection method based on detail distinguishing and distinguishing | |
CN111915525A (en) | Low-illumination image enhancement method based on improved depth separable generation countermeasure network | |
CN111368830A (en) | License plate detection and identification method based on multi-video frame information and nuclear phase light filtering algorithm | |
CN110706239A (en) | Scene segmentation method fusing full convolution neural network and improved ASPP module | |
CN112132145B (en) | Image classification method and system based on model extended convolutional neural network | |
CN111738056A (en) | Heavy truck blind area target detection method based on improved YOLO v3 | |
CN111768415A (en) | Image instance segmentation method without quantization pooling | |
CN112200746A (en) | Defogging method and device for traffic scene image in foggy day | |
CN108345835B (en) | Target identification method based on compound eye imitation perception | |
CN113159043A (en) | Feature point matching method and system based on semantic information | |
CN110889360A (en) | Crowd counting method and system based on switching convolutional network | |
CN114627269A (en) | Virtual reality security protection monitoring platform based on degree of depth learning target detection | |
CN116129291A (en) | Unmanned aerial vehicle animal husbandry-oriented image target recognition method and device | |
CN115019340A (en) | Night pedestrian detection algorithm based on deep learning |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||