CN111986080B - Logistics vehicle feature positioning method based on improved faster R-CNN - Google Patents
Logistics vehicle feature positioning method based on improved faster R-CNN
- Publication number: CN111986080B (application CN202010690178.9A)
- Authority: CN (China)
- Prior art keywords: stage, image, logistics, logistics vehicle, convolution
- Legal status: Active (the legal status is an assumption and is not a legal conclusion)
Classifications
- G06T3/4084 — Scaling of whole images or parts thereof in the transform domain, e.g. fast Fourier transform [FFT] domain scaling
- G06T3/60 — Rotation of whole images or parts thereof
- G06T7/62 — Analysis of geometric attributes of area, perimeter, diameter or volume
- G06T7/73 — Determining position or orientation of objects or cameras using feature-based methods
- G06N3/045 — Combinations of networks
- G06V10/25 — Determination of region of interest [ROI] or a volume of interest [VOI]
- G06V20/52 — Surveillance or monitoring of activities, e.g. for recognising suspicious objects
- G06V2201/08 — Detecting or categorising vehicles
- Y02T10/10 — Internal combustion engine [ICE] based vehicles
- Y02T10/40 — Engine management systems
Abstract
A logistics vehicle feature localization method based on an improved faster R-CNN, comprising: step one, performing image enhancement on the logistics vehicle images by introducing data augmentation; step two, constructing a basic network model, adopting the VGGNet-16 network as the feature extraction network and, to localize the logistics vehicles, adding an RPN-based target detection and localization model after the feature extraction module at the third convolutional sublayer of the fifth convolutional block of VGGNet-16; step three, screening the logistics vehicle targets with a non-maximum suppression algorithm; step four, performing unified normalization on the logistics vehicle target features, and passing the resulting fixed-dimension feature map to the seventh stage of the basic network model to obtain the precise logistics vehicle positioning bounding box and the probability of the corresponding vehicle type. The invention achieves good feature localization performance on logistics vehicles across different environments and scenes.
Description
Technical Field
The invention relates to a logistics vehicle feature localization method based on an improved faster R-CNN.
Background Art
In recent years, with the development of transport logistics, more and more logistics vehicles serve people's work and life, but the sheer number of logistics and engineering vehicles makes parking management within a park considerably more difficult. Although drop-and-pull operations can improve cargo-loading efficiency, at present logistics vehicles still occupy parking spaces unreasonably and drop-and-pull trips cannot be charged accurately; more seriously, some vehicle owners resort to extremely dangerous behavior such as using fake license plates to evade monitoring and detection.
To manage logistics and engineering vehicles effectively, there are today many examples of identifying logistics vehicles of different types with technical means such as computer vision. Most such methods obtain vehicle images from a traffic-intersection camera or an image acquisition card; because the vehicle appears somewhere within a natural-environment image, the precise position of the vehicle in the image must first be found, after which vehicle feature extraction is performed to achieve vehicle-type recognition. However, current recognition methods face the following main difficulties: (1) recognition is strongly affected by illumination: the same vehicle looks different in sunny, rainy, snowy and other conditions, which can lead to misrecognition; (2) the scene in which the vehicle is located is complex and changeable: in cluttered backgrounds such as rural roads, foreground and background cannot be separated quickly and accurately; (3) vehicle appearance is highly variable: different vehicle types differ in color, shape, brand, size and other parameters that affect feature recognition. In short, feature recognition of logistics vehicles by computer vision is still affected by uncertain factors such as environment, scene and appearance, which makes recognition difficult.
Disclosure of Invention
To overcome the deficiencies of the prior art, and aiming at the management problem of logistics and engineering vehicles and the difficulty that traditional recognition methods have with uncertain factors such as environment, scene and appearance, the invention provides a logistics vehicle feature localization method based on an improved faster R-CNN.
First, data augmentation is applied to the logistics vehicle images to increase the scene diversity of the sample images; then a basic network model is constructed with the improved faster R-CNN; next, a non-maximum suppression algorithm is introduced to screen the logistics vehicle target bounding boxes; finally, the logistics vehicle target features are uniformly normalized to achieve precise localization.
In order to achieve the above purpose, the present invention adopts the following technical scheme:
A logistics vehicle feature localization method based on an improved faster R-CNN, comprising the following steps:
step one, carrying out image enhancement processing on logistics vehicles;
To address the problems of fixed shooting angles, monotonous backgrounds and low detection rates for logistics vehicles, the invention introduces data augmentation: the logistics vehicle images are processed by operations such as multi-scale proportional scaling, image rotation and saturation enhancement, increasing scene diversity for further recognition and localization.
1.1) A multi-scale scaling operation is carried out on the logistics vehicle image;
The logistics vehicle image is scaled at multiple scales without altering the aspect ratio of the logistics vehicle in the original image, so that the localization network can learn target features at specific proportions.
Suppose a pixel of the logistics vehicle image before scaling is denoted A0(x0, y0) and its coordinate after scaling is A1(x1, y1); then A0 and A1 satisfy:

(x1, y1) = (μx0, μy0)   (1)

where μ denotes the scaling factor. In homogeneous coordinates, the above relation corresponds to the image scaling matrix:

[x1]   [μ 0 0][x0]
[y1] = [0 μ 0][y0]   (2)
[1 ]   [0 0 1][1 ]

When μ > 1 the operation enlarges the image; when μ < 1 it reduces it.
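Equations (1)-(2) can be checked with a small sketch in homogeneous coordinates (an illustrative reconstruction; the helper name `scale_point` is not part of the patent):

```python
import numpy as np

def scale_point(x0, y0, mu):
    """Apply the scaling of Eqs. (1)-(2) to one pixel coordinate."""
    M = np.array([[mu, 0, 0],
                  [0, mu, 0],
                  [0,  0, 1]], dtype=float)  # scaling matrix of Eq. (2)
    x1, y1, _ = M @ np.array([x0, y0, 1.0])
    return x1, y1
```

For example, reducing with μ = 0.5 maps the pixel (100, 50) to (50, 25), matching Eq. (1) directly.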
1.2) Rotating the logistics vehicle image;
When the camera captures a fast-moving logistics vehicle, the captured images can differ greatly in angle. To adapt recognition and localization to different angles, the captured logistics vehicle images are rotated, generating vehicle feature information at various angles.
The invention takes the center of the logistics vehicle image as the rotation center O(0, 0) and denotes the counterclockwise rotation angle by θ. When an arbitrary pixel P(x, y) in the image becomes P1(x1, y1) after the rotation transform, the rotation is represented by the following equations:

x1 = x·cosθ − y·sinθ
y1 = x·sinθ + y·cosθ   (3)

This polar-coordinate transformation corresponds to the image rotation matrix:

[x1]   [cosθ  −sinθ][x]
[y1] = [sinθ   cosθ][y]   (4)
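A minimal sketch of Eqs. (3)-(4), assuming the mathematical counterclockwise convention (in image coordinates, where y points down, the visual direction of rotation is reversed; `rotate_point` is an illustrative name):

```python
import numpy as np

def rotate_point(x, y, theta):
    """Rotate P(x, y) counterclockwise by theta (radians) about the
    rotation center O(0, 0), as in Eqs. (3)-(4)."""
    R = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])  # rotation matrix of Eq. (4)
    x1, y1 = R @ np.array([x, y], dtype=float)
    return x1, y1
```

For instance, a 90-degree rotation takes the point (1, 0) to (0, 1), up to floating-point error.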
1.3) Performing a saturation enhancement operation on the logistics vehicle image;
in order to increase the diversity of data samples and enable the feature positioning network to be suitable for complex illumination environments, the saturation of the logistics vehicle image is adjusted.
The specific flow of adjusting the image saturation is as follows:
(S1) Calculate the pixel extrema of the logistics vehicle image:

rgbMax = max(R, G, B), rgbMin = min(R, G, B)   (5)

where rgbMax represents the pixel maximum and rgbMin the pixel minimum over the three color channels.
(S2) Saturation calculation;
the saturation S is calculated as follows:

delta = (rgbMax − rgbMin)/255   (6)
value = (rgbMax + rgbMin)/255   (7)
L = value/2   (8)
S = delta/value,       if L < 0.5
S = delta/(2 − value), otherwise   (9)
(S3) adjusting the logistics vehicle image saturation;
A saturation parameter β is set to adjust the illumination intensity; the calculation flow is as follows:

1. If the parameter β ≥ 0, first obtain the value of the intermediate variable α:

α = S,      if β + S ≥ 1
α = 1 − β,  otherwise   (10)

update the value of α:

α = 1/α − 1   (11)

and adjust the saturation:

RGB' = RGB + (RGB − L·255)·α   (12)

2. If the parameter β < 0, set α = β; then:

RGB' = L·255 + (RGB − L·255)·(1 + α)   (13)
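The flow (S1)-(S3) can be sketched per pixel as follows. This is a reconstruction: the intermediate-variable rule for α follows the common HSL-based saturation adjustment, an assumption where the source formulas are garbled, and the grey-pixel early return is added for safety:

```python
def adjust_saturation(r, g, b, beta):
    """Adjust the saturation of one RGB pixel (0-255 channels) by the
    parameter beta, following Eqs. (5)-(13)."""
    rgb_max, rgb_min = max(r, g, b), min(r, g, b)        # Eq. (5)
    delta = (rgb_max - rgb_min) / 255.0                  # Eq. (6)
    value = (rgb_max + rgb_min) / 255.0                  # Eq. (7)
    L = value / 2.0                                      # Eq. (8)
    if delta == 0:                  # grey pixel: saturation is undefined
        return (r, g, b)
    S = delta / value if L < 0.5 else delta / (2 - value)   # Eq. (9)
    if beta >= 0:
        alpha = S if beta + S >= 1 else 1 - beta         # Eq. (10)
        alpha = 1.0 / alpha - 1.0                        # Eq. (11)
        return tuple(c + (c - L * 255) * alpha for c in (r, g, b))        # Eq. (12)
    alpha = beta
    return tuple(L * 255 + (c - L * 255) * (1 + alpha) for c in (r, g, b))  # Eq. (13)
```

With β = −1 every channel collapses to the lightness L·255, i.e. full desaturation, which is a quick way to verify the β < 0 branch.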
the logistics vehicle image subjected to scaling, rotation operation and saturation enhancement is applied to the following steps so as to accurately position logistics vehicle characteristics.
Step two, constructing a basic network model;
Although faster R-CNN provides three candidate backbone networks for feature extraction, to obtain a better feature extraction effect the VGGNet-16 network is adopted as the feature extraction network for classifying logistics vehicles of different types. Meanwhile, to localize the logistics vehicles, the invention adds an RPN-based target detection and localization model after the feature extraction module at the third convolutional sublayer of the fifth convolutional block of VGGNet-16.
The detailed design flow of the basic network model constructed by the invention is as follows:
(T1) First stage: first, the W × H × 3 image processed in step one is input; the logistics vehicle image is then convolved by two consecutive 64-channel convolutional layers, each with a 3×3 kernel and stride 1; the convolved image is then reduced in dimension by a 64-channel max pooling layer with a 2×2 pooling kernel and stride 2. This stage outputs a feature map of size (W/2) × (H/2) × 64.
(T2) second stage: the flow is the same as the first stage, namely the image obtained in the first stage is input into a second stage network, and then a new characteristic diagram is obtained through convolution and pooling operation. But unlike the first stage, the convolution and pooling channels of the second stage are each changed to 128, and the other parameters are the same as the first stage.
(T3) Third stage: first, the feature map output by the second stage is input into the third-stage network; the image is then convolved by three consecutive 256-channel convolutional layers, each with a 3×3 kernel and stride 1; the convolved image is then reduced in dimension by a 256-channel max pooling layer with a 2×2 pooling kernel and stride 2.
(T4) fourth stage: the flow is the same as the third stage, namely the image obtained in the third stage is input into a fourth stage network, and then a new characteristic diagram is obtained through convolution and pooling operation. But unlike the third stage, the convolution and pooling channels of the fourth stage are each changed to 512, and the other parameters are the same as the third stage.
(T5) Fifth stage: this stage consists of three convolutional layers, each with 512 channels, a 3×3 kernel and stride 1. The feature map output by this stage has size (W/16) × (H/16) × 512.
(T6) Sixth stage: first, a convolutional layer with a 3×3 kernel, stride 1 and 512 channels is attached; then a classification loss function and a bounding-box regression loss function are attached, which regress the frame information and judge the classification information (the probability of the most likely vehicle type) of each frame as logistics vehicle or background;
(T7) Seventh stage: first, two 4096-channel fully connected layers are attached; then the total loss function; finally the precise logistics vehicle positioning bounding box and the probability of the corresponding vehicle type are output.
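As a sanity check on the feature-map sizes stated in stages (T1)-(T5), the spatial dimensions can be traced with a short sketch. This is an illustrative reconstruction assuming size-preserving (stride-1, padding-1) 3×3 convolutions and stride-2 poolings, which is what the stated W/2 and W/16 sizes imply; `feature_map_sizes` is a hypothetical helper, not part of the patent:

```python
def feature_map_sizes(W, H):
    """Trace the spatial size of the feature map through the five backbone
    stages: 3x3 convolutions preserve size, each 2x2 stride-2 max pooling
    halves it, and stage five has no pooling."""
    sizes = []
    w, h = W, H
    channels = [64, 128, 256, 512, 512]
    for stage, c in enumerate(channels, start=1):
        # convolutions leave (w, h) unchanged
        if stage < 5:            # stages 1-4 end with a 2x2, stride-2 pooling
            w, h = w // 2, h // 2
        sizes.append((w, h, c))
    return sizes
```

For a 640×480 input this gives (320, 240, 64) after stage one and (40, 30, 512) after stage five, i.e. (W/2)×(H/2)×64 and (W/16)×(H/16)×512 as stated above.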
The design of the activation functions, loss functions and related parameters in the above basic network structure is described in detail below:
(P1) In the VGGNet-16 backbone, the ReLU activation function is used after every convolutional layer:

ReLU(x) = max(0, x)   (14)
(P2) In the sixth stage, the classification loss function and the bounding-box regression loss function are designed as follows:
The classification loss L_rpn_cls is expressed as:

L_rpn_cls = (1/N_cls) · Σ_i −[p_i* · log(p_i) + (1 − p_i*) · log(1 − p_i)]   (15)

where p_i denotes the predicted probability that frame i is a logistics vehicle or background; p_i* denotes the label of the real frame corresponding to frame i, recorded as 1 if the real frame is foreground and 0 otherwise; and N_cls is the number of frames considered.
The bounding-box regression loss L_rpn_box is expressed as:

L_rpn_box = (1/N_reg) · Σ_i p_i* · smoothL1(t_i − t_i*)   (16)

where t_i denotes the four-dimensional position of predicted frame i, written t_i = (x_i, y_i, w_i, h_i); t_i* denotes the four-dimensional position of the real frame, written t_i* = (x_i*, y_i*, w_i*, h_i*); and N_reg is the number of regressed frames. The smoothL1 function is expressed as follows:

smoothL1(x) = 0.5·x²,    if |x| < 1
smoothL1(x) = |x| − 0.5, otherwise   (17)
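The regression loss of Eqs. (16)-(17) can be sketched for a single anchor as follows (a minimal reconstruction; `smooth_l1` and `rpn_box_loss` are illustrative names, and the 1/N_reg normalization is omitted since only one anchor is shown):

```python
def smooth_l1(x):
    """Smooth-L1 function of Eq. (17): quadratic near zero, linear in the tails."""
    ax = abs(x)
    return 0.5 * x * x if ax < 1 else ax - 0.5

def rpn_box_loss(pred, target, p_star):
    """Bounding-box regression term of Eq. (16) for one anchor: only anchors
    labelled foreground (p_star == 1) contribute to the loss."""
    return p_star * sum(smooth_l1(t - ts) for t, ts in zip(pred, target))
```

Note the continuity at |x| = 1: both branches of Eq. (17) give 0.5 there, which is the point of the smooth-L1 design.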
(P3) In the seventh stage, the total loss function is designed as follows: the bounding-box regression loss uses a logistic-regression mapping; the classification loss uses gradient descent; during loss minimization the Adam gradient descent method is used to optimize the loss function, with the corresponding parameters set to α = 0.001, β1 = 0.9, β2 = 0.999 and ε = 10^−8.
(P4) During training, the learning rate of the invention is adjusted with a multi-stage decay strategy.
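A sketch of one Adam update with the hyperparameters stated in (P3), together with a multi-stage decay schedule for (P4). The decay boundaries and factor are illustrative assumptions, as the patent does not state them:

```python
import math

def adam_step(theta, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for a scalar parameter, with alpha=0.001, beta1=0.9,
    beta2=0.999 and eps=1e-8 as stated above."""
    m = beta1 * m + (1 - beta1) * grad            # first-moment estimate
    v = beta2 * v + (1 - beta2) * grad * grad     # second-moment estimate
    m_hat = m / (1 - beta1 ** t)                  # bias correction
    v_hat = v / (1 - beta2 ** t)
    theta = theta - lr * m_hat / (math.sqrt(v_hat) + eps)
    return theta, m, v

def multistage_lr(epoch, base_lr=0.001, boundaries=(30, 60), factor=0.1):
    """Multi-stage decay: the rate drops by `factor` at each boundary epoch.
    The boundary epochs here are illustrative, not from the patent."""
    return base_lr * factor ** sum(epoch >= b for b in boundaries)
```

On the very first step (t = 1) with a unit gradient, the bias-corrected update moves the parameter by almost exactly the learning rate, a standard property of Adam.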
Step three, screening the logistics vehicle targets by using a non-maximum suppression algorithm;
After the logistics vehicle image is processed by the basic network model of step two, a large number of bounding boxes are obtained on the same logistics vehicle, so a method must be introduced to screen out the redundant ones. The specific operation flow is as follows:
(Q1) From the four-dimensional position information (x_i, y_i, w_i, h_i) of each predicted bounding box, the area S_i of all predicted frames of each vehicle in the logistics vehicle image can be obtained:

S_i = w_i · h_i   (18)
(Q2) In the sixth stage of the basic network model, the frame information and the classification information (logistics vehicle or background) are determined by regression. Each real logistics vehicle has many corresponding bounding boxes; these are sorted by probability in descending order and the single bounding box with the highest probability is screened out;
(Q3) The area intersection-over-union I of the screened bounding box with each remaining bounding box is computed in a loop; if I is larger than a preset threshold (0.7 by default), the box is considered to overlap the box screened in step (Q2) and is deleted, until all the bounding boxes of step (Q2) have been processed.
Write (x_max, y_max, w_max, h_max) for the screened (maximum-probability) bounding box and (x_oti, y_oti, w_oti, h_oti) for any remaining bounding box. The two boxes intersect if and only if:

|x_max − x_oti| < (w_max + w_oti)/2  and  |y_max − y_oti| < (h_max + h_oti)/2   (19)

in which case the width and height of the overlap region are:

w_ovp = (w_max + w_oti)/2 − |x_max − x_oti|
h_ovp = (h_max + h_oti)/2 − |y_max − y_oti|   (20)

and the overlap area is:

S_ovp = w_ovp · h_ovp   (21)

When condition (19) is satisfied, the intersection-over-union I is calculated as:

I = S_ovp / (S_max + S_oti − S_ovp)   (22)

Otherwise I = 0, meaning the two bounding boxes do not intersect, and both are retained.
In formulas (19)-(22), max denotes the screened maximum bounding box, whose area is denoted S_max; oti denotes any of the remaining bounding boxes, whose area is denoted S_oti; the intersection area between two bounding boxes is denoted S_ovp; (x_max, y_max, w_max, h_max) is the four-dimensional position information of the maximum screened bounding box, i.e. its center coordinates, frame width and frame height; (x_oti, y_oti, w_oti, h_oti) is the four-dimensional position information of any remaining bounding box.
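Steps (Q1)-(Q3) together with Eqs. (18)-(22) amount to greedy non-maximum suppression over center-format boxes. A minimal sketch (function names are illustrative):

```python
def iou(box_a, box_b):
    """Intersection-over-union of Eq. (22) for boxes given as (x, y, w, h)
    with (x, y) the box center."""
    # overlap extent along each axis, Eqs. (19)-(20) in absolute-value form
    w_ovp = (box_a[2] + box_b[2]) / 2 - abs(box_a[0] - box_b[0])
    h_ovp = (box_a[3] + box_b[3]) / 2 - abs(box_a[1] - box_b[1])
    if w_ovp <= 0 or h_ovp <= 0:
        return 0.0                       # disjoint boxes: I = 0
    s_ovp = w_ovp * h_ovp                # Eq. (21)
    s_a, s_b = box_a[2] * box_a[3], box_b[2] * box_b[3]   # areas, Eq. (18)
    return s_ovp / (s_a + s_b - s_ovp)

def nms(boxes, scores, threshold=0.7):
    """Greedy non-maximum suppression (steps Q1-Q3): keep the highest-scoring
    box, delete every remaining box whose IoU with it exceeds the threshold."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        order = [i for i in order if iou(boxes[best], boxes[i]) <= threshold]
    return keep
```

For example, two heavily overlapping boxes on the same vehicle collapse to the higher-probability one, while a distant box on another vehicle survives untouched.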
Step four, carrying out unified normalization on the logistics vehicle target features;
To avoid the mismatch in the dimensions of the subsequent connection layers caused by bounding boxes of different sizes remaining after non-maximum suppression, the invention connects a region-of-interest pooling layer after the loss functions of the sixth stage and uniformly normalizes the differently sized bounding boxes. The specific operation flow is as follows:
(M1) The four-dimensional position information of the bounding boxes on the logistics vehicle image obtained in step three is quantized to integer grid coordinates;
(M2) each quantized bounding box is evenly divided and max-pooled at the 4×4, 2×2 and 1×1 scales, forming a fixed-length data dimension (4×4 + 2×2 + 1×1 = 21 values per channel).
The resulting fixed-dimension feature map is passed to the seventh stage of the basic network model to obtain the precise logistics vehicle positioning bounding box and the probability of the corresponding vehicle type.
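The fixed-length pooling of steps (M1)-(M2) can be sketched for a single channel as follows. This is an SPP-style reconstruction under the assumption that each scale evenly partitions the quantized box; `fixed_length_pool` is a hypothetical name:

```python
import numpy as np

def fixed_length_pool(feat):
    """Pool one channel of an ROI feature map into 4x4 + 2x2 + 1x1 = 21
    max-pooled bins (step M2), giving a fixed-length vector regardless of
    the ROI's spatial size. Bin edges come from the integer quantization
    of an even partition (step M1)."""
    h, w = feat.shape
    out = []
    for n in (4, 2, 1):
        ys = np.linspace(0, h, n + 1).astype(int)   # integer bin boundaries
        xs = np.linspace(0, w, n + 1).astype(int)
        for i in range(n):
            for j in range(n):
                cell = feat[ys[i]:ys[i+1], xs[j]:xs[j+1]]
                out.append(cell.max() if cell.size else 0.0)
    return np.array(out)
```

Whatever the ROI size, the output always has 21 entries per channel, which is what makes the subsequent fully connected layers' dimensions match.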
Preferably, in step 1.1), to reduce the computational cost of the model, the invention sets the scaling factor μ = 0.84 and stops scaling when the short-side pixels are fewer than 100 pix.
Preferably, the area overlap ratio I in step (Q3) has a threshold value of 0.7.
The invention has the advantages that:
the invention provides a method for positioning characteristics of a logistics vehicle based on an improved master R-CNN (factory R-CNN) aiming at the management problem of the logistics engineering vehicle and the problem that the traditional recognition method is difficult to recognize due to uncertainty factors such as environment, scene and appearance. Firstly, carrying out data enhancement on a logistics vehicle image to enable a sample image to increase scene diversity; then, constructing a basic network model by using the improved master R-CNN; then, a non-maximum suppression algorithm is introduced to screen a logistics vehicle target boundary box; and finally, unified normalization is carried out on the object characteristics of the logistics vehicle, so that accurate positioning is realized. Therefore, the characteristic positioning performance of the logistics vehicle is superior to that of the traditional vehicle detection method under the conditions of different environments, scenes and the like, the problem of logistics engineering vehicle management in a park can be well solved, and the logistics vehicle detection method has certain practical value and application prospect.
Drawings
FIG. 1 is a schematic view of logistics vehicle image scaling according to the present invention;
FIG. 2 is a schematic diagram of the image rotation of a logistics vehicle in accordance with the present invention;
FIG. 3 is a diagram of the basic network architecture of the present invention;
FIG. 4 is a comparison of a logistics vehicle image before and after processing with the non-maximum suppression algorithm designed by the invention; FIG. 4a shows the image before non-maximum suppression and FIG. 4b the image after it;
FIG. 5 is a unified normalization flow chart for target features of the present invention;
fig. 6 is a technical roadmap of the invention.
Detailed Description
The technical scheme of the invention is further described below with reference to the accompanying drawings.
To overcome the deficiencies of the prior art, and aiming at the management problem of logistics and engineering vehicles and the difficulty that traditional recognition methods have with uncertain factors such as environment, scene and appearance, the invention provides a logistics vehicle feature localization method based on an improved faster R-CNN. First, data augmentation is applied to the logistics vehicle images to increase the scene diversity of the sample images; then a basic network model is constructed with the improved faster R-CNN; next, a non-maximum suppression algorithm is introduced to screen the logistics vehicle target bounding boxes; finally, the logistics vehicle target features are uniformly normalized to achieve precise localization.
In order to achieve the above purpose, the present invention adopts the following technical scheme:
A logistics vehicle feature localization method based on an improved faster R-CNN, comprising the following steps:
step one, carrying out image enhancement processing on logistics vehicles;
To address the problems of fixed shooting angles, monotonous backgrounds and low detection rates for logistics vehicles, the invention introduces data augmentation: the logistics vehicle images are processed by operations such as multi-scale proportional scaling, image rotation and saturation enhancement, increasing scene diversity for further recognition and localization.
1.1) A multi-scale scaling operation is carried out on the logistics vehicle image;
The logistics vehicle image is scaled at multiple scales without altering the aspect ratio of the logistics vehicle in the original image, so that the localization network can learn target features at specific proportions.
Suppose a pixel of the logistics vehicle image before scaling is denoted A0(x0, y0) and its coordinate after scaling is A1(x1, y1); then A0 and A1 satisfy:

(x1, y1) = (μx0, μy0)   (1)

where μ denotes the scaling factor. In homogeneous coordinates, the above relation corresponds to the image scaling matrix:

[x1]   [μ 0 0][x0]
[y1] = [0 μ 0][y0]   (2)
[1 ]   [0 0 1][1 ]

When μ > 1 the operation enlarges the image; when μ < 1 it reduces it. To reduce the computational cost of the model, the invention sets the scaling factor μ = 0.84 and stops scaling when the short-side pixels are fewer than 100 pix.
1.2) Rotating the logistics vehicle image;
When the camera captures a fast-moving logistics vehicle, the captured images can differ greatly in angle. To adapt recognition and localization to different angles, the captured logistics vehicle images are rotated, generating vehicle feature information at various angles.
The invention takes the center of the logistics vehicle image as the rotation center O(0, 0) and denotes the counterclockwise rotation angle by θ. When an arbitrary pixel P(x, y) in the image becomes P1(x1, y1) after the rotation transform, the rotation is represented by the following equations:

x1 = x·cosθ − y·sinθ
y1 = x·sinθ + y·cosθ   (3)

This polar-coordinate transformation corresponds to the image rotation matrix:

[x1]   [cosθ  −sinθ][x]
[y1] = [sinθ   cosθ][y]   (4)
1.3) Performing a saturation enhancement operation on the logistics vehicle image;
in order to increase the diversity of data samples and enable the feature positioning network to be suitable for complex illumination environments, the saturation of the logistics vehicle image is adjusted.
The specific flow of adjusting the image saturation is as follows:
(S1) Calculate the pixel extrema of the logistics vehicle image:

rgbMax = max(R, G, B), rgbMin = min(R, G, B)   (5)

where rgbMax represents the pixel maximum and rgbMin the pixel minimum over the three color channels.
(S2) Saturation calculation;
the saturation S is calculated as follows:

delta = (rgbMax − rgbMin)/255   (6)
value = (rgbMax + rgbMin)/255   (7)
L = value/2   (8)
S = delta/value,       if L < 0.5
S = delta/(2 − value), otherwise   (9)
(S3) adjusting the logistics vehicle image saturation;
A saturation parameter β is set to adjust the illumination intensity; the calculation flow is as follows:

1. If the parameter β ≥ 0, first obtain the value of the intermediate variable α:

α = S,      if β + S ≥ 1
α = 1 − β,  otherwise   (10)

update the value of α:

α = 1/α − 1   (11)

and adjust the saturation:

RGB' = RGB + (RGB − L·255)·α   (12)

2. If the parameter β < 0, set α = β; then:

RGB' = L·255 + (RGB − L·255)·(1 + α)   (13)
the logistics vehicle image subjected to scaling, rotation operation and saturation enhancement is applied to the following steps so as to accurately position logistics vehicle characteristics.
Step two, constructing a basic network model;
Although faster R-CNN provides three candidate backbone networks for feature extraction, to obtain a better feature extraction effect the VGGNet-16 network is adopted as the feature extraction network for classifying logistics vehicles of different types. Meanwhile, to localize the logistics vehicles, the invention adds an RPN-based target detection and localization model after the feature extraction module at the third convolutional sublayer of the fifth convolutional block of VGGNet-16.
The detailed design flow of the basic network model constructed by the invention is as follows:
(T1) First stage: first, the W × H × 3 image processed in step one is input; the logistics vehicle image is then convolved by two consecutive 64-channel convolutional layers, each with a 3×3 kernel and stride 1; the convolved image is then reduced in dimension by a 64-channel max pooling layer with a 2×2 pooling kernel and stride 2. This stage outputs a feature map of size (W/2) × (H/2) × 64.
(T2) second stage: the flow is the same as the first stage, namely the image obtained in the first stage is input into a second stage network, and then a new characteristic diagram is obtained through convolution and pooling operation. But unlike the first stage, the convolution and pooling channels of the second stage are each changed to 128, and the other parameters are the same as the first stage.
(T3) third stage: firstly, inputting the image output by the second stage into a network of a third stage; then, carrying out convolution operation on the image through three continuous 256-channel convolution layers, wherein the convolution kernel size is 3*3, and the convolution step length is 2; the convolved image is then reduced in dimension by a 256-channel max pooling layer with a pooling kernel size of 2 x 2 and a step size of 2.
(T4) fourth stage: the flow is the same as the third stage, namely the image obtained in the third stage is input into a fourth stage network, and then a new characteristic diagram is obtained through convolution and pooling operation. But unlike the third stage, the convolution and pooling channels of the fourth stage are each changed to 512, and the other parameters are the same as the third stage.
(T5) fifth stage: this stage consists of three convolutional layers, each with 512 channels, a convolution kernel size of 3*3 and a convolution step size of 2. The feature map output at this stage has 512 channels and a correspondingly reduced spatial size.
(T6) sixth stage: firstly, a convolution layer with a convolution kernel size of 3*3, a convolution step length of 2 and 512 convolution channels is connected; then a classification loss function and a border regression loss function are connected, and regression judgment is carried out on the border information and the classification information (the probability of the most likely vehicle type) belonging to logistics vehicles or background;
(T7) seventh stage: firstly, connecting two full connection layers of 4096 channels; then connecting a total loss function; and finally outputting the accurate logistics vehicle positioning boundary frame and the probability of the corresponding vehicle type.
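As an illustration of stages (T1)-(T5), the following sketch traces the spatial size of the feature map through the five convolution stages. Note it assumes the standard VGGNet-16 layer parameters (3*3 convolutions with stride 1 and padding 1, followed by 2 x 2 max pooling with stride 2 after stages one to four), whereas the text above states a convolution step length of 2; the channel counts (64, 128, 256, 512, 512) follow the text, and the function names are illustrative.

```python
# Sketch: trace feature-map sizes through a VGGNet-16-style backbone.
# Stride/padding values are assumptions (standard VGG-16), not the patent's.
def out_size(n, kernel, stride, pad):
    """Output length of a convolution or pooling along one axis."""
    return (n + 2 * pad - kernel) // stride + 1

def vgg16_trace(w, h):
    """Return (width, height, channels) after each of the five conv stages."""
    sizes = []
    stages = [(2, 64), (2, 128), (3, 256), (3, 512), (3, 512)]  # (num convs, channels)
    for i, (n_conv, ch) in enumerate(stages):
        for _ in range(n_conv):
            # 3x3 convolution, stride 1, padding 1: spatial size unchanged
            w, h = out_size(w, 3, 1, 1), out_size(h, 3, 1, 1)
        if i < 4:
            # stages one to four end with 2x2 max pooling, stride 2
            w, h = out_size(w, 2, 2, 0), out_size(h, 2, 2, 0)
        sizes.append((w, h, ch))
    return sizes
```

For a 224 x 224 input this yields 112 x 112 x 64 after the first stage and 14 x 14 x 512 after the fifth, which is the familiar VGG-16 progression.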
In the above basic network structure, the design of the activation functions, the loss functions and the related parameters is described in detail below:
(P1) in VGGNet-16 basic network, regarding the activation function of the connection after all convolution layers, reLu activation functions are used in the invention:
ReLu(x)=max(0,x) (14)
(P2) in the sixth stage, the classification loss function and the rim regression loss function used are designed as follows:
the classification loss L_rpn_cls is expressed as:
L_rpn_cls = (1/N_cls) Σ_i [ -p_i^* log(p_i) - (1-p_i^*) log(1-p_i) ] (15)
wherein p_i represents the probability that border i is a logistics vehicle or background; p_i^* represents the label of the real border corresponding to border i, marked as 1 if the real border is foreground, and 0 otherwise.
the border regression loss L_rpn_box is expressed as:
L_rpn_box = (1/N_box) Σ_i p_i^* · smoothL1(t_i - t_i^*) (16)
wherein t_i represents the four-dimensional position information of predicted border i, denoted t_i = (x_i, y_i, w_i, h_i); t_i^* represents the four-dimensional position information of the corresponding real border, denoted t_i^* = (x_i^*, y_i^*, w_i^*, h_i^*); the smoothL1 function is expressed as follows:
smoothL1(x) = 0.5x^2, if |x| < 1; |x| - 0.5, otherwise (17)
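The border regression loss can be illustrated numerically. The smooth-L1 form below is the standard Faster R-CNN choice (quadratic near zero, linear elsewhere); the function names and the per-box summation over the four coordinates are illustrative assumptions.

```python
# Sketch of the smooth-L1 border regression loss used by the RPN.
def smooth_l1(x):
    """Quadratic for |x| < 1, linear otherwise (standard Faster R-CNN form)."""
    return 0.5 * x * x if abs(x) < 1 else abs(x) - 0.5

def rpn_box_loss(t_pred, t_true, p_star):
    """Sum of smooth-L1 over the 4 box coordinates (x, y, w, h),
    gated by the foreground label p_star (1 = logistics vehicle, 0 = background)."""
    return p_star * sum(smooth_l1(p - t) for p, t in zip(t_pred, t_true))
```

Background anchors (p_star = 0) contribute nothing to the regression loss, which matches the gating by p_i^* in equation (16).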
(P3) in the seventh stage, the overall loss function design principle is: the border regression loss adopts a logistic regression mapping method; the classification loss adopts a gradient descent method; in the process of minimizing the loss function, the Adam gradient descent method is used to optimize the loss function, and the corresponding parameters are set to α=0.001, β1=0.9, β2=0.999 and ε=10^-8.
(P4) during training, the learning rate adjustment strategy of the present invention employs a multi-stage decay method.
Thirdly, screening a logistics vehicle target by using a non-maximum suppression algorithm;
After the logistics vehicle image is processed by the basic network model in step two, many bounding boxes are obtained on the same logistics vehicle, so a method must be introduced to screen out the redundant bounding boxes. The specific operation flow is as follows:
(Q1) according to the four-dimensional position information (x_i, y_i, w_i, h_i) of the predicted bounding boxes, the area S_i of all predicted borders of each vehicle in the logistics vehicle image can be obtained:
S_i = w_i * h_i (18)
(Q2) in the sixth stage of the basic network model, the border information and the classification information belonging to the logistics vehicle or the background are determined through regression. Each real logistics vehicle has multiple corresponding bounding boxes; these are sorted by probability in descending order, and the single bounding box with the highest probability is screened out;
(Q3) the area intersection ratio I between the screened bounding box and each remaining bounding box is calculated in a loop; if I is larger than a preset threshold value (the default threshold is 0.7), the bounding box is considered severely overlapped with the bounding box screened in step (Q2) and is deleted; this continues until all bounding boxes in step (Q2) have been processed.
If and only if the condition 0 ≤ x_oti - x_max < (w_max + w_oti)/2 and 0 ≤ y_oti - y_max < (h_max + h_oti)/2 is satisfied, the overlap area is calculated as follows:
S_ovp = [(w_max + w_oti)/2 - (x_oti - x_max)] * [(h_max + h_oti)/2 - (y_oti - y_max)] (19)
If and only if the condition 0 ≤ x_max - x_oti < (w_max + w_oti)/2 and 0 ≤ y_max - y_oti < (h_max + h_oti)/2 is satisfied, the overlap area is calculated as follows:
S_ovp = [(w_max + w_oti)/2 - (x_max - x_oti)] * [(h_max + h_oti)/2 - (y_max - y_oti)] (20)
The subscripts max and oti in the above formulae (19)-(20) and their constraints can be interchanged. Taking the constraint in formula (19) as an example, it can be changed into the following unified form:
|x_max - x_oti| < (w_max + w_oti)/2 and |y_max - y_oti| < (h_max + h_oti)/2 (21)
If the constraint in formula (19) is changed to the form of formula (21), the subtractions in formula (19) must likewise be changed to absolute values.
When the constraint conditions in formula (19) or formula (20) are satisfied, the calculation formula of the intersection ratio I is:
I = S_ovp / (S_max + S_oti - S_ovp) (22)
Otherwise, I = 0, meaning that the two bounding boxes do not intersect, and both are kept.
In formulae (19)-(22), max denotes the largest screened bounding box, whose area is denoted S_max; oti denotes any of the remaining bounding boxes, whose area is denoted S_oti; the area of the intersection between the two bounding boxes is denoted S_ovp; (x_max, y_max, w_max, h_max) represents the four-dimensional position information of the largest screened bounding box, namely the center coordinates, border width and border height; (x_oti, y_oti, w_oti, h_oti) represents the four-dimensional position information of any of the remaining bounding boxes.
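The screening flow (Q1)-(Q3) can be sketched as follows, for center-format boxes (x, y, w, h) with per-box probabilities. The absolute-value overlap test corresponds to the unified constraint form, and 0.7 is the default threshold mentioned in step (Q3); the function names are illustrative.

```python
# Sketch of non-maximum suppression over center-format boxes (x, y, w, h).
def iou(a, b):
    """Intersection-over-union of two (x, y, w, h) center-format boxes."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    # the boxes overlap iff the center distance along each axis is smaller
    # than half the sum of the corresponding extents (unified constraint form)
    ox = (aw + bw) / 2 - abs(ax - bx)
    oy = (ah + bh) / 2 - abs(ay - by)
    if ox <= 0 or oy <= 0:
        return 0.0                    # no intersection: both boxes are kept
    s_ovp = ox * oy                   # overlap area
    return s_ovp / (aw * ah + bw * bh - s_ovp)   # I = S_ovp / (S_a + S_b - S_ovp)

def nms(boxes, probs, thresh=0.7):
    """Return indices of kept boxes, highest probability first."""
    order = sorted(range(len(boxes)), key=lambda i: probs[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)           # (Q2) highest-probability box for this vehicle
        keep.append(best)
        # (Q3) delete remaining boxes that overlap the kept box too heavily
        order = [i for i in order if iou(boxes[best], boxes[i]) <= thresh]
    return keep
```

Two boxes centered far apart give I = 0 and both survive; a duplicate of the kept box gives I = 1 and is deleted.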
Step four, unified normalization is carried out on the object characteristics of the logistics vehicles;
In order to solve the problem of mismatched dimensions in subsequent connection layers caused by the differing edge features of the bounding boxes after non-maximum suppression, the invention connects a region-of-interest pooling layer after the loss functions of the sixth stage and performs unified normalization on the bounding boxes with different edge features. The specific operation flow is as follows:
(M1) quantizing four-dimensional position information of a boundary box on the logistics vehicle image obtained in the step three into integer array coordinates;
(M2) dividing the quantized bounding box evenly into 4*4, 2 x 2 and 1*1 grids and applying max pooling to each cell, forming a fixed-length data dimension.
And (3) transmitting the obtained feature map of the fixed dimension data to a seventh stage of the basic network model to obtain the accurate logistics vehicle positioning boundary frame and the probability of the corresponding vehicle type.
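Steps (M1)-(M2) can be sketched as follows. The integer cell-boundary scheme is an illustrative assumption (the patent does not specify how the cells are split); the fixed grids 4*4, 2 x 2 and 1*1 yield 16 + 4 + 1 = 21 pooled values per channel regardless of box size.

```python
# Sketch of step four: quantize an RoI to integer coordinates (M1), then
# max-pool it on fixed 4x4, 2x2 and 1x1 grids (M2) to a fixed-length vector.
def roi_fixed_pool(feat, x0, y0, x1, y1):
    """feat: 2D list (H x W); (x0, y0, x1, y1): RoI corners, possibly float.
    Returns 21 max-pooled values (16 + 4 + 1) for this channel."""
    x0, y0, x1, y1 = int(x0), int(y0), int(x1), int(y1)   # (M1) quantization
    out = []
    for g in (4, 2, 1):                                   # (M2) fixed pooling grids
        for gy in range(g):
            for gx in range(g):
                # split the RoI into g x g near-equal cells (integer boundaries)
                ys = y0 + (y1 - y0) * gy // g, y0 + (y1 - y0) * (gy + 1) // g
                xs = x0 + (x1 - x0) * gx // g, x0 + (x1 - x0) * (gx + 1) // g
                cell = [feat[y][x]
                        for y in range(ys[0], max(ys[1], ys[0] + 1))
                        for x in range(xs[0], max(xs[1], xs[0] + 1))]
                out.append(max(cell))                     # max pooling per cell
    return out
```

Because the output length is always 21 values per channel, boxes of any size feed a fully connected layer of fixed dimension, which is exactly the mismatch problem step four addresses.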
The embodiments described in the present specification are merely examples of implementation forms of the inventive concept; the scope of protection of the present invention should not be construed as being limited to the specific forms set forth in the embodiments, and also extends to equivalent technical means that can be conceived by those skilled in the art based on the inventive concept.
Claims (3)
1. A method for logistics vehicle feature localization based on improved Faster R-CNN, comprising the steps of:
step one, carrying out image enhancement processing on logistics vehicles;
introducing a data enhancement means, and processing the logistics vehicle image through operations of multi-scale equal-scale scaling, image rotation and saturation enhancement, so as to increase scene diversity of the logistics vehicle image for further identification and positioning;
1.1 Multi-scale scaling operation is carried out on the logistics vehicle;
the method has the advantages that the principle that the specific length-width ratio of the logistics vehicle in the original image is not destroyed is adopted, the logistics vehicle image is scaled in multiple scales, and the positioning network can learn target features in specific proportions;
suppose the coordinate of a pixel of a certain logistics vehicle before scaling is denoted A_0(x_0, y_0), and the scaled coordinate is denoted A_1(x_1, y_1); then A_0 and A_1 satisfy the relation:
(x_1, y_1) = (μx_0, μy_0) (1)
wherein μ represents a scaling factor; the above equation corresponds to the image scaling matrix, expressed as the following matrix:
[x_1; y_1] = [μ, 0; 0, μ] [x_0; y_0] (2)
wherein when μ > 1, an image enlarging operation is represented; when μ < 1, an image reduction operation is represented;
1.2 Rotating the logistics vehicle image;
when a camera shoots a logistics vehicle in quick running, the phenomenon that the angle difference of the captured images is extremely large is caused, and in order to adapt to the identification and positioning of different angles, the captured logistics vehicle images are required to be subjected to rotary transformation so as to generate vehicle characteristic information of various angles;
the center of the logistics vehicle image is set as the rotation center O(0, 0), the anticlockwise rotation angle is recorded as θ, and any pixel point P(x, y) in the image becomes P_1(x_1, y_1) after the rotation transformation; the rotation process is represented by the following equation:
x_1 = x cosθ - y sinθ, y_1 = x sinθ + y cosθ (3)
the above formula is obtained through the polar coordinate transformation; it corresponds to the image rotation matrix, expressed as the following matrix:
[x_1; y_1] = [cosθ, -sinθ; sinθ, cosθ] [x; y] (4)
1.3 Performing saturation enhancement operation on the logistics vehicle image;
in order to increase the diversity of data samples, the characteristic positioning network can be suitable for a complex illumination environment, and the saturation of the logistics vehicle image is adjusted;
the specific flow of adjusting the image saturation is as follows:
(S1) calculating the pixel extrema on the logistics vehicle image:
rgbMax = max(R, G, B), rgbMin = min(R, G, B) (5)
wherein rgbMax represents the maximum value of the pixel, and rgbMin represents the minimum value of the pixel;
(S2) saturation calculation;
the saturation S is calculated as follows:
delta=(rgbMax-rgbMin)/255 (6)
value=(rgbMax+rgbMin)/255 (7)
L=value/2 (8)
S=delta/value, if L<0.5; S=delta/(2-value), otherwise (9)
(S3) adjusting the logistics vehicle image saturation;
a saturation parameter beta is set for adjusting the illumination intensity, and the calculation flow is as follows:
1. if the parameter β is more than or equal to 0, firstly, the value of an intermediate variable α is obtained:
α=S, if β+S≥1; α=1-β, otherwise (10)
then the value of α is updated:
α=1/α-1 (11)
and the saturation is adjusted:
RGB'=RGB+(RGB-L*255)*α (12)
2. if the parameter β<0, then:
RGB'=L*255+(RGB-L*255)*(1+β) (13)
the logistics vehicle image subjected to scaling treatment, rotation operation and saturation enhancement is applied to the following steps so as to accurately position the logistics vehicle characteristics;
step two, constructing a basic network model;
the VGGNet-16 basic network is used as the feature extraction network for classifying logistics vehicles of different vehicle types; meanwhile, in order to realize the positioning of the logistics vehicles, a target detection and positioning model of the RPN network is added after the feature extraction module at the third convolution sublayer of the fifth convolution layer of VGGNet-16;
the steps of constructing the basic network model are as follows:
(T1) first stage: firstly, an image of size W*H*3 processed in step one is input; then a convolution operation is carried out on the logistics vehicle image through two continuous 64-channel convolution layers, with a convolution kernel size of 3*3 and a convolution step length of 2; then, the convolved image is reduced in dimension through a 64-channel max pooling layer, with a pooling kernel size of 2 x 2 and a step length of 2; at this stage, a 64-channel feature map of reduced spatial size is output;
(T2) second stage: the flow is the same as the first stage, namely, the image obtained in the first stage is input into a second stage network, and then a new characteristic diagram is obtained through convolution and pooling operation; but the difference from the first stage is that the convolution and pooling channels of the second stage are both 128, and other parameters are the same as those of the first stage;
(T3) third stage: firstly, inputting the image output by the second stage into a network of a third stage; then, carrying out convolution operation on the image through three continuous 256-channel convolution layers, wherein the convolution kernel size is 3*3, and the convolution step length is 2; then, the convolved image is subjected to dimension reduction through a maximum pooling layer of 256 channels, wherein the pooling kernel size is 2 x 2, and the step length is 2;
(T4) fourth stage: the flow is the same as the third stage, namely, the image obtained in the third stage is input into a fourth stage network, and then a new feature map is obtained through convolution and pooling operation; but the difference from the third stage is that the convolution and pooling channels of the fourth stage are changed to 512, and other parameters are the same as those of the third stage;
(T5) fifth stage: this stage consists of three convolution layers, each with 512 channels, a convolution kernel size of 3*3 and a convolution step length of 2; the feature map output at this stage has 512 channels and a correspondingly reduced spatial size;
(T6) sixth stage: firstly, a convolution layer with a convolution kernel size of 3*3, a convolution step length of 2 and 512 convolution channels is connected; then a classification loss function and a border regression loss function are connected, and regression judgment is carried out on the border information and the classification information belonging to the logistics vehicle or the background, wherein the classification information is the probability that the logistics vehicle shown in the image is most likely of a certain vehicle type;
(T7) seventh stage: firstly, connecting two full connection layers of 4096 channels; then connecting a total loss function; finally, outputting an accurate logistics vehicle positioning boundary frame and the probability of the corresponding vehicle type;
in the above basic network structure, the design of parameters related to the activation function and the loss function specifically includes:
(P1) in VGGNet-16 base network, the ReLu activation function is used for all the activation functions of the convolutional layer post-connection:
ReLu(x)=max(0,x) (14)
(P2) in the sixth stage, using the classification loss function and the bounding box regression loss function:
the classification loss L_rpn_cls is expressed as:
L_rpn_cls = (1/N_cls) Σ_i [ -p_i^* log(p_i) - (1-p_i^*) log(1-p_i) ] (15)
wherein p_i represents the probability that border i is a logistics vehicle or background; p_i^* represents the label of the real border corresponding to border i, marked as 1 if the real border is foreground, and 0 otherwise;
the border regression loss L_rpn_box is expressed as:
L_rpn_box = (1/N_box) Σ_i p_i^* · smoothL1(t_i - t_i^*) (16)
wherein t_i represents the four-dimensional position information of predicted border i, denoted t_i = (x_i, y_i, w_i, h_i); t_i^* represents the four-dimensional position information of the corresponding real border, denoted t_i^* = (x_i^*, y_i^*, w_i^*, h_i^*); the smoothL1 function is expressed as follows:
smoothL1(x) = 0.5x^2, if |x| < 1; |x| - 0.5, otherwise (17)
(P3) in the seventh stage, the overall loss function design principle is: the border regression loss adopts a logistic regression mapping method; the classification loss adopts a gradient descent method; in the process of minimizing the loss function, the Adam gradient descent method is used to optimize the loss function, and the corresponding parameters are set to α=0.001, β1=0.9, β2=0.999 and ε=10^-8;
(P4) in the training process, the learning rate adjustment strategy adopts a multi-stage attenuation method;
thirdly, screening a logistics vehicle target by using a non-maximum suppression algorithm;
after the logistics vehicle image is processed by the basic network model in step two, many bounding boxes are obtained on the same logistics vehicle, so a method must be introduced to screen out the redundant bounding boxes; the specific operation flow is as follows:
(Q1) according to the four-dimensional position information (x_i, y_i, w_i, h_i) of the predicted bounding boxes, the area S_i of all predicted borders of each vehicle in the logistics vehicle image can be obtained:
S_i = w_i * h_i (18)
(Q2) in the sixth stage of the basic network model, the border information and the classification information belonging to the logistics vehicle or the background are determined through regression; each real logistics vehicle has a plurality of corresponding bounding boxes, which are sorted by probability in descending order, and the single bounding box with the highest probability is screened out;
(Q3) the area intersection ratio I between the screened bounding box and each remaining bounding box is calculated in a loop; if I is larger than a preset threshold value, the bounding box is considered severely overlapped with the bounding box screened in step (Q2) and is deleted; this continues until all bounding boxes in step (Q2) have been processed;
if and only if the condition 0 ≤ x_oti - x_max < (w_max + w_oti)/2 and 0 ≤ y_oti - y_max < (h_max + h_oti)/2 is satisfied, the overlap area is calculated as follows:
S_ovp = [(w_max + w_oti)/2 - (x_oti - x_max)] * [(h_max + h_oti)/2 - (y_oti - y_max)] (19)
if and only if the condition 0 ≤ x_max - x_oti < (w_max + w_oti)/2 and 0 ≤ y_max - y_oti < (h_max + h_oti)/2 is satisfied, the overlap area is calculated as follows:
S_ovp = [(w_max + w_oti)/2 - (x_max - x_oti)] * [(h_max + h_oti)/2 - (y_max - y_oti)] (20)
the subscripts max and oti in the above formulae (19)-(20) and their constraints can be interchanged; taking the constraint in formula (19) as an example, it can be changed into the following unified form:
|x_max - x_oti| < (w_max + w_oti)/2 and |y_max - y_oti| < (h_max + h_oti)/2 (21)
if the constraint in formula (19) is changed to the form of formula (21), the subtractions in formula (19) must likewise be changed to absolute values;
when the constraint conditions in formula (19) or formula (20) are satisfied, the calculation formula of the intersection ratio I is:
I = S_ovp / (S_max + S_oti - S_ovp) (22)
otherwise, I = 0, meaning that the two bounding boxes do not intersect, and both are kept;
in formulae (19)-(22), max denotes the largest screened bounding box, whose area is denoted S_max; oti denotes any of the remaining bounding boxes, whose area is denoted S_oti; the area of the intersection between the two bounding boxes is denoted S_ovp;
(x_max, y_max, w_max, h_max) represents the four-dimensional position information of the largest screened bounding box, namely the center coordinates, border width and border height; (x_oti, y_oti, w_oti, h_oti) represents the four-dimensional position information of any of the remaining bounding boxes;
step four, unified normalization is carried out on the object characteristics of the logistics vehicles;
in order to solve the problem of unmatched dimensions of subsequent connecting layers caused by different edge characteristics of the boundary frames after non-maximum suppression, after a loss function in a sixth stage, a region-of-interest pooling layer is connected, and unified normalization is carried out on the boundary frames with different edge characteristics; the specific operation flow is as follows:
(M1) quantizing four-dimensional position information of a boundary box on the logistics vehicle image obtained in the step three into integer array coordinates;
(M2) dividing the quantized bounding box evenly into 4*4, 2 x 2 and 1*1 grids and applying max pooling to each cell, forming a fixed-length data dimension;
and (3) transmitting the obtained feature map of the fixed dimension data to a seventh stage of the basic network model to obtain the accurate logistics vehicle positioning boundary frame and the probability of the corresponding vehicle type.
2. The method for logistics vehicle feature localization based on improved Faster R-CNN according to claim 1, wherein: the threshold of the area intersection ratio I in step (Q3) is 0.7.
3. The method for logistics vehicle feature localization based on improved Faster R-CNN according to claim 1, wherein: the scaling factor is set to μ=0.84 in step 1.1), and scaling stops when the short-side pixel count is less than 100 pix.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010690178.9A CN111986080B (en) | 2020-07-17 | 2020-07-17 | Logistics vehicle feature positioning method based on improved Faster R-CNN
Publications (2)
Publication Number | Publication Date |
---|---|
CN111986080A CN111986080A (en) | 2020-11-24 |
CN111986080B true CN111986080B (en) | 2024-01-16 |
Family
ID=73438739
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010690178.9A Active CN111986080B (en) | 2020-07-17 | Logistics vehicle feature positioning method based on improved Faster R-CNN | 2020-07-17
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111986080B (en) |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107832794A (en) * | 2017-11-09 | 2018-03-23 | 车智互联(北京)科技有限公司 | A kind of convolutional neural networks generation method, the recognition methods of car system and computing device |
CN109034024A (en) * | 2018-07-16 | 2018-12-18 | 浙江工业大学 | Logistics vehicles vehicle classification recognition methods based on image object detection |
CN110175524A (en) * | 2019-04-26 | 2019-08-27 | 南京航空航天大学 | A kind of quick vehicle checking method of accurately taking photo by plane based on lightweight depth convolutional network |
Legal Events
Date | Code | Title | Description
---|---|---|---
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||