CN113989604A

CN113989604A - Tire DOT information identification method based on end-to-end deep learning

Info

Publication number: CN113989604A
Application number: CN202111370406.5A
Authority: CN
Inventors: 蔡念; 李嘉豪; 何兆泉; 罗智浩; 王晗
Original assignee: Guangdong University of Technology
Current assignee: Guangdong University of Technology
Priority date: 2021-11-18
Filing date: 2021-11-18
Publication date: 2022-01-28
Anticipated expiration: 2041-11-18
Also published as: CN113989604B

Abstract

The invention discloses a tire DOT information identification method based on end-to-end deep learning, comprising the following steps: performing feature extraction on the tire image to obtain the first feature maps output by N stages, and performing feature fusion on the first feature maps output by the N stages to obtain a second feature map; performing rough positioning of the DOT information on the fused second feature map, detecting whether the three characters "DOT" are present and, if so, their position information, to obtain a region map; generating a mask map from the region map, multiplying the mask map by the second feature map, and performing fine positioning of the DOT information on the resulting third feature map to obtain the DOT information text probability and position information, thereby locating candidate text blocks with an angle; performing tire bending direction detection on the first feature map output by the last stage to obtain the character direction information of the tire tread; applying an affine transformation to the candidate text blocks, using the character direction information of the tire tread, to convert them into horizontal, upward-facing text blocks; and inputting the horizontal text blocks into a deep-learning-based text recognition network for DOT character recognition to obtain the final recognition information.

Description

Tire DOT information identification method based on end-to-end deep learning

Technical Field

The invention relates to the technical field of image recognition, in particular to a tire DOT information recognition method based on end-to-end deep learning.

Background

Tire DOT information carries the product information of the tire manufacturer and is very important to the manufacturer. From the DOT information of a recovered tire, the factory needs to determine details such as the tire's place of origin, factory code, and date of manufacture. In the automobile manufacturing industry, each tire-related process needs to read and match the tread information, and if the identified information does not match the actual tire, the consequences can be severe. Manual inspection is slow and labor-intensive, and the visual fatigue caused by long working hours reduces accuracy. An automatic system for detecting tire DOT information is therefore needed.

The existing methods for locating and identifying the DOT information of the tire mostly adopt the traditional image processing method to detect the target, as follows:

(1) Traditional image processing methods. When characters are located by template matching, positioning errors arise because mold wear leaves flash around the characters that covers the character regions, because the template images are mismatched, and so on. If the spacing between embossed characters is small, the troughs in the projection curve are not distinct enough, so the characters cannot be segmented well and recognition accuracy suffers. Least squares support vector machine (LSSVM) training is based on single characters; because tire DOT information contains many character types, individual characters must be segmented manually to build the data set, which makes training tedious and labor-intensive. Poorly segmented character boundaries degrade the trained model and, in turn, the recognition result.

Moreover, such traditional image processing methods involve image preprocessing, character positioning, character segmentation, and character recognition, each of which requires manually set thresholds. Because of factors such as the shooting environment and the tire material, these thresholds do not generalize well, which further reduces the accuracy of tire character detection and recognition.

(2) Methods combining deep learning with traditional image processing. Concentric circles are extracted by the Hough circle transform and unwrapped into rectangles, that is, the tire surface is unwrapped into a rectangular tread image. Target detection of the DOT characters on the tire is then performed with the deep learning method Faster R-CNN. Here, only the position of the tire DOT information is detected before recognition: the tire text is first detected by a positioning network and then recognized by a text recognition network, and the two networks are trained independently. Such a stepwise approach involves many cumbersome steps and accumulates errors, resulting in poor tire DOT information identification performance.

Disclosure of Invention

The invention provides a tire DOT information identification method based on end-to-end deep learning, aiming at solving the problem of low identification accuracy rate in the prior art, and the identification accuracy rate of the tire DOT information can be effectively improved.

In order to achieve the purpose of the invention, the technical scheme is as follows:

A tire DOT information identification method based on end-to-end deep learning comprises the following steps:

S1: performing feature extraction on the acquired tire image containing the tire DOT information to obtain the first feature maps output by N stages, and performing feature fusion on the first feature maps output by the N stages to obtain a second feature map;

S2: performing rough positioning of the DOT information on the fused second feature map, detecting whether the three characters "DOT" are present and, if so, their position information, to obtain a region map;

S3: generating a mask map from the region map, multiplying the mask map by the second feature map, and performing fine positioning of the DOT information on the resulting third feature map to obtain the DOT information text probability and position information, thereby locating candidate text blocks with an angle;

S4: performing tire bending direction detection on the first feature map output by the last stage to obtain the character direction information of the tire tread;

S5: applying an affine transformation to the candidate text blocks obtained in step S3, using the character direction information of the tire tread obtained in step S4, to convert them into horizontal, upward-facing text blocks;

S6: inputting the horizontal text blocks into a deep-learning-based text recognition network for DOT character recognition to obtain the final recognition information.

Preferably, in step S1, a feature extraction network is used to perform feature extraction, where the feature extraction network comprises a ResNet-50 network and a feature pyramid network (FPN);

the ResNet-50 network first extracts features from the acquired tire image containing the tire DOT information to obtain the first feature maps C1, C2, C3 and C4 output by 4 stages, whose resolutions are 1/4, 1/8, 1/16 and 1/32 of the input tire image, respectively;

the first feature maps C1, C2, C3 and C4 are input into the feature pyramid network FPN for feature fusion, which connects low-level feature maps with high-level semantic feature maps, and the second feature maps P1, P2, P3 and P4 are output, respectively.

Further, the rough positioning of the DOT information is as follows:

S201: inputting the fused second feature map P1 into a spatial attention module to obtain an output feature map A1;

S202: performing prediction through a region proposal network (RPN), applying softmax classification and position regression to the feature map A1 to obtain a region map containing the three characters "DOT" and their position information.

Still further, the detailed steps of the fine positioning of the DOT information are as follows:

S301: establishing a text detection branch on the third feature map, wherein the size of the third feature map is 1/4 of the size of the acquired tire image;

S302: the third feature map consists of six channels: the first channel gives the probability that each pixel is a positive sample, the middle four channels give the distances from each positive-sample pixel to the top, right, bottom and left boundaries of the text box, and the last channel predicts the orientation of the corresponding bounding box;

S303: thereby generating the DOT information text probability and position information and locating the candidate text blocks of the tire image.

Still further, the detection of the tire bending direction is specifically as follows: the last-stage feature map C4 output by the ResNet-50 network is taken and classified through two fully connected layers and a classification layer, predicting a 4-dimensional array that represents the probabilities of the tire bending in each of four directions (up, down, left and right), from which the character direction information of the tire tread is obtained.

Still further, in step S5, the affine transformation is specifically as follows:

inputting the obtained character direction information of the tire tread and the obtained candidate text block with an angle into an ROI Rotate module as affine transformation parameters, and applying the affine transformation to the text block to convert it into a horizontal, upward-facing text block; the affine transformation is carried out in two steps:

(1) calculating the affine transformation parameters from the predicted coordinates of the text candidates obtained in step S3 and the character direction information of the tire tread obtained in step S4;

(2) applying the affine transformation to the shared feature map for each region separately, obtaining a canonical horizontal feature map of the text region.
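The two steps above can be sketched in a few lines. This is only an illustrative sketch, not the ROI Rotate implementation of the invention: the function names, the parameter `coarse_deg` (a hypothetical extra rotation derived from the tire direction class) and the convention that the angle is measured from the image x-axis to the text baseline are all assumptions, and scaling to a fixed output height is omitted.

```python
import math

def roi_rotate_matrix(cx, cy, theta, out_w, out_h, coarse_deg=0.0):
    """Step (1): build a 2x3 affine matrix for one candidate text block.

    (cx, cy) is the block center, theta its angle in radians (image x-axis to
    text baseline, y pointing down). coarse_deg is a hypothetical correction
    taken from the tire direction class, e.g. 180 for upside-down text.
    """
    angle = theta + math.radians(coarse_deg)
    c, s = math.cos(angle), math.sin(angle)
    # Rotate by -angle about the block center, then move the center
    # to the middle of the horizontal output crop.
    tx = out_w / 2.0 - (c * cx + s * cy)
    ty = out_h / 2.0 - (-s * cx + c * cy)
    return [[c, s, tx], [-s, c, ty]]

def apply_affine(m, x, y):
    """Step (2), for a single point: map (x, y) into crop coordinates."""
    return (m[0][0] * x + m[0][1] * y + m[0][2],
            m[1][0] * x + m[1][1] * y + m[1][2])

# A block whose text runs straight down the image (theta = 90 degrees).
m = roi_rotate_matrix(cx=50.0, cy=40.0, theta=math.pi / 2, out_w=32, out_h=8)
center = apply_affine(m, 50.0, 40.0)       # lands at the crop center
along_text = apply_affine(m, 50.0, 50.0)   # 10 px along the text direction
```

In this example the point 10 pixels further along the text direction ends up 10 pixels to the right of the crop center, which is the "horizontal, upward-facing" layout the recognition stage expects.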

Furthermore, in order to reduce the influence of local loss of each stage on convergence, a total loss function is adopted for training to ensure effective convergence; wherein the total loss function is defined as follows:

$$L_{total} = \lambda_1 L_{dot} + \lambda_2 L_{detect} + \lambda_3 L_{cls} + \lambda_4 L_{rg}$$

in the formula, $L_{dot}$ is the loss function of the DOT information rough positioning stage; $L_{detect}$ is the loss function of the DOT information fine positioning stage; $L_{cls}$ is the loss function of the tire bending direction detection stage; $L_{rg}$ is the loss function of the DOT character recognition stage; $\lambda_1, \lambda_2, \lambda_3, \lambda_4$ are the corresponding trade-off factors, representing the contribution of each of the four losses to the total loss function.

Still further, the loss function $L_{dot}$ of the DOT information rough positioning stage consists of a classification loss function and a position regression loss function, and the formula is as follows:

$$L_{dot} = \frac{1}{N_s} \sum_i L_s(p_i, p_i^*) + \frac{1}{N_g} \sum_i p_i^* \, L_g(t_i, t_i^*)$$

in the formula, $i$ is the anchor box index; $p_i$ is the predicted probability that anchor $i$ is a positive sample; $p_i^*$ is the corresponding ground-truth probability; $t_i$ is the predicted candidate box; $t_i^*$ is the ground-truth box corresponding to a positive anchor; $N_s$ and $N_g$ are the numbers of samples of the corresponding tasks; $L_s$ is the classification loss function; $L_g$ is the position regression loss function.

Still further, the loss function $L_{detect}$ of the DOT information fine positioning stage is:

$$L_{detect} = L_{dcls} + L_{dreg}$$

$$L_{dcls} = 1 - \frac{2\,|X \cap Y|}{|X| + |Y|}$$

wherein $|X \cap Y|$ is the size of the intersection of the sets $X$ and $Y$, and $|X|$ and $|Y|$ are their numbers of elements; for the segmentation task, $X$ and $Y$ are the ground-truth mask and the predicted mask, respectively; $L_{dcls}$ is the classification loss;

$L_{dreg}$ is the total coordinate regression loss, the sum of the IoU loss and the cosine angle difference loss:

$$L_{dreg} = L_{iou} + L_{\theta}$$

IoU loss:

$$L_{iou} = -\log \frac{|\hat{R} \cap R^*|}{|\hat{R} \cup R^*|}$$

wherein $\hat{R}$ is the predicted geometry and $R^*$ is its corresponding label box;

cosine angle difference loss:

$$L_{\theta} = 1 - \cos(\hat{\theta} - \theta^*)$$

wherein $\hat{\theta}$ is the predicted rotation angle and $\theta^*$ is the annotated value.

Still further, the loss function $L_{cls}$ of the tire bending direction detection stage is as follows:

$$S_j = \frac{e^{a_j}}{\sum_{k=1}^{T} e^{a_k}}, \qquad L_{cls} = -\sum_{j=1}^{T} y_j \log S_j$$

in the formula, $a_j$ is the j-th value of the input vector $a$; $a_k$ is the k-th value of the input vector; $y_j$ is the true label; $T$ is the number of categories; $S_j$ is the j-th value of the vector $S$, indicating the probability that the sample belongs to the j-th class;

the loss function $L_{rg}$ of the DOT character recognition stage is as follows:

$$L_{rg} = -\sum_{(l,\, y) \in \psi} \log p(l \mid y)$$

where $\psi$ is the set of ground-truth sequences; $y$ is the estimated sequence and $l$ is the ground-truth label sequence.

The invention has the following beneficial effects:

1. The invention exploits the fact that tire DOT information starts with the characters "DOT": the DOT characters are found first, and the DOT information is thereby roughly positioned. A mask map of the rough DOT information position is then generated and multiplied by the second feature map output by the feature extraction network, and the product is used as the input of the DOT fine positioning. This eliminates interference from characters outside the DOT information region, extracts the region of interest, and further improves detection accuracy.

2. The invention provides a tire DOT information identification method based on end-to-end deep learning whose framework seamlessly combines a feature extraction network module, a DOT information rough positioning module, a DOT information fine positioning module, a tire bending direction detection module, an ROI Rotate module and a DOT character recognition module to complete the different tasks. The parts of the framework are not independent of one another: they are trained jointly through a total loss function, which avoids error accumulation.

3. Compared with the traditional text positioning algorithm, the method adopts the feature extraction network to detect the DOT information text, can extract abundant features, and improves the text detection accuracy. In the traditional text recognition method, the characters need to be segmented firstly and then recognized, and when the character spacing is small, the segmentation effect is not good, and recognition is influenced. And a text recognition network based on deep learning is introduced, so that even if the intervals among the DOT information embossed characters of the tire are very small and even the embossed characters are connected, the DOT information embossed characters can be accurately recognized.

Drawings

Fig. 1 is a schematic block diagram of a tire DOT information identification method based on end-to-end deep learning shown in embodiment 1.

Fig. 2 is a tire image of the tire DOT information acquired in example 1.

Detailed Description

The invention is described in detail below with reference to the drawings and the detailed description.

Example 1

As shown in fig. 1, a method for identifying DOT information of a tire based on end-to-end deep learning includes the following steps:

S1: The tire image containing the tire DOT information, acquired by the image acquisition hardware system, is shown in fig. 2. The camera of the system captures one part of the tire at a time at high resolution, so that the characters on the tire are clearly visible in the image. As can be seen from fig. 2, the tire curves in different directions in different images, and this bending direction can be used to rotate the text to an upright orientation. Each group of DOT information begins with the three characters 'DOT', so the presence of these characters can be used to judge whether the image contains complete DOT information.

In the embodiment, the tire image acquired with the tire DOT information is subjected to feature extraction to respectively obtain the first feature maps output in the N stages, and meanwhile, the first feature maps output in the N stages are subjected to feature fusion to obtain the second feature map.

In a specific embodiment, in step S1, a feature extraction network is used to perform feature extraction, where the feature extraction network comprises a residual network ResNet-50 and a feature pyramid network (FPN). The feature extraction network is a convolutional neural network.

The residual network ResNet-50 first extracts features from the acquired tire image containing the tire DOT information; as shown in fig. 1, this yields the first feature maps C1, C2, C3 and C4 output by 4 stages, with resolutions of 1/4, 1/8, 1/16 and 1/32 of the input tire image, respectively.

Still further, the first feature maps C1, C2, C3 and C4 are input into the feature pyramid network FPN for feature fusion, which connects low-level feature maps with high-level semantic feature maps; the second feature maps P1, P2, P3 and P4 are output, with resolutions of 1/4, 1/8, 1/16 and 1/32 of the tire image, respectively.

In the feature extraction part, the outputs of the last 4 stages of the residual network ResNet-50 serve as the input to the feature fusion stage.

In the feature fusion part, low-level features have higher resolution and contain more position and detail information, but because they have passed through fewer convolutions they carry less semantic information and more noise. High-level features carry stronger semantic information but have very low resolution and perceive details poorly. Fusing the low-level and high-level feature maps combines the advantages of each level and thereby improves detection and segmentation performance.
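As a rough illustration of the stage resolutions and of top-down fusion, the following sketch works on single-channel maps stored as nested lists. It is a simplification under stated assumptions: a real FPN also applies 1x1 lateral convolutions and 3x3 smoothing convolutions, which are omitted here, and the function names are hypothetical.

```python
def stage_sizes(height, width, strides=(4, 8, 16, 32)):
    """Spatial sizes of C1..C4 (and P1..P4) for a given input image size."""
    return [(height // s, width // s) for s in strides]

def upsample2x(fmap):
    """Nearest-neighbor upsampling of a 2D map by a factor of 2."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out

def add_maps(a, b):
    """Element-wise sum of two 2D maps of equal size."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def fpn_top_down(c_maps):
    """c_maps = [C1, C2, C3, C4], each half the size of the previous one.

    Returns [P1, P2, P3, P4]: the coarsest map is passed through, and every
    finer level adds the upsampled coarser result to its own features.
    """
    p_maps = [None] * len(c_maps)
    p_maps[-1] = c_maps[-1]
    for i in range(len(c_maps) - 2, -1, -1):
        p_maps[i] = add_maps(c_maps[i], upsample2x(p_maps[i + 1]))
    return p_maps

def ones(n):
    return [[1.0] * n for _ in range(n)]

sizes = stage_sizes(512, 512)
p1, p2, p3, p4 = fpn_top_down([ones(8), ones(4), ones(2), ones(1)])
```

With all-ones inputs, P1 accumulates contributions from all four levels, which makes the flow of high-level information into the high-resolution map easy to see.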

S2: Rough positioning of the DOT information is performed on the fused second feature map, detecting whether the three characters "DOT" are present and, if so, their position information, to obtain a region map.

In a specific embodiment, the rough positioning of the DOT information is as follows:

S201: the fused second feature map P1 is input into a spatial attention module to obtain an output feature map A1; the spatial attention module is added so that the network focuses more on text features.

S202: prediction is performed by a region proposal network (RPN): softmax classification and position regression are applied to the feature map A1 to obtain a region map containing the three characters "DOT" and their position information. Once the position of the three characters is obtained, the position of the DOT information region is preliminarily located.

S3: A mask map is generated from the region map and multiplied by the second feature map, and fine positioning of the DOT information is performed on the resulting third feature map to obtain the DOT information text probability and position information, thereby locating candidate text blocks with an angle.
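The masking in step S3 amounts to an element-wise product between a binary map and every channel of the feature map. The sketch below illustrates that product only. How the region map is turned into a mask is not detailed in the text, so the axis-aligned rectangles, the function names and the end-exclusive box convention are assumptions.

```python
def boxes_to_mask(height, width, boxes):
    """Binary mask that is 1.0 inside any box and 0.0 elsewhere.

    boxes: (x0, y0, x1, y1) in feature-map pixels, end-exclusive (assumed).
    """
    mask = [[0.0] * width for _ in range(height)]
    for x0, y0, x1, y1 in boxes:
        for y in range(max(0, y0), min(height, y1)):
            for x in range(max(0, x0), min(width, x1)):
                mask[y][x] = 1.0
    return mask

def apply_mask(feature, mask):
    """Multiply a [channels][height][width] feature map by a 2D mask."""
    return [
        [[v * m for v, m in zip(f_row, m_row)] for f_row, m_row in zip(chan, mask)]
        for chan in feature
    ]

# One-channel 2x3 "second feature map" and a rough DOT region covering
# the two rightmost pixels of the first row.
second_map = [[[5.0, 5.0, 5.0], [5.0, 5.0, 5.0]]]
mask = boxes_to_mask(2, 3, [(1, 0, 3, 1)])
third_map = apply_mask(second_map, mask)
```

Everything outside the rough DOT region is zeroed, so the fine positioning branch only sees features from the region of interest.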

In a specific embodiment, the detailed steps of the fine positioning of DOT information are as follows:

S301: a text detection branch is established on the third feature map, whose size is 1/4 of the size of the acquired tire image; this greatly reduces the amount of computation without an obvious loss of positioning performance. Because the rough positioning of the DOT information masks out the other characters on the tire, the prediction focuses on the DOT information region.

S302: the third feature map is composed of six channels, the first channel calculates the probability that each pixel is a positive sample, the middle four channels calculate the distance between each positive sample pixel point and the upper, right, lower and left boundaries of the text box, and the last channel predicts the direction of the related boundary box.
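To make the six-channel encoding concrete, the sketch below decodes the five geometry values predicted at one positive pixel into the four corners of a rotated box. It is a minimal sketch, assuming that the distances are expressed in the same units as the pixel coordinates, that the angle is in radians, and that the box rotates about the pixel itself; none of these conventions is stated in the text, and the function name is hypothetical.

```python
import math

def decode_rbox(x, y, d_top, d_right, d_bottom, d_left, theta):
    """Corners (top-left, top-right, bottom-right, bottom-left) of the box
    predicted at pixel (x, y), with y pointing down."""
    # Corners relative to the pixel, in the box's own (unrotated) frame.
    corners = [
        (-d_left, -d_top),
        (d_right, -d_top),
        (d_right, d_bottom),
        (-d_left, d_bottom),
    ]
    c, s = math.cos(theta), math.sin(theta)
    # Rotate each corner by theta around the pixel and shift to image space.
    return [(x + c * dx - s * dy, y + s * dx + c * dy) for dx, dy in corners]

box = decode_rbox(10, 10, d_top=2, d_right=3, d_bottom=2, d_left=3, theta=0.0)
```

With a zero angle the result is the axis-aligned box spanning x from 7 to 13 and y from 8 to 12; a non-zero angle yields the angled candidate text block that is later passed to the affine transformation.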

S4: Obtaining the correct orientation of the text box is important, because the recognition stage needs accurate text box coordinates to recognize the text correctly. As shown in fig. 2, the tire characters may be oriented anywhere within 360 degrees. The orientation of the tire corresponds to the direction of the characters on its tread. Therefore, tire bending direction detection is performed on the first feature map output by the last stage to obtain the character direction information of the tire tread.

The detection of the tire bending direction is as follows: the last-stage feature map C4 output by the ResNet-50 network is taken and classified through two fully connected layers and a classification layer, predicting a 4-dimensional array that represents the probabilities of the tire bending in each of four directions (up, down, left and right), from which the character direction information of the tire tread is obtained.
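The last step of this branch, turning four scores into a direction, can be sketched as follows. The sketch assumes a softmax over the four outputs (consistent with the probability S_j used in the loss of this stage) and an output order of up, down, left, right; the class order and the names are assumptions.

```python
import math

# Assumed class order of the 4-dimensional output.
DIRECTIONS = ("up", "down", "left", "right")

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def predict_direction(logits):
    """Return the most probable bending direction and all four probabilities."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return DIRECTIONS[best], probs

# Illustrative scores, as if produced by the two fully connected layers.
direction, probs = predict_direction([0.1, 2.0, -1.0, 0.3])
```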

In a specific embodiment, in step S5, the affine transformation is specifically as follows:

The text label is predicted from the region features output by the ROI Rotate module; the deep-learning-based text recognition network is composed of a VGG16 layer and a BLSTM layer. Considering the length of the label sequence in a text region, the input features of the LSTM are reduced only twice along the width axis by the convolutions shared with the original image.

A multi-task framework inevitably suffers from inconsistent convergence. The convergence of the identification method described in this embodiment is influenced by four stages (DOT information rough positioning, DOT information fine positioning, tire bending direction detection and DOT character recognition), so training must be carried out through a total loss function to reduce error accumulation. This embodiment analyzes, through theoretical derivation and experimental comparison, the contribution of the local loss of each stage to convergence, and this analysis guides the composition of the total loss function to ensure effective convergence. The total loss function is defined as follows:

$$L_{total} = \lambda_1 L_{dot} + \lambda_2 L_{detect} + \lambda_3 L_{cls} + \lambda_4 L_{rg}$$

in the formula, $L_{dot}$ is the loss of DOT information rough positioning; $L_{detect}$ is the loss of DOT information fine positioning; $L_{cls}$ is the loss of tire bending direction detection; $L_{rg}$ is the loss of DOT character recognition; $\lambda_1, \lambda_2, \lambda_3, \lambda_4$ are the corresponding trade-off factors, representing the contribution of each of the four losses to the total loss function.

Generally, when the tire DOT information identification method based on end-to-end deep learning is trained, the contribution of local loss at each stage to the total loss function should be balanced. The nature of the training data and the loss functions dictates that the size of each local loss function may vary widely. If this amplitude difference is not processed correctly, the convergence of the frame may be biased towards one local loss function during training, while the convergence of the other local loss functions may be attenuated or even ignored.

Theoretical derivation and experimental comparison show that the initial loss value of DOT information fine positioning is at least two orders of magnitude smaller than the losses of the other three stages. To keep the four stage losses in relative balance and ensure consistent convergence, the trade-off factors are initially configured on the following principle: the contribution of the local loss for DOT information fine positioning should initially be set two orders of magnitude greater than those of the other three local losses. In practice, the contributions of $L_{dot}$, $L_{cls}$ and $L_{rg}$ are fixed and only the contribution of $L_{detect}$ is adjusted, rather than adjusting the contributions of all four stages simultaneously. The contributions of the four stages can be set to $\lambda_1 = 0.01$, $\lambda_2 = 1$, $\lambda_3 = 0.01$, $\lambda_4 = 0.01$.
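The effect of these trade-off factors can be checked with a small calculation. In the sketch below the factor values are the ones given in this embodiment (reading the last listed factor as the fourth one), while the per-stage loss values are made-up numbers chosen only to mimic a fine positioning loss that starts two orders of magnitude below the others.

```python
# Trade-off factors from the embodiment; the dictionary keys are hypothetical names.
LAMBDAS = {"dot": 0.01, "detect": 1.0, "cls": 0.01, "rg": 0.01}

def total_loss(losses, lambdas=LAMBDAS):
    """Weighted sum of the four stage losses."""
    return sum(lambdas[name] * losses[name] for name in lambdas)

def weighted_terms(losses, lambdas=LAMBDAS):
    """Contribution of each stage to the total, useful for checking balance."""
    return {name: lambdas[name] * losses[name] for name in lambdas}

# Illustrative initial losses only (not measured values).
initial = {"dot": 2.0, "detect": 0.02, "cls": 1.5, "rg": 3.0}
terms = weighted_terms(initial)
total = total_loss(initial)
```

After weighting, all four terms fall in the same range (between 0.015 and 0.03 for these numbers), so no single stage dominates the gradient at the start of training.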

DOT information coarse positioning

In this embodiment, the loss function of the DOT information rough positioning network consists of two parts, a classification loss function and a position regression loss function, and the formula is as follows:

$$L_{dot} = \frac{1}{N_s} \sum_i L_s(p_i, p_i^*) + \frac{1}{N_g} \sum_i p_i^* \, L_g(t_i, t_i^*)$$

where $i$ is the anchor box index (anchor index); $p_i$ is the predicted probability that anchor $i$ is a positive sample (positive softmax probability); $p_i^*$ is the corresponding ground-truth probability; $t_i$ is the predicted candidate box; $t_i^*$ is the ground-truth box corresponding to a positive anchor; $N_s$ and $N_g$ are the numbers of samples of the corresponding tasks; $L_s$ is the classification loss function, which is a softmax loss; $L_g$ is the position regression loss function, which is a smooth L1 loss.

The softmax loss is used to train the network to classify anchors as positive or negative.

The smooth L1 loss is used to train the bounding box regression.
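A compact numerical sketch of such a two-part loss is given below. It assumes the usual Faster R-CNN conventions, which the text does not spell out: the classification term is the negative log-probability of the true class averaged over all sampled anchors, the regression term is a smooth L1 loss over four box offsets averaged over positive anchors only, and the function names are hypothetical.

```python
import math

def smooth_l1(x):
    """Quadratic near zero, linear for |x| >= 1."""
    ax = abs(x)
    return 0.5 * x * x if ax < 1.0 else ax - 0.5

def rough_positioning_loss(probs, labels, pred_boxes, gt_boxes, eps=1e-12):
    """probs[i]: predicted probability that anchor i is positive.
    labels[i]: 1 for a positive anchor, 0 for a negative one.
    pred_boxes[i], gt_boxes[i]: 4-tuples of regression offsets.
    """
    n_s = len(probs)               # assumed: number of sampled anchors
    n_g = max(1, sum(labels))      # assumed: number of positive anchors
    cls_sum = 0.0
    reg_sum = 0.0
    for p, y, t, g in zip(probs, labels, pred_boxes, gt_boxes):
        p_true = p if y == 1 else 1.0 - p
        cls_sum += -math.log(max(p_true, eps))
        if y == 1:  # only positive anchors contribute to box regression
            reg_sum += sum(smooth_l1(a - b) for a, b in zip(t, g))
    return cls_sum / n_s + reg_sum / n_g

loss = rough_positioning_loss(
    probs=[0.5, 0.5],
    labels=[1, 0],
    pred_boxes=[(0.5, 0.0, 0.0, 0.0), (0.0, 0.0, 0.0, 0.0)],
    gt_boxes=[(0.0, 0.0, 0.0, 0.0), (9.0, 9.0, 9.0, 9.0)],
)
```

The second anchor is negative, so its large box error is ignored; only its classification term counts.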

DOT information fine positioning

The DOT information occupies only a small part of the tire image. The Dice loss is designed for the case where the foreground proportion is too small and handles class imbalance well:

$$L_{dcls} = 1 - \frac{2\,|X \cap Y|}{|X| + |Y|}$$

where $|X \cap Y|$ is the size of the intersection of the sets $X$ and $Y$, and $|X|$ and $|Y|$ are their numbers of elements; for the segmentation task, $X$ and $Y$ are the ground-truth mask and the predicted mask, respectively; $L_{dcls}$ is the classification loss.
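For masks stored as nested lists of 0/1 values (or probabilities), the Dice loss can be computed as in the sketch below. The small `eps` term is an added assumption that avoids division by zero when both masks are empty; it is not part of the text.

```python
def dice_loss(pred_mask, true_mask, eps=1e-6):
    """1 - 2|X ∩ Y| / (|X| + |Y|) for two 2D masks of equal size."""
    inter = sum(
        p * t
        for p_row, t_row in zip(pred_mask, true_mask)
        for p, t in zip(p_row, t_row)
    )
    total = sum(map(sum, pred_mask)) + sum(map(sum, true_mask))
    return 1.0 - (2.0 * inter + eps) / (total + eps)

perfect = dice_loss([[1, 0], [0, 1]], [[1, 0], [0, 1]])   # full overlap
disjoint = dice_loss([[1, 0]], [[0, 1]])                  # no overlap
```

Because the loss depends on the overlap ratio rather than on a per-pixel average, the large number of background pixels does not swamp the small DOT region.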

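Coordinate regression uses the IoU loss plus the cosine angle difference loss:

IoU loss:

$$L_{iou} = -\log \frac{|\hat{R} \cap R^*|}{|\hat{R} \cup R^*|}$$

where $\hat{R}$ is the predicted geometry and $R^*$ is its corresponding label box.

Cosine angle difference loss:

$$L_{\theta} = 1 - \cos(\hat{\theta} - \theta^*)$$

where $\hat{\theta}$ is the predicted rotation angle and $\theta^*$ is the annotated value. The total coordinate regression loss is:

$$L_{dreg} = L_{iou} + L_{\theta}$$

The loss function for fine positioning of the DOT information is therefore:

$$L_{detect} = L_{dcls} + L_{dreg}$$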
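The regression part can be sketched for a single pixel as follows. The sketch assumes that the predicted and labeled boxes are both described by their four distances from the same pixel, so the intersection can be computed from per-side minima (a simplification commonly used with this kind of geometry map); all distances are assumed to be positive, and the function names are hypothetical.

```python
import math

def iou_loss(pred, gt):
    """-log(IoU) for boxes given as (top, right, bottom, left) distances."""
    p_top, p_right, p_bottom, p_left = pred
    g_top, g_right, g_bottom, g_left = gt
    area_pred = (p_top + p_bottom) * (p_left + p_right)
    area_gt = (g_top + g_bottom) * (g_left + g_right)
    # Both boxes contain the pixel, so the overlap extends to the smaller
    # distance on each side.
    inter_w = min(p_left, g_left) + min(p_right, g_right)
    inter_h = min(p_top, g_top) + min(p_bottom, g_bottom)
    inter = inter_w * inter_h
    union = area_pred + area_gt - inter
    return -math.log(inter / union)

def angle_loss(theta_pred, theta_gt):
    """1 - cos of the angle difference (radians)."""
    return 1.0 - math.cos(theta_pred - theta_gt)

def regression_loss(pred, gt, theta_pred, theta_gt):
    """L_dreg = L_iou + L_theta for one positive pixel."""
    return iou_loss(pred, gt) + angle_loss(theta_pred, theta_gt)

exact = regression_loss((2, 3, 2, 3), (2, 3, 2, 3), 0.3, 0.3)
too_small = iou_loss((1, 1, 1, 1), (2, 2, 2, 2))
```

A perfect prediction gives a loss of zero, while a box with a quarter of the labeled area gives `-log(0.25)`.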

Loss function $L_{cls}$ of the tire bending direction detection stage:

$$S_j = \frac{e^{a_j}}{\sum_{k=1}^{T} e^{a_k}}, \qquad L_{cls} = -\sum_{j=1}^{T} y_j \log S_j$$

where $a_j$ is the j-th value of the input vector $a$; $a_k$ is the k-th value of the input vector; $y_j$ is the true label; $T$ is the number of categories; $S_j$ is the j-th value of the vector $S$, indicating the probability that the sample belongs to the j-th class.
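The loss of this stage can be evaluated directly from the raw scores. The sketch below computes the logarithm of the softmax in a numerically stable way (the log-sum-exp form is an implementation choice, not something the text prescribes) and then takes the cross-entropy against a one-hot label.

```python
import math

def log_softmax(a):
    """log S_j for every j, computed without overflowing exp()."""
    m = max(a)
    log_sum = m + math.log(sum(math.exp(v - m) for v in a))
    return [v - log_sum for v in a]

def direction_loss(a, y):
    """Cross-entropy -sum_j y_j * log S_j for one sample.

    a: raw scores for the T = 4 direction classes.
    y: one-hot true label of the same length.
    """
    return -sum(y_j * ls_j for y_j, ls_j in zip(y, log_softmax(a)))

uninformed = direction_loss([0.0, 0.0, 0.0, 0.0], [1, 0, 0, 0])
confident = direction_loss([10.0, 0.0, 0.0, 0.0], [1, 0, 0, 0])
```

Equal scores give a loss of log 4, the value expected from guessing among four directions, and the loss approaches zero as the score of the true class grows.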

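DOT character recognition

In the DOT character recognition stage, the connectionist temporal classification (CTC) loss function is used, which is a promising loss function for deep-learning-based text recognition:

$$L_{rg} = -\sum_{(l,\, y) \in \psi} \log p(l \mid y)$$

where $\psi$ is the set of ground-truth sequences; $y$ is the estimated sequence and $l$ is the ground-truth label sequence.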
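CTC allows the network to emit one prediction per feature column without knowing where each character starts. The sketch below shows the matching inference step, greedy CTC decoding, rather than the training loss itself: the most likely symbol is taken per column, repeated symbols are merged, and blanks are dropped. The label string and the position of the blank symbol are assumptions made for the example.

```python
def ctc_greedy_decode(frame_probs, labels, blank=0):
    """Collapse per-frame predictions into a character string.

    frame_probs: one probability list per frame (feature column).
    labels: string whose character at index `blank` stands for the blank.
    """
    best = [max(range(len(p)), key=p.__getitem__) for p in frame_probs]
    chars = []
    prev = blank
    for idx in best:
        # Emit a character only when it is not blank and differs from the
        # previous frame; a blank between two equal symbols keeps both.
        if idx != blank and idx != prev:
            chars.append(labels[idx])
        prev = idx
    return "".join(chars)

def one_hot(index, size=4):
    return [1.0 if i == index else 0.0 for i in range(size)]

LABELS = "-DOT"  # index 0 is the blank symbol (assumed layout)
frames = [one_hot(i) for i in (1, 1, 0, 2, 2, 0, 3)]
text = ctc_greedy_decode(frames, LABELS)
```

This is why closely spaced or even connected embossed characters do not need to be segmented beforehand: the alignment between feature columns and characters is resolved by the collapse rule.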

The identification method provided by this embodiment has the following advantages and beneficial effects:

1. The tire surface carries many characters, but this embodiment is concerned only with the DOT information, and the other characters would interfere with the detection result. Exploiting the fact that tire DOT information starts with the characters "DOT", the DOT characters are found by the RPN network and the DOT information is thereby roughly positioned. A mask map of the rough DOT information position is then generated and multiplied by the second feature map output by the feature extraction network, and the product is used as the input of the DOT fine positioning. This eliminates interference from characters outside the DOT information region, extracts the region of interest, and further improves detection accuracy.

2. Existing methods for detecting tire DOT information remain at the stage of first detecting the tire text with a positioning network and then recognizing it with a text recognition network, with the two networks trained independently. Such a stepwise approach involves many cumbersome steps and accumulates errors, resulting in poor tire DOT information identification performance. In the tire DOT information identification method provided by this embodiment, six parts (the feature extraction network, DOT information rough positioning, DOT information fine positioning, tire bending direction detection, ROI Rotate and DOT character recognition) are seamlessly combined into one neural network framework that completes the different tasks. The stages of the framework are not independent of one another: they are trained jointly through a total loss function, which avoids error accumulation.

3. The deep learning based multitasking framework inevitably has inconsistent convergence. The present embodiment analyzes the contribution of the local loss of each part of the framework to convergence through theoretical derivation and experimental comparison, which can supervise the composition of the proposed total loss function of the multi-task framework to ensure effective convergence.

4. Compared with the traditional text positioning algorithm, the method and the device have the advantages that the DOT information text is detected by the feature extraction network, rich features can be extracted, and the text detection accuracy is improved. In the traditional text recognition method, the characters need to be segmented firstly and then recognized, and when the character spacing is small, the segmentation effect is not good, and recognition is influenced. And a text recognition network based on deep learning is introduced, so that even if the intervals among the DOT information embossed characters of the tire are very small and even the embossed characters are connected, the DOT information embossed characters can be accurately recognized.

Example 2

Based on the tire DOT information identification method based on the end-to-end deep learning described in the embodiment 1, the embodiment also provides a tire DOT information identification device, and the device comprises a feature extraction network module, a DOT information rough positioning module, a DOT information fine positioning module, a tire bending direction detection module, an ROI Rotate module and a DOT character identification module;

the feature extraction network module is used for extracting features of the tire image acquired with the tire DOT information to obtain first feature maps output in N stages, and meanwhile, performing feature fusion on the first feature maps output in the N stages to obtain a second feature map;

the DOT information rough positioning module is used for carrying out DOT information rough positioning on the second characteristic diagram and detecting whether three characters of DOT and position information thereof exist or not so as to obtain a region diagram;

the DOT information fine positioning module is used for generating a mask from the region map, multiplying the mask by the second feature map to obtain a third feature map, and performing DOT information fine positioning on the third feature map to obtain the DOT information text probability and position information, so as to locate candidate text blocks with angles;

the tire bending direction detection module is used for detecting the tire bending direction on the first feature map output at the last stage, so as to acquire the character direction information of the tire tread;

the ROI Rotate module is used for applying an affine transformation to the candidate text block according to the character direction information of the tire tread, converting it into a horizontal text block with an upward direction;

and the DOT character recognition module is used for inputting the horizontal text block into a text recognition network based on deep learning to perform DOT character recognition.
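The masking step of the fine positioning module and the rectification step of the ROI Rotate module can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the patent: the feature map is a single-channel nested list, the coarse region is an axis-aligned box, and the function names, the direction-to-angle table and the sign convention of the rotation are hypothetical.

```python
import math


def apply_region_mask(feature_map, box):
    """Zero out feature values outside the coarse DOT region.

    feature_map: 2-D list (rows x cols), a stand-in for one channel of the
                 second feature map.
    box: (row0, col0, row1, col1), half-open; a hypothetical coarse region.
    """
    r0, c0, r1, c1 = box
    return [
        [v if (r0 <= r < r1 and c0 <= c < c1) else 0.0
         for c, v in enumerate(row)]
        for r, row in enumerate(feature_map)
    ]


# Hypothetical mapping from the predicted direction to an extra rotation
# in degrees; the patent text in this section does not give these values.
DIRECTION_OFFSET_DEG = {"up": 0.0, "right": 90.0, "down": 180.0, "left": 270.0}


def roi_rotate_matrix(center, angle_deg, direction):
    """Return a 2x3 affine matrix that rotates about `center` so as to undo
    the block angle plus the direction offset (assumed convention)."""
    cx, cy = center
    phi = math.radians(-(angle_deg + DIRECTION_OFFSET_DEG[direction]))
    cos_p, sin_p = math.cos(phi), math.sin(phi)
    return [
        [cos_p, -sin_p, cx - cx * cos_p + cy * sin_p],
        [sin_p, cos_p, cy - cx * sin_p - cy * cos_p],
    ]


def apply_affine(matrix, point):
    """Apply a 2x3 affine matrix to an (x, y) point."""
    x, y = point
    (a, b, tx), (c, d, ty) = matrix
    return (a * x + b * y + tx, c * x + d * y + ty)
```

In this sketch the mask is a hard 0/1 window, so the element-wise product simply suppresses features outside the coarse region; a real implementation would apply the same product to every channel of the feature map and would resample the rotated block rather than transform single points.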

Example 3

A computer system comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, the processor implementing the following method steps when executing the computer program:

Example 4

A computer-readable storage medium, on which a computer program is stored, which, when executed by a processor, implements the method steps of:

The embodiments of the present invention can be arbitrarily combined to achieve different technical effects.

In the above embodiments, the implementation may be wholly or partially realized by software, hardware, firmware, or any combination thereof. When implemented in software, the embodiments may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the procedures or functions described in accordance with the present application are generated, in whole or in part. The computer may be a general purpose computer, a special purpose computer, a network of computers, or other programmable device. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wire (e.g., coaxial cable, optical fiber, digital subscriber line) or wirelessly (e.g., infrared, wireless, microwave, etc.). The computer-readable storage medium can be any available medium that can be accessed by a computer, or a data storage device, such as a server or a data center, that integrates one or more available media. The available medium may be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium (e.g., solid state disk), among others.

One of ordinary skill in the art will appreciate that all or part of the processes in the methods of the above embodiments may be implemented by a computer program instructing the relevant hardware. The program may be stored in a computer-readable storage medium and, when executed, may include the processes of the above method embodiments. The aforementioned storage medium includes various media capable of storing program code, such as ROM, RAM, magnetic disks or optical disks.

Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims

1. A tire DOT information identification method based on end-to-end deep learning is characterized in that: the method comprises the following steps:

2. The method of identifying tire DOT information based on end-to-end deep learning of claim 1, wherein: in step S1, features are extracted by using a feature extraction network, wherein the feature extraction network comprises a ResNet-50 network and a feature pyramid network (FPN);

3. The method of identifying tire DOT information based on end-to-end deep learning of claim 2, wherein: the rough location of the DOT information is as follows:

4. A tire DOT information identification method based on end-to-end deep learning according to claim 3, characterized in that: the detailed steps of the DOT information fine positioning are as follows:

5. The method of identifying tire DOT information based on end-to-end deep learning of claim 4, wherein: the bending direction of the tire is detected as follows: the last-layer feature map C4 output by the ResNet-50 network is taken out, passed through two fully connected layers and classified, predicting a 4-dimensional array that represents the probabilities of the tire bending in each of four directions (up, down, left and right), thereby obtaining the character direction information of the tire tread.
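How a 4-dimensional output can be turned into one of the four bending directions may be sketched as follows. This is a hedged illustration only: the use of softmax to obtain probabilities, the index order of the directions (taken from the order in which the claim lists them) and the function names are assumptions, and the two fully connected layers themselves are not modeled.

```python
import math

# Assumed index order, following the order in which the claim lists the directions.
DIRECTIONS = ("up", "down", "left", "right")


def softmax(logits):
    """Numerically stable softmax over a list of raw scores."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_direction(logits):
    """Map four raw scores (hypothetical classifier output) to the most
    probable bending direction and the full probability vector."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return DIRECTIONS[best], probs
```

For example, `predict_direction([0.1, 2.0, -1.0, 0.3])` selects the second entry, which under the assumed ordering corresponds to "down".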

6. The method of identifying DOT information in a tire based on end-to-end deep learning of claim 5, wherein: in step S5, the affine transformation is specifically as follows:

7. The method of identifying tire DOT information based on end-to-end deep learning of claim 6, wherein: in order to reduce the influence of local loss of each stage on convergence, a total loss function is adopted for training to ensure effective convergence; wherein the total loss function is defined as follows:

L_total = λ₁L_dot + λ₂L_detect + λ₃L_cls + λ₄L_rg
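The weighted sum above can be written as a short sketch. The function name and the default weights of 1.0 are placeholders chosen for illustration; the weight values used by the inventors are not given in this section.

```python
def total_loss(l_dot, l_detect, l_cls, l_rg, weights=(1.0, 1.0, 1.0, 1.0)):
    """Weighted sum of the four stage losses.

    weights: (lambda_1, lambda_2, lambda_3, lambda_4); the defaults are
             placeholders, not values from the patent.
    """
    lam1, lam2, lam3, lam4 = weights
    return lam1 * l_dot + lam2 * l_detect + lam3 * l_cls + lam4 * l_rg
```

Raising one weight relative to the others makes the corresponding stage dominate the gradient, which is the lever such a formulation offers for balancing the convergence of the individual tasks.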

8. The method of identifying tire DOT information based on end-to-end deep learning of claim 7, wherein: the loss function L_dot of the DOT information rough positioning stage consists of a classification loss function and a position regression loss function, and the formula is as follows:

indicating the ground-truth label box corresponding to the positive anchor; N_s and N_g respectively represent the number of samples of the corresponding tasks; L_s represents the classification loss function; L_g represents the position regression loss function.

9. The method of identifying tire DOT information based on end-to-end deep learning of claim 7, wherein: the loss function L_detect of the DOT information fine positioning stage is:

L_detect = L_dcls + L_dreg

L_dreg = L_iou + L_θ

IoU loss:

wherein the predicted geometry is compared with R^*, its corresponding label box;
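The regression term L_dreg = L_iou + L_θ can be illustrated with a minimal sketch. The formulas themselves are not present in this text, so the forms below are assumptions: a negative-log IoU over axis-aligned boxes and an angle term of 1 − cos(θ_pred − θ_label), as commonly used in rotated text detectors. The function names and the `eps` smoothing constant are hypothetical.

```python
import math


def iou_loss(pred, label, eps=1e-9):
    """Assumed form: -log(IoU) for axis-aligned boxes (x0, y0, x1, y1)."""
    inter_w = max(0.0, min(pred[2], label[2]) - max(pred[0], label[0]))
    inter_h = max(0.0, min(pred[3], label[3]) - max(pred[1], label[1]))
    inter = inter_w * inter_h
    area_pred = (pred[2] - pred[0]) * (pred[3] - pred[1])
    area_label = (label[2] - label[0]) * (label[3] - label[1])
    union = area_pred + area_label - inter
    # eps keeps the logarithm finite when the boxes do not overlap.
    return -math.log((inter + eps) / (union + eps))


def angle_loss(theta_pred, theta_label):
    """Assumed form: 1 - cos of the angle difference (radians)."""
    return 1.0 - math.cos(theta_pred - theta_label)


def regression_loss(pred, label, theta_pred, theta_label):
    """L_dreg = L_iou + L_theta, as stated in the claim."""
    return iou_loss(pred, label) + angle_loss(theta_pred, theta_label)
```

Under these assumed forms, both terms vanish for a perfect prediction, and the angle term grows smoothly up to 2 as the predicted angle moves to the opposite direction.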

cosine angle difference loss:

wherein,

10. The method of identifying tire DOT information based on end-to-end deep learning of claim 7, wherein: the loss function L_cls of the tire bending direction detection stage is as follows: