CN115457573B - Character determining method and device, storage medium and electronic device - Google Patents


Info

Publication number
CN115457573B
CN115457573B (granted publication of application CN202211400525.5A; earlier published as CN115457573A)
Authority
CN
China
Prior art keywords
target
training data
feature
network
determining
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202211400525.5A
Other languages
Chinese (zh)
Other versions
CN115457573A (en)
Inventor
赵之健
汪传坤
林亦宁
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Shanma Zhijian Technology Co ltd
Hangzhou Shanma Zhiqing Technology Co Ltd
Shanghai Supremind Intelligent Technology Co Ltd
Original Assignee
Beijing Shanma Zhijian Technology Co ltd
Hangzhou Shanma Zhiqing Technology Co Ltd
Shanghai Supremind Intelligent Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Shanma Zhijian Technology Co ltd, Hangzhou Shanma Zhiqing Technology Co Ltd, Shanghai Supremind Intelligent Technology Co Ltd filed Critical Beijing Shanma Zhijian Technology Co ltd
Priority to CN202211400525.5A priority Critical patent/CN115457573B/en
Publication of CN115457573A publication Critical patent/CN115457573A/en
Application granted granted Critical
Publication of CN115457573B publication Critical patent/CN115457573B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10Character recognition
    • G06V30/19Recognition using electronic means
    • G06V30/191Design or setup of recognition systems or techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
    • G06V30/19147Obtaining sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/80Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • G06V10/806Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of extracted features
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/60Type of objects
    • G06V20/62Text, e.g. of license plates, overlay texts or captions on TV images
    • G06V20/625License plates
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10Character recognition
    • G06V30/19Recognition using electronic means
    • G06V30/191Design or setup of recognition systems or techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
    • G06V30/19173Classification techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10Character recognition
    • G06V30/19Recognition using electronic means
    • G06V30/191Design or setup of recognition systems or techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
    • G06V30/1918Fusion techniques, i.e. combining data from various sources, e.g. sensor fusion
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems

Abstract

The embodiment of the invention provides a character determining method and device, a storage medium, and an electronic device. The method comprises the following steps: extracting first training data from a training data set according to a first sampling rule, and extracting second training data from the training data set according to a second sampling rule; inputting the first training data into a first branch network included in the target network model to determine a first feature of the first training data; inputting the second training data into a second branch network included in the target network model to determine a second feature of the second training data; fusing the first feature and the second feature based on network parameters of a classification model included in the target network model to obtain a fused feature; and determining a target loss value of the target network model based on the fused feature, and iteratively updating target network parameters of the target network model based on the target loss value. The invention achieves the effect of improving the accuracy of character determination.

Description

Character determining method and device, storage medium and electronic device
Technical Field
The embodiment of the invention relates to the field of communication, in particular to a character determining method and device, a storage medium and an electronic device.
Background
With the development of convolutional neural networks, great progress has been made in image classification and recognition. This progress is inseparable from the availability of high-quality, large-scale data sets. Common large-scale datasets such as ImageNet, COCO, and the Places Database are classical classification datasets whose class labels are distributed roughly uniformly; real-world datasets, however, often exhibit a long-tail distribution.
In license plate recognition data, the long tail distribution is particularly obvious, because the differing difficulty of data acquisition in each province often causes a severely non-uniform data distribution: pictures from provinces such as A, B, and C are relatively easy to acquire, so each of these provinces contributes a relatively large number of pictures, while license plate pictures from regions such as D, E, and F are relatively hard to acquire, so each of these provinces contributes relatively few pictures. This extreme imbalance in the number of license plate samples across provinces leads to the following result: recognition accuracy is higher for provinces with more license plate data and lower for provinces with less, which degrades the overall recognition accuracy.
The long tail effect means that a small fraction of the categories in a dataset (head categories) account for the vast majority of the samples, while the majority of the categories (tail categories) each have only a very small number of samples. Existing ways of mitigating the long tail effect often suffer from problems such as data loss, difficult follow-up processing, and high computational requirements on the model.
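As a concrete (hypothetical) illustration of the long tail effect described above, the following Python sketch splits an invented set of class counts into head and tail categories and computes an imbalance ratio. The class names, counts, and the 10% threshold are ours, chosen for illustration only; they are not taken from the patent.

```python
# Hypothetical counts for a long-tailed label distribution; the class names
# and numbers are invented for illustration, not taken from the patent.
counts = {"A": 5000, "B": 4200, "C": 3800, "D": 60, "E": 45, "F": 30}

total = sum(counts.values())  # 13135 samples in total
head = sorted(k for k, v in counts.items() if v / total > 0.10)
tail = sorted(k for k, v in counts.items() if v / total <= 0.10)

# Imbalance ratio: largest class size over smallest class size
imbalance = max(counts.values()) / min(counts.values())

print(head, tail, round(imbalance, 1))  # ['A', 'B', 'C'] ['D', 'E', 'F'] 166.7
```

Here three head classes hold nearly 99% of the samples, while the three tail classes together hold barely 1%, which is the regime the method below targets.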
It can thus be seen that the related art suffers from inaccurate character determination by models, caused by the long tail effect in the model training data.
In view of the above problems in the related art, no effective solution has been proposed at present.
Disclosure of Invention
The embodiment of the invention provides a character determining method and device, a storage medium, and an electronic device, which at least solve the problem in the related art that characters are determined inaccurately by a model due to the long tail effect in the model training data.
According to an embodiment of the present invention, there is provided a character determining method comprising: inputting target data into a target network model and determining a target feature of the target data; and determining a target character corresponding to the target data based on the target feature. The target network model is trained as follows: first training data is extracted from a training data set according to a first sampling rule, and second training data is extracted from the training data set according to a second sampling rule, wherein the training data set comprises first-type training data and second-type training data, the amount of first-type training data is greater than the amount of second-type training data, the first sampling rule is that every training sample is extracted with the same probability, the second sampling rule is that second-type training data is extracted with a greater probability than first-type training data, and the first training data, the second training data, and the target data are all data acquired by acquisition equipment; the first training data is input into a first branch network included in the target network model to determine a first feature of the first training data; the second training data is input into a second branch network included in the target network model to determine a second feature of the second training data; the first feature and the second feature are fused based on network parameters of a classification model included in the target network model to obtain a fused feature; and a target loss value of the target network model is determined based on the fused feature, and target network parameters of the target network model are iteratively updated based on the target loss value.
According to another embodiment of the present invention, there is provided a character determining apparatus comprising: a first determining module configured to input target data into a target network model and determine a target feature of the target data; and a second determining module configured to determine a target character corresponding to the target data based on the target feature. The target network model is trained as follows: first training data is extracted from a training data set according to a first sampling rule, and second training data is extracted from the training data set according to a second sampling rule, wherein the training data set comprises first-type training data and second-type training data, the amount of first-type training data is greater than the amount of second-type training data, the first sampling rule is that every training sample is extracted with the same probability, the second sampling rule is that second-type training data is extracted with a greater probability than first-type training data, and the first training data, the second training data, and the target data are all data acquired by acquisition equipment; the first training data is input into a first branch network included in the target network model to determine a first feature of the first training data; the second training data is input into a second branch network included in the target network model to determine a second feature of the second training data; the first feature and the second feature are fused based on network parameters of a classification model included in the target network model to obtain a fused feature; and a target loss value of the target network model is determined based on the fused feature, and target network parameters of the target network model are iteratively updated based on the target loss value.
According to yet another embodiment of the present invention, there is also provided a computer-readable storage medium having stored therein a computer program, wherein the computer program when executed by a processor implements the steps of the method as described in any of the above.
According to a further embodiment of the invention, there is also provided an electronic device comprising a memory having stored therein a computer program and a processor arranged to run the computer program to perform the steps of any of the method embodiments described above.
According to the invention, target data is input into the target network model, a target feature of the target data is determined, and the target character corresponding to the target data is determined from the target feature, where the target network model is trained as follows: first training data is extracted from the training data set according to a first sampling rule, and second training data according to a second sampling rule. The first training data is input into the first branch network included in the target network model to determine a first feature, and the second training data is input into the second branch network to determine a second feature. The first and second features are fused according to the network parameters of the classification model included in the target network model to obtain a fused feature, a target loss value of the target network model is determined from the fused feature, and the target network parameters are iteratively updated with the target loss value. During training, the training data extracted under the different sampling rules are fed into different branch networks, the feature output by each branch is determined, the first and second features are fused according to the network parameters of the classification model, and the target loss value is determined from the fused feature, so that both the first and the second type of training data are taken into account, improving the accuracy of the trained model.
This solves the problem in the related art that characters are determined inaccurately by a model because of the long tail effect in the model training data, and achieves the effect of improving the accuracy of character determination.
Drawings
Fig. 1 is a block diagram of a hardware configuration of a mobile terminal of a character determining method according to an embodiment of the present invention;
FIG. 2 is a flow chart of a method of determining characters according to an embodiment of the present invention;
FIG. 3 is a long tail effect schematic according to an exemplary embodiment of the present invention;
FIG. 4 is a flow chart of a model training method in a method of determining characters according to an embodiment of the present invention;
fig. 5 is a block diagram of a configuration of a character determining apparatus according to an embodiment of the present invention.
Detailed Description
Embodiments of the present invention will be described in detail below with reference to the accompanying drawings in conjunction with the embodiments.
It should be noted that the terms "first," "second," and the like in the description and the claims of the present invention and the above figures are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order.
The method embodiments provided in the embodiments of the present application may be performed on a mobile terminal, a computer terminal, or a similar computing device. Taking a mobile terminal as an example, fig. 1 is a block diagram of the hardware structure of a mobile terminal for a character determining method according to an embodiment of the present invention. As shown in fig. 1, the mobile terminal may include one or more processors 102 (only one is shown in fig. 1; the processor 102 may include, but is not limited to, a microprocessor (MCU), a programmable logic device (FPGA), or another processing device) and a memory 104 for storing data, and may also include a transmission device 106 for communication functions and an input-output device 108. Those skilled in the art will appreciate that the structure shown in fig. 1 is merely illustrative and does not limit the structure of the mobile terminal; for example, the mobile terminal may include more or fewer components than shown in fig. 1, or have a different configuration.
The memory 104 may be used to store a computer program, for example, a software program of application software and a module, such as a computer program corresponding to a method for determining characters in an embodiment of the present invention, and the processor 102 executes the computer program stored in the memory 104 to perform various functional applications and data processing, that is, to implement the above-described method. Memory 104 may include high-speed random access memory, and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory 104 may further include memory remotely located relative to the processor 102, which may be connected to the mobile terminal via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The transmission device 106 is used to receive or transmit data via a network. Specific examples of the network described above may include a wireless network provided by a communication provider of the mobile terminal. In one example, the transmission device 106 includes a network adapter (Network Interface Controller, simply referred to as NIC) that can connect to other network devices through a base station to communicate with the internet. In one example, the transmission device 106 may be a Radio Frequency (RF) module, which is configured to communicate with the internet wirelessly.
In this embodiment, a method for determining a character is provided, fig. 2 is a flowchart of a method for determining a character according to an embodiment of the present invention, and as shown in fig. 2, the flowchart includes the following steps:
step S202, inputting target data into a target network model, and determining target characteristics of the target data;
step S204, determining target characters corresponding to the target data based on the target characteristics;
the target network model is trained by the following modes: extracting first training data from a training data set according to a first sampling rule, and extracting second training data from the training data set according to a second sampling rule, wherein the training data set comprises first type training data and second type training data, the number of the first type training data is larger than that of the second type training data, the first sampling rule comprises the same probability that each training data is extracted, the second sampling rule comprises the probability that the second type training data is extracted is larger than that of the first type training data, and the first training data, the second training data and the target data are data acquired by acquisition equipment; inputting the first training data into a first branch network included in a target network model, and determining a first characteristic of the first training data; inputting the second training data into a second branch network included in the target network model, and determining a second characteristic of the second training data; fusing the first feature and the second feature based on network parameters of a classification model included in the target network model to obtain a fused feature; and determining a target loss value of the target network model based on the fusion characteristic, and iteratively updating target network parameters of the target network model based on the target loss value.
In the above embodiment, the target network model may be used to recognise the target character corresponding to the target data. The target data may be an image, a video frame, or the like acquired by acquisition equipment such as a camera or a monitoring device. The target character may be a character contained in the target data; when the target data is a license plate image, the target character may be the license plate number.
In the above embodiment, the training data set includes the first type of training data and the second type of training data. Wherein the number of training data of the first type is greater than the number of training data of the second type. The first type of training data is head data and the second type of training data is tail data. Thus, long tail effects exist in the training dataset. The long tail effect is schematically shown in figure 3.
In the above embodiment, the first type of training data may be data acquired in a region where the number of the acquisition devices is large, and the second type of training data may be data acquired in a region where the number of the acquisition devices is small. When the target data is an image, each type of training data includes the image, and the tag character of the image. During training, loss values can be determined according to the predicted characters and the tag characters determined by the network model, and parameters of the network model are updated iteratively according to the loss values.
In the above embodiment, the first training data may be extracted according to the first sampling rule, and the second training data according to the second sampling rule. The first sampling rule may be normal (uniform) sampling: picture data is sampled uniformly according to the original distribution of the training samples, i.e., each sample in the training set is sampled once with the same probability. A batch of uniformly sampled data retains the sample distribution characteristics of the original dataset and therefore benefits overall feature representation learning. The second sampling rule may be reverse sampling, in which the sampling probability of each class is proportional to the reciprocal of its sample size: the larger a class's sample size, the less likely its samples are to be drawn, and the smaller the class, the more likely they are to be drawn. This sampling mode makes the network pay more attention to the features of tail-class data, improving performance on the tail classes.
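The two sampling rules above can be sketched in Python as follows. This is a minimal illustration under our own naming (the patent gives no code): reverse sampling is realised as per-sample draw weights proportional to the reciprocal of each sample's class size, so that every class receives equal total sampling mass regardless of how many samples it has.

```python
from collections import Counter

def reverse_sampling_weights(labels):
    """Per-sample draw weights proportional to the reciprocal of the size of
    each sample's class, so tail-class samples are drawn more often."""
    counts = Counter(labels)
    return [1.0 / counts[y] for y in labels]

# Toy label list: the head class has 8 samples, the tail class only 2.
labels = ["head"] * 8 + ["tail"] * 2
weights = reverse_sampling_weights(labels)

# Under uniform sampling, a draw lands on the head class with probability 0.8;
# under these reverse-sampling weights, each class receives equal total mass
# (each class's weights sum to 1.0 regardless of its size).
head_mass = sum(w for w, y in zip(weights, labels) if y == "head")  # 8 * 1/8
tail_mass = sum(w for w, y in zip(weights, labels) if y == "tail")  # 2 * 1/2
print(head_mass, tail_mass)  # 1.0 1.0
```

In a real training loop these weights would be passed to a weighted sampler for the re-balancing branch, while the convolution learning branch draws from the same dataset uniformly.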
In the above embodiment, the target network model may include two branch networks, i.e., a first branch network and a second branch network. The two branches have the same structure and may use the same residual network to extract features, which facilitates fusing the features during cumulative learning. For the unevenly distributed head-class and tail-class data, a convolution learning branch and a re-balancing branch are used respectively. The first branch network may be the convolution learning branch, which collects training pictures by normal sampling; the normally sampled pictures are fed into the convolutional neural network of this branch for training, yielding a feature vector f_c, i.e., the first feature. The second branch network may be the re-balancing branch, which collects training pictures by reverse sampling; the reverse-sampled pictures are fed into the convolutional neural network of this branch for training, yielding a feature vector f_r, i.e., the second feature. The first and second features may be feature vectors extracted by a convolutional neural network, for example three-dimensional tensors of shape (Height, Width, Channel).
After the first feature and the second feature are obtained, they can be fused according to the network parameters of the classification model included in the target network model to obtain a fused feature. A target loss value is then determined from the fused feature, and the target network parameters of the target network model are iteratively updated with the target loss value.
Alternatively, the steps above may be executed by a background processor or another device with similar processing capability, or by a machine integrating at least a data processing device, which may include, but is not limited to, terminals such as computers and mobile phones.
According to the invention, target data is input into the target network model, a target feature of the target data is determined, and the target character corresponding to the target data is determined from the target feature, where the target network model is trained as follows: first training data is extracted from the training data set according to a first sampling rule, and second training data according to a second sampling rule. The first training data is input into the first branch network included in the target network model to determine a first feature, and the second training data is input into the second branch network to determine a second feature. The first and second features are fused according to the network parameters of the classification model included in the target network model to obtain a fused feature, a target loss value of the target network model is determined from the fused feature, and the target network parameters are iteratively updated with the target loss value. During training, the training data extracted under the different sampling rules are fed into different branch networks, the feature output by each branch is determined, the first and second features are fused according to the network parameters of the classification model, and the target loss value is determined from the fused feature, so that both the first and the second type of training data are taken into account, improving the accuracy of the trained model.
This solves the problem in the related art that characters are determined inaccurately by a model because of the long tail effect in the model training data, and achieves the effect of improving the accuracy of character determination.
In one exemplary embodiment, fusing the first feature and the second feature based on the classification network parameters of the classification model included in the target network model to obtain the fused feature comprises: determining a first weight corresponding to the first branch network and a second weight corresponding to the second branch network; multiplying the first feature by the first weight to obtain a third feature, and multiplying the second feature by the second weight to obtain a fourth feature; and fusing the third feature and the fourth feature based on the classification network parameters to obtain the fused feature. In this embodiment, when the first and second features are fused, a learnable adaptive trade-off parameter is introduced to control the feature vectors produced by the branches: the first weight corresponding to the first branch network and the second weight corresponding to the second branch network are determined, the first feature is scaled by the first weight to obtain the third feature, the second feature is scaled by the second weight to obtain the fourth feature, and the third and fourth features are then fused into the fused feature. That is, a learnable adaptive trade-off parameter α may be initialised; multiplying α by the first feature f_c gives the weighted convolution-learning-branch feature vector α·f_c, multiplying (1−α) by f_r gives the weighted re-balancing-branch feature vector (1−α)·f_r, and α·f_c and (1−α)·f_r are then fused.
It should be noted that α is the first weight and 1−α is the second weight, used to control the weighting of the feature vectors f_c and f_r obtained by the convolution learning branch and the re-balancing branch. The parameter satisfies 0 < α < 1.
In an exemplary embodiment, fusing the third feature and the fourth feature based on the classification network parameters to obtain the fused feature comprises: determining a first sub-classification network parameter and a second sub-classification network parameter included in the classification network parameters, wherein the first sub-classification network parameter is a parameter of a first sub-classification network connected to the first branch network, and the second sub-classification network parameter is a parameter of a second sub-classification network connected to the second branch network; multiplying the third feature by the first sub-classification network parameter to obtain a fifth feature, and multiplying the fourth feature by the second sub-classification network parameter to obtain a sixth feature; and combining the fifth feature and the sixth feature to obtain the fused feature. In this embodiment, the first sub-classification network parameter may be denoted W_c and the second sub-classification network parameter W_r; the first feature may be denoted f_c and the second feature f_r. With the first weight α and the second weight 1−α, the third feature may be expressed as α·f_c and the fourth feature as (1−α)·f_r. The weighted feature vectors α·f_c and (1−α)·f_r are sent to the classifiers W_c and W_r of their respective branches for classification feature learning, so the fifth feature may be expressed as α·W_c·f_c and the sixth feature as (1−α)·W_r·f_r. The outputs of the two branches are finally fused element-wise, and the fused feature may be expressed as z = α·W_c·f_c + (1−α)·W_r·f_r. The dimensions of the first and second sub-classification network parameters are consistent with the dimensions of the first and second features.
In the above embodiment, the cumulative learning section multiplies the features obtained by the bilateral branches by the compromise parameters α and (1−α), sends them to the classifiers W_c and W_r for parameter re-fitting, performs the fusion operation, and finally applies the Softmax function for normalization. This series of operations re-learns the parameters, which makes the network parameters easier to fit and helps the network converge as quickly as possible.
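The cumulative-learning fusion z = α·W_c·f_c + (1−α)·W_r·f_r followed by Softmax can be sketched as below. The matrix/vector shapes, identity classifiers, and function names are illustrative assumptions, not taken from the patent.

```python
import numpy as np

def softmax(x):
    """Normalized exponential: turns logits into probabilities."""
    e = np.exp(x - x.max())
    return e / e.sum()

def cumulative_fusion(f_c, f_r, W_c, W_r, alpha):
    """Element-wise fusion z = alpha*W_c@f_c + (1-alpha)*W_r@f_r, then Softmax."""
    fifth = alpha * (W_c @ f_c)          # weighted output of classifier W_c
    sixth = (1.0 - alpha) * (W_r @ f_r)  # weighted output of classifier W_r
    z = fifth + sixth                    # element-wise fusion
    return softmax(z)

# Toy example with 2-dimensional features and identity classifiers.
W_c = np.eye(2)
W_r = np.eye(2)
p = cumulative_fusion(np.array([1.0, 0.0]), np.array([0.0, 1.0]), W_c, W_r, 0.5)
# z = [0.5, 0.5], so p = [0.5, 0.5]
```

With equal logits the Softmax output is uniform, which makes the weighting of the two branches easy to verify by hand.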
In one exemplary embodiment, determining the target loss value for the target network model based on the fusion feature comprises: determining a first loss value for the first branch network based on the fifth feature; determining a second loss value for the second branch network based on the sixth feature; and determining a target loss value for the target network model based on the first loss value and the second loss value. In this embodiment, when determining the target loss value, the first loss value of the first branch network may be determined from the fifth feature, for example from the fifth feature and the ground truth corresponding to it. Similarly, the second loss value may be determined from the sixth feature and its corresponding ground truth, and the target loss value may then be determined from the first and second loss values. For example, weights may be determined for the first and second branch networks, and the target loss value computed as the weighted sum of the two loss values.
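A minimal sketch of the loss combination follows. Reusing the compromise parameter α as the per-branch loss weight is an assumption for illustration; the patent only states that the target loss is determined from a weighted combination of the two branch losses.

```python
import math

def cross_entropy(probs, label):
    """Negative log-probability of the true label (probs already normalized)."""
    return -math.log(probs[label])

def target_loss(p_c, p_r, y_c, y_r, alpha):
    """Combine the per-branch losses with weights alpha and (1 - alpha).

    p_c/p_r: predicted distributions of the two branches; y_c/y_r: labels.
    """
    return alpha * cross_entropy(p_c, y_c) + (1.0 - alpha) * cross_entropy(p_r, y_r)

# Uniform 2-class predictions: each branch loss is ln 2, so the sum is ln 2.
loss = target_loss([0.5, 0.5], [0.5, 0.5], 0, 1, 0.5)
```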
In one exemplary embodiment, iteratively updating the target network parameters of the target network model based on the target loss value includes: determining the first loss value corresponding to the first branch network and the second loss value corresponding to the second branch network included in the target loss value; and updating a first weight and a second weight included in the target network parameters based on the first loss value and the second loss value, wherein the first weight corresponds to the first branch network and the second weight corresponds to the second branch network. In this embodiment, the network parameter variables, i.e., the first weight and the second weight, are adjusted automatically according to the network loss, rather than being decremented as the number of training iterations increases. Controlling the output weights of the bilateral branches through the weights of the classification network shifts the focus of network learning between the two branches during the training stage, so that the target network is trained accurately.
In one exemplary embodiment, iteratively updating the target network parameters of the target network model based on the target loss value includes: determining a network parameter update gradient based on the target loss value; and updating the network parameters of the first branch network, the second branch network and the classification model included in the target network model according to the network parameter update gradient. In this embodiment, after the target loss value is determined, the gradient for updating the parameters of the first branch network, the second branch network and the classification model may be derived from it. After the update, training resumes and a new target loss value is determined; when the target loss value is smaller than a preset loss value or the number of training iterations reaches a preset number, training stops and the final target network parameters are obtained.
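The iterative update with its stopping rule can be sketched as follows. `loss_fn` and `step_fn` are hypothetical stand-ins for the real forward pass and gradient update; the thresholds are illustrative.

```python
def train_until_converged(loss_fn, step_fn, max_iters=100, loss_threshold=1e-3):
    """Repeat update steps until the target loss drops below a preset value
    or the preset number of iterations is reached."""
    loss = loss_fn()
    for _ in range(max_iters):
        if loss < loss_threshold:
            break
        step_fn()          # one gradient update of both branches and the classifier
        loss = loss_fn()
    return loss

# Toy example: halving the parameter quarters the squared-error "loss",
# so the loop terminates well before max_iters.
state = {"x": 4.0}

def loss_fn():
    return state["x"] ** 2

def step_fn():
    state["x"] *= 0.5

final = train_until_converged(loss_fn, step_fn)
```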
In an exemplary embodiment, determining, based on the target feature, a target character corresponding to the target data includes: determining a similarity score between each feature included in the target feature and the features of the characters included in a feature dictionary table; and determining the target character corresponding to the target data based on the similarity score. In this embodiment, when the target data is a license plate image, the license plate generally comprises a combination of 7 or 8 letters and digits (8 for new-energy vehicles); to ensure full coverage, the length is generally set to 10, and the license plate recognition feature characterization finally yields 10 number sequences, each of length 71. The license plate recognition dictionary includes: "Wanhu Shanju Yu Ji jin Meng Liao Ji Hei su Zhe Beijing Min Gan Lu Yu Hunan Guangdong Gui Qiongchuan Gui Yun Shan Gan Qingning" new police hanging emergency ABCDEFGHJKLMNPQRSTUVWXYZ0123456789-", where "-" indicates a blank; its length is 71, consistent with the sequence length of the license plate recognition feature characterization. Each length-71 number sequence in the feature characterization corresponds to scores over the characters in the dictionary, and the highest-scoring character is taken as one character of the recognition result. Thus, for example, if the 10 length-71 number sequences map to "XXXXXXX---" in the dictionary, post-processing removes the blanks to obtain the final license plate: "XXXXXXX".
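The dictionary lookup and blank-stripping post-processing can be sketched as follows. A 4-entry stand-in dictionary is used for illustration; the real dictionary has 71 entries and the score matrix has 10 rows.

```python
import numpy as np

def decode_plate(scores, dictionary, blank="-"):
    """Greedy decoding: each sequence position takes the highest-scoring
    dictionary character; post-processing strips the blank symbol."""
    best = scores.argmax(axis=1)                  # best character per position
    raw = "".join(dictionary[i] for i in best)
    return raw.replace(blank, "")                 # drop blanks

DICT = ["A", "B", "1", "-"]                       # stand-in for the 71-entry dictionary
scores = np.array([[0.9, 0.0, 0.05, 0.05],
                   [0.0, 0.8, 0.10, 0.10],
                   [0.1, 0.1, 0.70, 0.10],
                   [0.0, 0.0, 0.10, 0.90]])
plate = decode_plate(scores, DICT)                # → "AB1"
```

Each row of `scores` plays the role of one length-71 number sequence; the argmax per row is the "highest score" selection described above.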
The following describes a method for determining characters in connection with the specific embodiments:
fig. 4 is a flowchart of a model training method in a character determining method according to an embodiment of the present invention. As shown in fig. 4, the target network model includes a bilateral branch network (a convolution-learning branch, corresponding to the first branch network, and a rebalancing branch, corresponding to the second branch network) for solving the problem of long-tail distribution in license plate recognition data. For the unevenly distributed head and tail license plate data, two branches are set up: the convolution-learning branch and the rebalancing branch. The two branches have the same network structure and perform feature extraction with the same residual network structure. Debugger: a learnable adaptive compromise parameter α (corresponding to the first weight) is introduced to control the feature vectors obtained by the branches. Cumulative learning strategy: the output weights of the bilateral branches are controlled by the debugger's parameter, so that the focus of network learning shifts between the two branches during the training stage.
Wherein (1) normal sampling means that the picture data is uniformly sampled according to the original distribution of the training sample data, i.e., each sample in the training set is sampled only once with the same probability. This uniformly sampled batch of data retains the sample data distribution characteristics in the original dataset, thus facilitating overall feature representation learning.
(2) Reverse sampling means that the sampling probability of each class is proportional to the inverse of its sample size: the larger the sample size of a class, the less likely it is to be sampled, and the smaller the sample size, the more likely it is to be sampled. Through this sampling mode, the network attends to the features of tail-class license plate recognition data, which improves performance on the tail classes.
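The inverse-frequency sampling probabilities can be sketched as below; the class names and four-sample dataset are illustrative only.

```python
from collections import Counter

def inverse_sampling_probs(labels):
    """Per-sample probability proportional to the inverse of its class
    frequency, so rare (tail) classes are drawn more often."""
    counts = Counter(labels)
    raw = [1.0 / counts[y] for y in labels]   # inverse class frequency per sample
    total = sum(raw)
    return [w / total for w in raw]           # normalize to a distribution

labels = ["head", "head", "head", "tail"]
probs = inverse_sampling_probs(labels)        # → [1/6, 1/6, 1/6, 1/2]
```

The single tail sample receives probability 1/2, three times that of any head sample, which is the rebalancing effect described above.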
(3) License plate recognition features: a license plate generally comprises a combination of 7 or 8 letters and digits (8 for new-energy vehicles); to ensure full coverage, the length is generally set to 10, and the license plate recognition feature characterization finally yields 10 number sequences, each of length 71. The license plate recognition dictionary includes: "Wanhu Shanju Yu Ji jin Meng Liao Ji Hei su Zhe Beijing Min Gan Lu Yu Hunan Guangdong Gui Qiongchuan Gui Yun Shan Gan Qingning" new police hanging emergency ABCDEFGHJKLMNPQRSTUVWXYZ0123456789-", where "-" indicates a blank; its length is 71, consistent with the sequence length of the feature characterization. Each length-71 number sequence corresponds to scores over the characters in the dictionary, and the highest-scoring character is taken as one character of the recognition result. Thus, for example, if the result maps to "XXXXXXX---" in the dictionary, post-processing removes the blanks to obtain the final license plate: "XXXXXXX".
Here f_c and f_r are the feature vectors extracted by the convolutional neural networks, typically three-dimensional (Height, Width, Channel) matrix vectors. α is the adaptive compromise parameter that controls the weights of the feature vectors f_c and f_r obtained by the convolution-learning branch and the rebalancing branch; it is greater than 0 and less than 1. W_c and W_r are the learnable network parameters in the classifiers, whose dimensions are consistent with f_c and f_r. The Softmax function, also called the normalized exponential function, presents the multi-class result in the form of probabilities and is used to convert the network features into license plate recognition classification features. p denotes the license plate recognition features.
The convolution-learning branch collects training pictures by normal sampling. The normally sampled pictures are sent into the convolutional neural network of the convolution-learning branch for training, yielding the feature vector f_c. The rebalancing branch collects training pictures by reverse sampling; the reversely sampled pictures are sent into the convolutional neural network of the rebalancing branch for training, yielding the feature vector f_r. The debugger initializes a learnable adaptive compromise parameter α. Multiplying α by f_c gives the weighted convolution-learning branch feature vector α·f_c, and multiplying (1−α) by f_r gives the weighted rebalancing branch feature vector (1−α)·f_r. Cumulative learning: the weighted feature vectors α·f_c and (1−α)·f_r are sent to the classifiers W_c and W_r of their respective branches for classification feature learning. The outputs of the two branches are finally fused by element-wise addition, so the fused feature can be expressed as z = α·W_c·f_c + (1−α)·W_r·f_r. The fused feature is processed by the Softmax function to obtain the final license plate recognition feature sequence p, and license plate recognition post-processing on p yields the final recognition result.
In the foregoing embodiment, the convolution-learning branch learns the sample data distribution characteristics of the original data set, so its result represents the overall features, while the rebalancing branch focuses on the features of tail-class license plate recognition data, so its result chiefly represents the tail classes. The final result combines the outputs of both branches, and the compromise parameter α controls the contribution of each branch: it weighs the effect of the two kinds of features on the final result. This effectively alleviates the low license plate recognition accuracy caused by the long-tail distribution of license plate recognition data.
The bilateral branch network learns the uniform data sample features and the tail-class features respectively and finally fuses them by cumulative learning, so that accuracy on head classes is preserved while the data distribution of tail classes is well accounted for, improving the final accuracy. Each branch first learns its own features independently, and the cumulative learning strategy then aggregates and re-learns the two kinds of features. This improves the generalization ability of the network on the one hand, and on the other hand avoids the difficulty of fitting network parameters that direct addition would cause.
Compared with a reassignment strategy, the method does not need to define a large number of head and tail data categories, has fewer network parameters, is simple and easy to adjust, and is easy to expand in a large sample data set.
From the description of the above embodiments, it will be clear to a person skilled in the art that the method according to the above embodiments may be implemented by means of software plus the necessary general hardware platform, but of course also by means of hardware, but in many cases the former is a preferred embodiment. Based on such understanding, the technical solution of the present invention may be embodied essentially or in a part contributing to the prior art in the form of a software product stored in a storage medium (e.g. ROM/RAM, magnetic disk, optical disk) comprising instructions for causing a terminal device (which may be a mobile phone, a computer, a server, or a network device, etc.) to perform the method according to the embodiments of the present invention.
The embodiment also provides a device for determining characters, which is used for implementing the above embodiment and the preferred implementation, and is not described in detail. As used below, the term "module" may be a combination of software and/or hardware that implements a predetermined function. While the means described in the following embodiments are preferably implemented in software, implementation in hardware, or a combination of software and hardware, is also possible and contemplated.
Fig. 5 is a block diagram of a character determining apparatus according to an embodiment of the present invention, as shown in fig. 5, including:
a first determining module 52, configured to input target data into a target network model, and determine target characteristics of the target data;
a second determining module 54, configured to determine a target character corresponding to the target data based on the target feature;
the target network model is trained by the following modes: the method comprises the steps of extracting first training data from a training data set according to a first sampling rule, and extracting second training data from the training data set according to a second sampling rule, wherein the training data set comprises first type training data and second type training data, the number of the first type training data is larger than that of the second type training data, the probability that each training data is extracted is the same, the probability that the second type training data is extracted is larger than that of the first type training data, and the first training data, the second training data and the target data are data acquired by acquisition equipment; inputting the first training data into a first branch network included in a target network model, and determining a first characteristic of the first training data; inputting the second training data into a second branch network included in the target network model, and determining a second characteristic of the second training data; fusing the first feature and the second feature based on network parameters of a classification model included in the target network model to obtain a fused feature; and determining a target loss value of the target network model based on the fusion characteristic, and iteratively updating target network parameters of the target network model based on the target loss value.
In an exemplary embodiment, the apparatus may implement fusing the first feature and the second feature based on the classification network parameters of the classification model included in the target network model to obtain a fused feature by: determining a first weight corresponding to the first branch network and a second weight corresponding to the second branch network; multiplying the first feature by the first weight to obtain a third feature, and multiplying the second feature by the second weight to obtain a fourth feature; and fusing the third feature and the fourth feature based on the classified network parameters to obtain the fused feature.
In an exemplary embodiment, the apparatus may implement fusing the third feature and the fourth feature based on the classification network parameter to obtain the fused feature by: determining a first sub-classification network parameter and a second sub-classification network parameter included in the classification network parameters, wherein the first sub-classification network parameter is a parameter of a first sub-classification network, the first sub-classification network is connected with the first branch network, the second sub-classification network parameter is a parameter of a second sub-classification network, and the second sub-classification network is connected with the second branch network; multiplying the third feature by the first sub-classification network parameter to obtain a fifth feature, and multiplying the fourth feature by the second sub-classification network parameter to obtain a sixth feature; and connecting the fifth feature and the sixth feature to obtain the fusion feature.
In an exemplary embodiment, the apparatus may determine the target loss value of the target network model based on the fusion feature by: determining a first loss value for the first branched network based on the fifth characteristic; determining a second loss value for the second branch network based on the sixth characteristic; a target loss value for the target network model is determined based on the first loss value and the second loss value.
In an exemplary embodiment, the apparatus may implement iteratively updating the target network parameters of the target network model based on the target loss values by: determining a first loss value corresponding to the first branch network and a second loss value corresponding to the second branch network, which are included in the target loss value; and updating a first weight and a second weight included in the target network parameter based on the first loss value and the second loss value, wherein the first weight is a weight corresponding to the first branch network, and the second weight is a weight corresponding to the second branch network.
In an exemplary embodiment, the apparatus may implement iteratively updating the target network parameters of the target network model based on the target loss values by: determining a network parameter update gradient based on the target loss value; and updating the network parameters of the first branch network, the second branch network and the classification model which are included in the target network model according to the network parameter updating gradient.
In an exemplary embodiment, the second determining module 54 may determine the target character corresponding to the target data based on the target feature by: determining a similarity score for each feature included in the target feature to a feature of a character included in a feature dictionary table; and determining the target character corresponding to the target data based on the similarity score.
It should be noted that each of the above modules may be implemented by software or hardware, and for the latter, it may be implemented by, but not limited to: the modules are all located in the same processor; alternatively, the above modules may be located in different processors in any combination.
Embodiments of the present invention also provide a computer readable storage medium having a computer program stored therein, wherein the computer program when executed by a processor implements the steps of the method described in any of the above.
In one exemplary embodiment, the computer readable storage medium may include, but is not limited to: a usb disk, a Read-Only Memory (ROM), a random access Memory (Random Access Memory, RAM), a removable hard disk, a magnetic disk, or an optical disk, or other various media capable of storing a computer program.
An embodiment of the invention also provides an electronic device comprising a memory having stored therein a computer program and a processor arranged to run the computer program to perform the steps of any of the method embodiments described above.
In an exemplary embodiment, the electronic apparatus may further include a transmission device connected to the processor, and an input/output device connected to the processor.
Specific examples in this embodiment may refer to the examples described in the foregoing embodiments and the exemplary implementation, and this embodiment is not described herein.
It will be appreciated by those skilled in the art that the modules or steps of the invention described above may be implemented in a general purpose computing device, they may be concentrated on a single computing device, or distributed across a network of computing devices, they may be implemented in program code executable by computing devices, so that they may be stored in a storage device for execution by computing devices, and in some cases, the steps shown or described may be performed in a different order than that shown or described herein, or they may be separately fabricated into individual integrated circuit modules, or multiple modules or steps of them may be fabricated into a single integrated circuit module. Thus, the present invention is not limited to any specific combination of hardware and software.
The above description is only of the preferred embodiments of the present invention and is not intended to limit the present invention, but various modifications and variations can be made to the present invention by those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the principle of the present invention should be included in the protection scope of the present invention.

Claims (9)

1. The character determining method is characterized by being applied to the field of license plate recognition and comprising the following steps of:
inputting target data into a target network model, and determining target characteristics of the target data;
determining a target character corresponding to the target data based on the target feature;
the target network model is trained by the following modes: extracting first training data from a training data set according to a first sampling rule, and extracting second training data from the training data set according to a second sampling rule, wherein the training data set comprises first type training data and second type training data, the number of the first type training data is larger than that of the second type training data, the first sampling rule comprises the same probability that each training data is extracted, the second sampling rule comprises the probability that the second type training data is extracted is larger than that of the first type training data, and the first training data, the second training data and the target data are data acquired by acquisition equipment; inputting the first training data into a first branch network included in the target network model, and determining a first characteristic of the first training data; inputting the second training data into a second branch network included in the target network model, and determining a second characteristic of the second training data; fusing the first feature and the second feature based on network parameters of a classification model included in the target network model to obtain a fused feature; determining a target loss value of the target network model based on the fusion characteristic, and iteratively updating target network parameters of the target network model based on the target loss value;
Wherein determining, based on the target feature, a target character corresponding to the target data includes:
determining similarity scores of each feature included in the target features and features of characters included in a feature dictionary, wherein the target features include license plate features, and the license plate features include a fixed-length number sequence;
and determining the target character corresponding to the target data based on the similarity score.
2. The method of claim 1, wherein fusing the first feature and the second feature based on classification network parameters of a classification model included in the target network model, the resulting fused feature comprising:
determining a first weight corresponding to the first branch network and a second weight corresponding to the second branch network;
multiplying the first feature by the first weight to obtain a third feature, and multiplying the second feature by the second weight to obtain a fourth feature;
and fusing the third feature and the fourth feature based on the classified network parameters to obtain the fused feature.
3. The method of claim 2, wherein fusing the third feature and the fourth feature based on the classification network parameters to obtain the fused feature comprises:
Determining a first sub-classification network parameter and a second sub-classification network parameter included in the classification network parameters, wherein the first sub-classification network parameter is a parameter of a first sub-classification network, the first sub-classification network is connected with the first branch network, the second sub-classification network parameter is a parameter of a second sub-classification network, and the second sub-classification network is connected with the second branch network;
multiplying the third feature by the first sub-classification network parameter to obtain a fifth feature, and multiplying the fourth feature by the second sub-classification network parameter to obtain a sixth feature;
and connecting the fifth feature and the sixth feature to obtain the fusion feature.
4. The method of claim 3, wherein determining a target loss value for the target network model based on the fusion feature comprises:
determining a first loss value for the first branched network based on the fifth characteristic;
determining a second loss value for the second branch network based on the sixth characteristic;
a target loss value for the target network model is determined based on the first loss value and the second loss value.
5. The method of claim 1, wherein iteratively updating target network parameters of the target network model based on the target loss values comprises:
determining a first loss value corresponding to the first branch network and a second loss value corresponding to the second branch network, which are included in the target loss value;
and updating a first weight and a second weight included in the target network parameter based on the first loss value and the second loss value, wherein the first weight is a weight corresponding to the first branch network, and the second weight is a weight corresponding to the second branch network.
6. The method of claim 1, wherein iteratively updating target network parameters of the target network model based on the target loss values comprises:
determining a network parameter update gradient based on the target loss value;
and updating the network parameters of the first branch network, the second branch network and the classification model which are included in the target network model according to the network parameter updating gradient.
7. A character determining apparatus, comprising:
the first determining module is used for inputting target data into a target network model and determining target characteristics of the target data;
The second determining module is used for determining target characters corresponding to the target data based on the target characteristics;
the target network model is trained by the following modes: extracting first training data from a training data set according to a first sampling rule, and extracting second training data from the training data set according to a second sampling rule, wherein the training data set comprises first type training data and second type training data, the number of the first type training data is larger than that of the second type training data, the first sampling rule comprises the same probability that each training data is extracted, the second sampling rule comprises the probability that the second type training data is extracted is larger than that of the first type training data, and the first training data, the second training data and the target data are data acquired by acquisition equipment; inputting the first training data into a first branch network included in a target network model, and determining a first characteristic of the first training data; inputting the second training data into a second branch network included in the target network model, and determining a second characteristic of the second training data; fusing the first feature and the second feature based on network parameters of a classification model included in the target network model to obtain a fused feature; determining a target loss value of the target network model based on the fusion characteristic, and iteratively updating target network parameters of the target network model based on the target loss value;
Wherein the second determining module is further configured to: determining similarity scores of each feature included in the target features and features of characters included in a feature dictionary, wherein the target features include license plate features, and the license plate features include a fixed-length number sequence; and determining the target character corresponding to the target data based on the similarity score.
8. A computer readable storage medium, characterized in that a computer program is stored in the computer readable storage medium, wherein the computer program, when being executed by a processor, implements the steps of the method according to any of the claims 1 to 6.
9. An electronic device comprising a memory and a processor, characterized in that the memory has stored therein a computer program, the processor being arranged to run the computer program to perform the method of any of the claims 1 to 6.
CN202211400525.5A 2022-11-09 2022-11-09 Character determining method and device, storage medium and electronic device Active CN115457573B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211400525.5A CN115457573B (en) 2022-11-09 2022-11-09 Character determining method and device, storage medium and electronic device

Publications (2)

Publication Number Publication Date
CN115457573A CN115457573A (en) 2022-12-09
CN115457573B true CN115457573B (en) 2023-04-28

Family

ID=84311574

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211400525.5A Active CN115457573B (en) 2022-11-09 2022-11-09 Character determining method and device, storage medium and electronic device

Country Status (1)

Country Link
CN (1) CN115457573B (en)

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107704857B (en) * 2017-09-25 2020-07-24 北京邮电大学 End-to-end lightweight license plate recognition method and device

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111126224A (en) * 2019-12-17 2020-05-08 成都通甲优博科技有限责任公司 Vehicle detection method and classification recognition model training method
CN112232506A (en) * 2020-09-10 2021-01-15 北京迈格威科技有限公司 Network model training method, image target recognition method, device and electronic equipment
CN114627319A (en) * 2022-05-16 2022-06-14 杭州闪马智擎科技有限公司 Target data reporting method and device, storage medium and electronic device

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Yang Changdong; Yu Ye; Xu Longdao; Fu Yuanzi; Lu Qiang. Fine-grained vehicle model recognition on augmented data based on AT-PGGAN. Journal of Image and Graphics. 2020, (No. 03), full text. *

Also Published As

Publication number Publication date
CN115457573A (en) 2022-12-09

Similar Documents

Publication Publication Date Title
Suganuma et al. Attention-based adaptive selection of operations for image restoration in the presence of unknown combined distortions
CN107766850B (en) Face recognition method based on combination of face attribute information
WO2020073951A1 (en) Method and apparatus for training image recognition model, network device, and storage medium
CN113255694B (en) Training image feature extraction model and method and device for extracting image features
CN113705425B (en) Training method of living body detection model, and method, device and equipment for living body detection
CN107330904A (en) Image processing method, image processing device, electronic equipment and storage medium
CN114723966B (en) Multi-task recognition method, training method, device, electronic equipment and storage medium
CN110321892B (en) Picture screening method and device and electronic equipment
CN114332994A (en) Method for training age prediction model, age detection method and related device
CN110991349A (en) Lightweight vehicle attribute identification method based on metric learning
CN114511042A (en) Model training method and device, storage medium and electronic device
CN115829027A (en) Comparative learning-based federated learning sparse training method and system
CN110321964B (en) Image recognition model updating method and related device
CN111401193A (en) Method and device for obtaining expression recognition model and expression recognition method and device
CN110457704A (en) Determination method, apparatus, storage medium and the electronic device of aiming field
CN115457573B (en) Character determining method and device, storage medium and electronic device
CN109472307A (en) A kind of method and apparatus of training image disaggregated model
CN113128308B (en) Pedestrian detection method, device, equipment and medium in port scene
CN115457329B (en) Training method of image classification model, image classification method and device
CN111611917A (en) Model training method, feature point detection device, feature point detection equipment and storage medium
CN114170484B (en) Picture attribute prediction method and device, electronic equipment and storage medium
CN113822373B (en) Image classification model training method based on integration and knowledge distillation
CN110059742A (en) Safety protector wearing recognition methods and equipment based on deep learning
CN113887630A (en) Image classification method and device, electronic equipment and storage medium
CN111242146A (en) POI information classification based on convolutional neural network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant