CN115457573A - Character determination method and device, storage medium and electronic device - Google Patents


Info

Publication number
CN115457573A
CN115457573A (application CN202211400525.5A; granted as CN115457573B)
Authority
CN
China
Prior art keywords
target
training data
feature
network
determining
Prior art date
Legal status
Granted
Application number
CN202211400525.5A
Other languages
Chinese (zh)
Other versions
CN115457573B (en)
Inventor
赵之健
汪传坤
林亦宁
Current Assignee
Beijing Shanma Zhijian Technology Co ltd
Hangzhou Shanma Zhiqing Technology Co Ltd
Shanghai Supremind Intelligent Technology Co Ltd
Original Assignee
Beijing Shanma Zhijian Technology Co ltd
Hangzhou Shanma Zhiqing Technology Co Ltd
Shanghai Supremind Intelligent Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Beijing Shanma Zhijian Technology Co ltd, Hangzhou Shanma Zhiqing Technology Co Ltd, Shanghai Supremind Intelligent Technology Co Ltd filed Critical Beijing Shanma Zhijian Technology Co ltd
Priority to CN202211400525.5A
Publication of CN115457573A
Application granted
Publication of CN115457573B
Active legal status
Anticipated expiration

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10Character recognition
    • G06V30/19Recognition using electronic means
    • G06V30/191Design or setup of recognition systems or techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
    • G06V30/19147Obtaining sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/80Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • G06V10/806Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of extracted features
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/60Type of objects
    • G06V20/62Text, e.g. of license plates, overlay texts or captions on TV images
    • G06V20/625License plates
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10Character recognition
    • G06V30/19Recognition using electronic means
    • G06V30/191Design or setup of recognition systems or techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
    • G06V30/19173Classification techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10Character recognition
    • G06V30/19Recognition using electronic means
    • G06V30/191Design or setup of recognition systems or techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
    • G06V30/1918Fusion techniques, i.e. combining data from various sources, e.g. sensor fusion
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Image Analysis (AREA)

Abstract

The embodiment of the invention provides a character determination method, a character determination device, a storage medium, and an electronic device. The method comprises the following steps: extracting first training data from a training data set according to a first sampling rule, and extracting second training data from the training data set according to a second sampling rule; inputting the first training data into a first branch network included in a target network model, and determining first features of the first training data; inputting the second training data into a second branch network included in the target network model, and determining second features of the second training data; fusing the first features and the second features based on network parameters of a classification model included in the target network model to obtain fused features; and determining a target loss value of the target network model based on the fused features, and iteratively updating target network parameters of the target network model based on the target loss value. The method and device improve the accuracy of character determination.

Description

Character determination method and device, storage medium and electronic device
Technical Field
Embodiments of the invention relate to the field of communication, and in particular to a character determination method and device, a storage medium, and an electronic device.
Background
With the development of convolutional neural networks, the field of image classification and recognition has made great progress. This progress is inseparable from the availability of high-quality large-scale datasets. Common large-scale datasets such as ImageNet, COCO, and the Places Database are classical classification datasets characterized by a substantially uniform distribution of class labels; in real-world datasets, however, the distribution generally exhibits a long-tail effect.
In license plate recognition data, the long-tail distribution is particularly pronounced, because the varying difficulty of data acquisition across provinces often leads to a severely non-uniform data distribution. For example, pictures from provinces such as A, B, and C are relatively easy to acquire, so each of those provinces contributes many pictures, whereas license plate images from regions such as D, E, and F are relatively difficult to acquire, so those provinces contribute comparatively few. This extreme imbalance in the number of license plates across provinces has the following consequence: recognition accuracy is higher for provinces with more license plate data and lower for provinces with less data, which degrades overall recognition accuracy.
The long-tail effect means that a few classes (head classes) account for the vast majority of the samples, while most classes (tail classes) have only a very small number of samples. Existing approaches to mitigating the long-tail effect often suffer from problems such as data loss, difficult subsequent processing, and high demands on model computing power.
Therefore, the related art has the problem that characters are determined inaccurately by a model because of the long-tail effect in the model training data.
In view of the above problems in the related art, no effective solution has yet been proposed.
Disclosure of Invention
The embodiment of the invention provides a character determination method and device, a storage medium, and an electronic device, so as to at least solve the problem in the related art that characters are determined inaccurately by a model because of the long-tail effect in the model training data.
According to an embodiment of the present invention, there is provided a character determination method including: inputting target data into a target network model, and determining target characteristics of the target data; determining a target character corresponding to the target data based on the target feature; wherein the target network model is trained by: extracting first training data from a training data set according to a first sampling rule, and extracting second training data from the training data set according to a second sampling rule, wherein the training data set comprises training data of a first type and training data of a second type, the number of the training data of the first type is greater than that of the training data of the second type, the first sampling rule comprises that the probability of each training data being extracted is the same, the second sampling rule comprises that the probability of the training data of the second type being extracted is greater than that of the training data of the first type, and the first training data, the second training data and the target data are data collected by a collecting device; inputting the first training data into a first branch network included in a target network model, and determining a first feature of the first training data; inputting the second training data into a second branch network included in the target network model, and determining a second feature of the second training data; fusing the first feature and the second feature based on network parameters of a classification model included in the target network model to obtain a fused feature; determining a target loss value of the target network model based on the fusion characteristics, and iteratively updating target network parameters of the target network model based on the target loss value.
According to another embodiment of the present invention, there is provided a character determination apparatus including: the first determining module is used for inputting target data into a target network model and determining target characteristics of the target data; the second determination module is used for determining a target character corresponding to the target data based on the target characteristic; wherein the target network model is trained by: extracting first training data from a training data set according to a first sampling rule, and extracting second training data from the training data set according to a second sampling rule, wherein the training data set comprises training data of a first type and training data of a second type, the number of the training data of the first type is greater than that of the training data of the second type, the first sampling rule comprises that the probability of each training data being extracted is the same, the second sampling rule comprises that the probability of the training data of the second type being extracted is greater than that of the training data of the first type, and the first training data, the second training data and the target data are data collected by a collecting device; inputting the first training data into a first branch network included in a target network model, and determining a first feature of the first training data; inputting the second training data into a second branch network included in the target network model, and determining a second feature of the second training data; fusing the first feature and the second feature based on network parameters of a classification model included in the target network model to obtain a fused feature; determining a target loss value of the target network model based on the fusion characteristics, and iteratively updating target network parameters of the target network model based on the target loss value.
According to yet another embodiment of the invention, there is also provided a computer-readable storage medium having a computer program stored therein, wherein the computer program, when executed by a processor, implements the steps of the method as set forth in any of the above.
According to yet another embodiment of the present invention, there is also provided an electronic device, including a memory in which a computer program is stored and a processor configured to execute the computer program to perform the steps in any of the above method embodiments.
According to the invention, the target data are input into the target network model, the target features of the target data are determined, and the target character corresponding to the target data is determined according to the target features, wherein the target network model is trained in the following way: first training data are extracted from the training data set according to a first sampling rule, and second training data are extracted from the training data set according to a second sampling rule. The first training data are input into a first branch network included in the target network model to determine first features of the first training data, and the second training data are input into a second branch network included in the target network model to determine second features of the second training data. The first features and the second features are fused according to the network parameters of the classification model included in the target network model to obtain fused features, a target loss value of the target network model is determined according to the fused features, and the target network parameters of the target network model are iteratively updated according to the target loss value. During model training, training data extracted according to different sampling rules are respectively input into different branch networks, the features output by each branch network are determined, the first features and the second features are fused according to the network parameters of the classification model, and the target loss value is determined according to the fused features, so that both the first type of training data and the second type of training data are taken into account, which improves the precision of the trained model.
Therefore, the problem that the character is determined inaccurately through the model due to the long tail effect of the model training data in the related technology can be solved, and the effect of improving the accuracy of character determination is achieved.
Drawings
Fig. 1 is a block diagram of the hardware structure of a mobile terminal running a character determination method according to an embodiment of the present invention;
FIG. 2 is a flow chart of a method of determining a character according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of the long tail effect according to an exemplary embodiment of the present invention;
FIG. 4 is a flow chart of a method of model training in a method of character determination in accordance with a specific embodiment of the present invention;
fig. 5 is a block diagram of a structure of a character determination apparatus according to an embodiment of the present invention.
Detailed Description
Hereinafter, embodiments of the present invention will be described in detail with reference to the accompanying drawings in conjunction with the embodiments.
It should be noted that the terms "first," "second," and the like in the description and claims of the present invention and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order.
The method embodiments provided in the embodiments of the present application may be executed in a mobile terminal, a computer terminal, or a similar computing device. Taking a mobile terminal as an example, fig. 1 is a block diagram of the hardware structure of a mobile terminal running a character determination method according to an embodiment of the present invention. As shown in fig. 1, the mobile terminal may include one or more processors 102 (only one is shown in fig. 1; the processor 102 may include, but is not limited to, a processing device such as a microprocessor (MCU) or a programmable logic device (FPGA)) and a memory 104 for storing data, and may further include a transmission device 106 for communication functions and an input-output device 108. Those of ordinary skill in the art will understand that the structure shown in fig. 1 is only an illustration and does not limit the structure of the mobile terminal. For example, the mobile terminal may include more or fewer components than shown in fig. 1, or have a different configuration from that shown in fig. 1.
The memory 104 may be used to store a computer program, for example, a software program and a module of an application software, such as a computer program corresponding to the character determination method in the embodiment of the present invention, and the processor 102 executes various functional applications and data processing by running the computer program stored in the memory 104, so as to implement the method described above. The memory 104 may include high speed random access memory, and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory 104 may further include memory located remotely from the processor 102, which may be connected to the mobile terminal over a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The transmission device 106 is used to receive or transmit data via a network. Specific examples of the network described above may include a wireless network provided by a communication provider of the mobile terminal. In one example, the transmission device 106 includes a Network adapter (NIC) that can be connected to other Network devices through a base station to communicate with the internet. In one example, the transmission device 106 may be a Radio Frequency (RF) module, which is used to communicate with the internet in a wireless manner.
In this embodiment, a method for determining a character is provided, and fig. 2 is a flowchart of a method for determining a character according to an embodiment of the present invention, as shown in fig. 2, the flowchart includes the following steps:
step S202, inputting target data into a target network model, and determining target characteristics of the target data;
step S204, determining a target character corresponding to the target data based on the target feature;
wherein the target network model is trained by: extracting first training data from a training data set according to a first sampling rule, and extracting second training data from the training data set according to a second sampling rule, wherein the training data set comprises a first type of training data and a second type of training data, the number of the first type of training data is greater than that of the second type of training data, the first sampling rule comprises that the probability of each training data being extracted is the same, the second sampling rule comprises that the probability of the second type of training data being extracted is greater than that of the first type of training data, and the first training data, the second training data and the target data are data collected by a collecting device; inputting the first training data into a first branch network included in a target network model, and determining a first feature of the first training data; inputting the second training data into a second branch network included in the target network model, and determining a second feature of the second training data; fusing the first feature and the second feature based on network parameters of a classification model included in the target network model to obtain a fused feature; determining a target loss value of the target network model based on the fusion characteristics, and iteratively updating target network parameters of the target network model based on the target loss value.
In the above embodiment, the target character corresponding to the target data may be identified by using the target network model. The target data may be images, video frames, etc. captured by a capturing device, such as a camera device, a monitoring device, etc. The target character may be a character included in the target data. When the target data is a license plate image, the target character may be a license plate number.
In the above embodiment, the training data set includes a first type of training data and a second type of training data. Wherein the number of training data of the first type is greater than the number of training data of the second type. The first type of training data is head data and the second type of training data is tail data. Thus, long tail effects are present in the training dataset. The long tail effect is schematically shown in FIG. 3.
In the above embodiment, the first type of training data may be data acquired in an area where the number of the acquisition devices is large, and the second type of training data may be data acquired in an area where the number of the acquisition devices is small. When the target data is an image, each type of training data includes an image and a label character of the image. During training, loss values can be determined according to the predicted characters and the label characters determined by the network model, and parameters of the network model are updated iteratively according to the loss values.
In the above embodiment, the first training data may be extracted according to the first sampling rule, and the second training data may be extracted according to the second sampling rule. The first sampling rule may be normal sampling, that is, uniform sampling of the picture data according to the original distribution of the training sample data: each sample in the training set is sampled only once, with the same probability. A batch of uniformly sampled data retains the sample distribution of the original data set, which benefits overall feature representation learning. The second sampling rule may be reverse sampling, in which the sampling probability of each class is proportional to the reciprocal of its sample size: the larger the sample size of a class, the less likely it is to be sampled, and conversely, the smaller the sample size of a class, the more likely it is to be sampled. Through this sampling mode, the network pays more attention to the features of tail-class data, so it performs better on the tail classes.
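The two sampling rules described above can be sketched as follows. This is a minimal illustration; the function names and the use of Python's `random` module are assumptions for demonstration, not part of the patent:

```python
import random
from collections import Counter

def uniform_sample(indices, k):
    # First sampling rule: every training sample is drawn with the same
    # probability, preserving the original (long-tailed) distribution.
    return random.sample(indices, k)

def reverse_sample(indices, labels, k):
    # Second sampling rule: each class's sampling probability is proportional
    # to the reciprocal of its sample count, so tail-class samples are drawn
    # more often than head-class samples.
    counts = Counter(labels)
    weights = [1.0 / counts[labels[i]] for i in indices]
    return random.choices(indices, weights=weights, k=k)
```

With 90 head-class and 10 tail-class samples, reverse sampling draws the tail class far above its 10% share, which is exactly the rebalancing effect the second branch relies on.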
In the above embodiment, the target network model may include two branch networks, i.e., a first branch network and a second branch network. The first branch network and the second branch network have the same structure and may use the same residual network to extract features, which facilitates fusing the features during cumulative learning. For the non-uniformly distributed head-class and tail-class data, a convolution learning branch and a rebalancing branch are adopted, respectively. The first branch network may be the convolution learning branch, which collects training pictures by normal sampling. The normally sampled pictures are fed into the convolutional neural network of the convolution learning branch for training to obtain a feature vector f_c, i.e., the first feature. The second branch network may be the rebalancing branch, which collects training pictures by reverse sampling. The reversely sampled pictures are fed into the convolutional neural network of the rebalancing branch for training to obtain a feature vector f_r, i.e., the second feature. The first feature and the second feature may be feature vectors extracted by the convolutional neural networks, such as three-dimensional tensors expressed as (Height, Width, Channel).
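The two-branch feature extraction step can be sketched as below. The linear-plus-ReLU `branch_forward` is merely a stand-in for the residual CNN used in each branch; the dimensions, parameter shapes, and NumPy implementation are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def branch_forward(x, params):
    # Stand-in for the residual CNN inside each branch: maps a batch of
    # pictures to feature vectors. A single linear layer with ReLU is used
    # here purely for brevity; the patent uses identical residual networks.
    flat = x.reshape(len(x), -1)
    return np.maximum(flat @ params, 0.0)

in_dim, feat_dim = 32 * 32 * 3, 64
# Same architecture, independent parameters for the two branches.
params_conv = rng.normal(0.0, 0.01, (in_dim, feat_dim))       # convolution learning branch
params_rebalance = rng.normal(0.0, 0.01, (in_dim, feat_dim))  # rebalancing branch

x_normal = rng.normal(size=(8, 32, 32, 3))   # normally sampled pictures
x_reverse = rng.normal(size=(8, 32, 32, 3))  # reversely sampled pictures

f_c = branch_forward(x_normal, params_conv)        # first feature
f_r = branch_forward(x_reverse, params_rebalance)  # second feature
```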
After the first feature and the second feature are obtained, the first feature and the second feature may be fused according to network parameters of a classification model included in the target network model to obtain a fused feature. And determining a target loss value according to the fusion characteristics, and iteratively updating the target network parameters of the target network model by using the target loss value.
Optionally, the execution subject of the above steps may be a background processor or another device with similar processing capability, or a machine integrating at least a data processing device, where the data processing device may include a terminal such as a computer or a mobile phone, but is not limited thereto.
According to the invention, target data are input into a target network model, target features of the target data are determined, and the target character corresponding to the target data is determined according to the target features, wherein the target network model is trained in the following way: first training data are extracted from the training data set according to a first sampling rule, and second training data are extracted from the training data set according to a second sampling rule. The first training data are input into a first branch network included in the target network model to determine first features of the first training data, and the second training data are input into a second branch network included in the target network model to determine second features of the second training data. The first features and the second features are fused according to the network parameters of the classification model included in the target network model to obtain fused features, a target loss value of the target network model is determined according to the fused features, and the target network parameters of the target network model are iteratively updated according to the target loss value. During model training, training data extracted according to different sampling rules are respectively input into different branch networks, the features output by each branch network are determined, the first features and the second features are fused according to the network parameters of the classification model, and the target loss value is determined according to the fused features, so that both the first type of training data and the second type of training data are taken into account, which improves the precision of the trained model.
Therefore, the problem that the character is determined inaccurately through the model due to the long tail effect of the model training data in the related technology can be solved, and the effect of improving the accuracy of character determination is achieved.
In an exemplary embodiment, fusing the first feature and the second feature based on the classification network parameters of the classification model included in the target network model to obtain a fused feature includes: determining a first weight corresponding to the first branch network and a second weight corresponding to the second branch network; multiplying the first feature by the first weight to obtain a third feature, and multiplying the second feature by the second weight to obtain a fourth feature; and fusing the third feature and the fourth feature based on the classification network parameters to obtain the fused feature. In this embodiment, when the first feature and the second feature are fused, a learnable adaptive trade-off parameter is introduced to control the feature vectors obtained by the branches. For example, a first weight corresponding to the first branch network and a second weight corresponding to the second branch network may be determined; the first feature is adjusted by the first weight to obtain the third feature, the second feature is adjusted by the second weight to obtain the fourth feature, and the third feature and the fourth feature are then fused to obtain the fused feature. That is, a learnable adaptive trade-off parameter α may be initialized. α is multiplied by the first feature f_c to obtain the weighted convolution-learning-branch feature vector αf_c, and (1-α) is multiplied by f_r to obtain the weighted rebalancing-branch feature vector (1-α)f_r. Then αf_c and (1-α)f_r are fused.
It should be noted that α is the first weight and 1-α is the second weight; they control the weighting of the feature vectors f_c and f_r obtained by the convolution-learning branch and the rebalancing branch, respectively. The parameter α is greater than 0 and less than 1.
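As a minimal sketch, the weighting step above can be written as follows (NumPy; a fixed α stands in for the learnable parameter, and the function name and shapes are illustrative assumptions):

```python
import numpy as np

def weight_branch_features(f_c, f_r, alpha):
    """Scale the two branch feature vectors by the compromise parameter.

    alpha must lie in (0, 1); in training it would be a learnable
    parameter, but it is fixed here for illustration.
    """
    assert 0.0 < alpha < 1.0, "compromise parameter must be in (0, 1)"
    third = alpha * f_c            # weighted convolution-learning feature
    fourth = (1.0 - alpha) * f_r   # weighted rebalancing feature
    return third, fourth
```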
In an exemplary embodiment, fusing the third feature and the fourth feature based on the classification network parameters to obtain the fused feature includes: determining a first sub-classification network parameter and a second sub-classification network parameter included in the classification network parameters, wherein the first sub-classification network parameter is a parameter of a first sub-classification network connected to the first branch network, and the second sub-classification network parameter is a parameter of a second sub-classification network connected to the second branch network; multiplying the third feature by the first sub-classification network parameter to obtain a fifth feature, and multiplying the fourth feature by the second sub-classification network parameter to obtain a sixth feature; and combining the fifth feature and the sixth feature to obtain the fused feature. In this embodiment, the first sub-classification network parameter may be denoted W_c and the second sub-classification network parameter W_r; the first feature may be denoted f_c and the second feature f_r. Assuming the first weight is α, the second weight may be expressed as 1-α, the third feature as α·f_c, and the fourth feature as (1-α)·f_r. The weighted feature vectors α·f_c and (1-α)·f_r are sent to the classifiers W_c and W_r of the corresponding branches, so the fifth feature may be expressed as α·W_c·f_c and the sixth feature as (1-α)·W_r·f_r. The outputs of the two branches are finally fused by element-wise addition, so the fused feature may be expressed as z = α·W_c·f_c + (1-α)·W_r·f_r. The dimensions of the first and second sub-classification network parameters are consistent with those of the first and second features.
In the above embodiment, the cumulative learning part multiplies the features obtained by the two branches by the compromise parameters α and (1-α), sends them to the classifiers W_c and W_r for parameter re-fitting, performs the fusion operation, and finally applies the Softmax function for normalization. This series of operations is a parameter relearning process, which facilitates the fitting of the network parameters and allows the network to converge as quickly as possible.
In one exemplary embodiment, determining the target loss value of the target network model based on the fused feature includes: determining a first loss value of the first branch network based on the fifth feature; determining a second loss value of the second branch network based on the sixth feature; and determining the target loss value of the target network model based on the first loss value and the second loss value. In this embodiment, when determining the target loss value, the first loss value of the first branch network may be determined according to the fifth feature, for example according to the fifth feature and its corresponding ground truth. The second loss value may likewise be determined according to the sixth feature and its corresponding ground truth, and the target loss value is then determined from the first loss value and the second loss value. For example, the weights of the first branch network and the second branch network may be determined, and the target loss value computed as the weighted sum of the two loss values.
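The weighted combination of the per-branch losses can be sketched as follows (NumPy; the use of softmax cross-entropy and of α and 1-α as the loss weights are assumptions consistent with the fusion weights described above):

```python
import numpy as np

def cross_entropy(logits, label):
    # numerically stable softmax cross-entropy for a single sample
    e = np.exp(logits - logits.max())
    p = e / e.sum()
    return -np.log(p[label])

def target_loss(logits_c, logits_r, label, alpha):
    """Combine the convolution-learning branch loss (from the fifth
    feature) and the rebalancing branch loss (from the sixth feature)
    with the compromise weights alpha and 1 - alpha."""
    loss_c = cross_entropy(logits_c, label)   # first loss value
    loss_r = cross_entropy(logits_r, label)   # second loss value
    return alpha * loss_c + (1.0 - alpha) * loss_r
```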
In one exemplary embodiment, iteratively updating the target network parameters of the target network model based on the target loss value includes: determining a first loss value corresponding to the first branch network and a second loss value corresponding to the second branch network that are included in the target loss value; and updating a first weight and a second weight included in the target network parameters based on the first loss value and the second loss value, wherein the first weight is the weight corresponding to the first branch network and the second weight is the weight corresponding to the second branch network. In this embodiment, the weight parameters, i.e. the first weight and the second weight, are adjusted automatically according to the network loss rather than simply decayed as the number of training iterations increases. The output weighting of the two branches is controlled through the weights of the classification network, so that the focus of network learning is shifted between the two branches during training, achieving the effect of accurately training the target network.
In one exemplary embodiment, iteratively updating the target network parameters of the target network model based on the target loss value includes: determining a network parameter update gradient based on the target loss value; and updating the network parameters of the first branch network, the second branch network, and the classification model included in the target network model according to the network parameter update gradient. In this embodiment, after the target loss value is determined, the network parameters of the first branch network, the second branch network, and the classification model may be updated according to the gradient derived from the target loss value. After each update, training continues and a new target loss value is determined; when the target loss value falls below a preset loss value, or the number of training iterations reaches a preset count, training stops and the final target network parameters are obtained.
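The update-until-converged loop can be sketched as follows (pure Python; `step_fn` stands in for one forward pass plus gradient update, and the threshold names are illustrative assumptions):

```python
def train_until_done(model, step_fn, max_steps=100, loss_threshold=0.01):
    """Repeat gradient updates; stop when the target loss value falls
    below the preset loss value or the preset step count is reached."""
    loss = float("inf")
    for step in range(max_steps):
        loss = step_fn(model)  # compute target loss and update parameters
        if loss < loss_threshold:
            break
    return model, loss
```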
In an exemplary embodiment, determining the target character corresponding to the target data based on the target feature includes: determining a similarity score between each feature included in the target features and the features of the characters included in a feature dictionary table; and determining the target character corresponding to the target data based on the similarity scores. In this embodiment, when the target data is a license plate image, the plate generally contains a combination of 7 characters (or 8 for new-energy vehicles) of letters and digits; to ensure full coverage, the sequence length is generally set to 10, and the license plate recognition feature representation finally yields 10 score sequences, each of length 71. The license plate recognition dictionary includes: "Wanhuhujin Yuji jin Meng Liao Ji Hehesu Zhe Beijing Min Jiangyu Yu Gui Qiongchuan Guiyun Zangshan Gan Qingning New policeman hanging emergency ABCDEFGGHJKLMNPQRSTUVWXYZ 0123456789-", where "-" denotes the blank character; the dictionary length is 71, consistent with the length of each score sequence in the feature representation. Each of the 10 length-71 score sequences gives a score for each character in the dictionary, and the highest-scoring character is taken as the recognized character at that position. This yields, for example, the result "XXXXXXXX- - -": mapping the 10 length-71 score sequences onto the dictionary gives the plate number, and after post-processing the final plate is "XXXXXXXX".
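The per-position highest-score decoding described above can be sketched like this (NumPy; the shortened dictionary is a stand-in — the real 71-entry dictionary also contains the province-abbreviation characters):

```python
import numpy as np

# Illustrative stand-in dictionary; '-' is the blank padding character.
PLATE_DICT = list("ABCDEFGHJKLMNPQRSTUVWXYZ0123456789-")

def decode_plate(scores, dictionary=PLATE_DICT):
    """scores: (sequence_length, len(dictionary)) array, one score
    sequence per plate position. Take the highest-scoring character at
    each position, then drop the blank padding in post-processing."""
    best = scores.argmax(axis=1)
    raw = "".join(dictionary[i] for i in best)
    return raw.replace("-", "")
```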
The following describes a method for determining a character in accordance with a specific embodiment:
Fig. 4 is a flowchart of the model training method in the character determination method according to an embodiment of the present invention. As shown in fig. 4, the target network model includes a bilateral branch network (a convolution-learning branch corresponding to the first branch network and a rebalancing branch corresponding to the second branch network) for solving the long-tail distribution problem in license plate recognition data. For license plate data in which the head and tail classes are unevenly distributed, two branches are provided: a convolution-learning branch and a rebalancing branch. The two branches have the same network structure and use the same residual network structure for feature extraction. Debugger: a learnable adaptive compromise parameter α (corresponding to the first weight described above) is introduced to control the feature vectors produced by the branches. Cumulative learning strategy: the output weighting of the two branches is controlled by the debugger parameter, shifting the focus of network learning between the two branches during the training phase.
(1) Normal sampling means that picture data are sampled uniformly according to the original distribution of the training samples, i.e. each sample in the training set is sampled exactly once with equal probability. A uniformly sampled batch retains the sample distribution characteristics of the original data set, which benefits overall feature representation learning.
(2) Inverse sampling means that the sampling probability of each class is proportional to the reciprocal of its sample count: the larger a class's sample count, the lower its probability of being sampled, and conversely, the smaller the sample count, the higher the probability of being sampled. With this sampling mode, the network pays more attention to the features of the tail-class license plate data and performs better on the tail classes.
(3) License plate recognition features: a plate generally contains a combination of 7 characters (or 8 for new-energy vehicles) of letters and digits; to ensure full coverage, the sequence length is generally set to 10, and the license plate recognition feature representation finally yields 10 score sequences, each of length 71. The license plate recognition dictionary includes: "Wanhuhujin Yuji Meng Liao Ji Hehesu Zhe Beijing Min Jianlu Yu Gui Qiongchuan Guiyun Zang shan Gan Qingning New policeman hanging emergency ABCDEFGGHJKLMNPQRSTUVWXYZ 0123456789-", where "-" denotes the blank character; the dictionary length is 71, consistent with the length of each score sequence in the feature representation. Each of the 10 length-71 score sequences gives a score for each character in the dictionary, and the highest-scoring character is taken as the recognized character at that position. This yields, for example, the result "XXXXXXXX- - -": mapping the 10 length-71 score sequences onto the dictionary gives the plate number, and after post-processing the final plate is "XXXXXXXX".
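The inverse sampling described in (2) above can be sketched as follows (pure Python; the normalization into per-sample draw probabilities is an assumption about how the rule would be realized):

```python
from collections import Counter

def inverse_sampling_probs(labels):
    """Per-sample draw probability proportional to the reciprocal of the
    sample's class count, so tail classes are sampled more often."""
    counts = Counter(labels)
    weights = [1.0 / counts[y] for y in labels]
    total = sum(weights)
    return [w / total for w in weights]
```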
Here, f_c and f_r are the feature vectors extracted by the convolutional neural networks, usually three-dimensional (Height, Width, Channel) matrix vectors. α is the adaptive compromise parameter used to control the weighting of the feature vectors f_c and f_r obtained from the convolution-learning branch and the rebalancing branch; it is greater than 0 and less than 1. W_c and W_r are learnable network parameters in the classifiers, whose dimensions are consistent with those of f_c and f_r. The Softmax function, also called the normalized exponential function, expresses the multi-class result in probability form and is used to convert the network features into license plate recognition classification features. p is the license plate recognition feature sequence.
The convolution-learning branch obtains training pictures by normal sampling. The normally sampled pictures are fed into the convolutional neural network of the convolution-learning branch for training, yielding the feature vector f_c. The rebalancing branch obtains training pictures by inverse sampling. The inversely sampled pictures are fed into the convolutional neural network of the rebalancing branch for training, yielding the feature vector f_r. The debugger initializes a learnable adaptive compromise parameter α. α is multiplied by f_c to obtain the weighted convolution-learning branch feature vector α·f_c, and (1-α) is multiplied by f_r to obtain the weighted rebalancing branch feature vector (1-α)·f_r. Cumulative learning: the weighted feature vectors α·f_c and (1-α)·f_r are sent to the classifiers W_c and W_r of the corresponding branches for classification feature learning. The outputs of the two branches are finally fused by element-wise addition, so the fused feature may be expressed as z = α·W_c·f_c + (1-α)·W_r·f_r. The fused feature is processed by the Softmax function to obtain the final license plate recognition feature sequence p, and license plate post-processing on p yields the final recognition result.
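The cumulative-learning fusion just described, z = α·W_c·f_c + (1-α)·W_r·f_r followed by Softmax, can be sketched as follows (NumPy; the vector and matrix shapes are illustrative):

```python
import numpy as np

def softmax(z):
    # numerically stable normalized exponential
    e = np.exp(z - z.max())
    return e / e.sum()

def cumulative_fusion(f_c, f_r, W_c, W_r, alpha):
    """Weight each branch feature, apply its classifier, fuse by
    element-wise addition, then normalize with Softmax."""
    z = alpha * (W_c @ f_c) + (1.0 - alpha) * (W_r @ f_r)
    return softmax(z)
```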
In the foregoing embodiment, the convolution-learning branch learns the sample distribution characteristics of the original data set, so its result represents global features, while the rebalancing branch focuses on the features of the tail-class license plate data, so its result better represents the tail classes. The final result combines the outputs of the two sides: the compromise parameter α controls the contribution of each branch, weighting the influence of the two features on the final result. This effectively solves the problem of low license plate recognition accuracy caused by the long-tail distribution in license plate recognition data.
The bilateral branch network learns the uniform data sample features and the tail-class features separately and finally fuses them through cumulative learning, so that the accuracy on the head classes is not sacrificed while the data distribution of the tail classes is still well accounted for, improving the final accuracy. That is, the two branches first learn their respective features independently, and the cumulative learning strategy then aggregates the two kinds of features and learns from them jointly. This improves the generalization ability of the network on the one hand, and on the other hand avoids the difficulty of fitting the network parameters that direct addition would cause.
Compared with a reassignment strategy, this method does not require labeling a large number of head and tail data categories, has fewer network parameters, is simple and easy to tune, and scales easily to large sample data sets.
Through the above description of the embodiments, those skilled in the art can clearly understand that the method according to the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but the former is a better implementation mode in many cases. Based on such understanding, the technical solutions of the present invention may be embodied in the form of a software product, which is stored in a storage medium (e.g., ROM/RAM, magnetic disk, optical disk) and includes instructions for enabling a terminal device (e.g., a mobile phone, a computer, a server, or a network device) to execute the method according to the embodiments of the present invention.
In this embodiment, a device for determining characters is also provided. The device is used to implement the foregoing embodiments and preferred implementations, and details already described are not repeated. As used below, the term "module" may be a combination of software and/or hardware that implements a predetermined function. Although the devices described in the following embodiments are preferably implemented in software, implementation in hardware, or a combination of software and hardware, is also possible and contemplated.
Fig. 5 is a block diagram showing a structure of a character determination apparatus according to an embodiment of the present invention, and as shown in fig. 5, the apparatus includes:
a first determining module 52, configured to input target data into a target network model, and determine a target feature of the target data;
a second determining module 54, configured to determine, based on the target feature, a target character corresponding to the target data;
wherein the target network model is trained by: the training data collection device is used for extracting first training data from a training data set according to a first sampling rule and extracting second training data from the training data set according to a second sampling rule, the training data set comprises a first type of training data and a second type of training data, the number of the first type of training data is larger than that of the second type of training data, the first sampling rule comprises that the probability of each training data being extracted is the same, the second sampling rule comprises that the probability of the second type of training data being extracted is larger than that of the first type of training data, and the first training data, the second training data and the target data are collected by a collection device; inputting the first training data into a first branch network included in a target network model, and determining a first feature of the first training data; inputting the second training data into a second branch network included in the target network model, and determining a second feature of the second training data; fusing the first feature and the second feature based on network parameters of a classification model included in the target network model to obtain a fused feature; determining a target loss value of the target network model based on the fusion characteristics, and iteratively updating target network parameters of the target network model based on the target loss value.
In an exemplary embodiment, the apparatus may implement fusing the first feature and the second feature based on classification network parameters of a classification model included in the target network model to obtain a fused feature by: determining a first weight corresponding to the first branch network and a second weight corresponding to the second branch network; multiplying the first feature by the first weight to obtain a third feature, and multiplying the second feature by the second weight to obtain a fourth feature; and fusing the third feature and the fourth feature based on the classification network parameters to obtain the fused feature.
In an exemplary embodiment, the apparatus may implement fusing the third feature and the fourth feature based on the classification network parameter to obtain the fused feature by: determining a first sub-classification network parameter and a second sub-classification network parameter included in the classification network parameter, wherein the first sub-classification network parameter is a parameter of a first sub-classification network, the first sub-classification network is connected with the first branch network, the second sub-classification network parameter is a parameter of a second sub-classification network, and the second sub-classification network is connected with the second branch network; multiplying the third characteristic by the first sub-classification network parameter to obtain a fifth characteristic, and multiplying the fourth characteristic by the second sub-classification network parameter to obtain a sixth characteristic; and connecting the fifth feature and the sixth feature to obtain the fused feature.
In an exemplary embodiment, the apparatus may determine the target loss value of the target network model based on the fusion feature by: determining a first loss value for the first branch network based on the fifth characteristic; determining a second loss value for the second branch network based on the sixth characteristic; determining a target loss value for the target network model based on the first loss value and the second loss value.
In an exemplary embodiment, the apparatus may enable iteratively updating the target network parameter of the target network model based on the target loss value by: determining a first loss value corresponding to the first branch network and a second loss value corresponding to the second branch network, which are included in the target loss value; updating a first weight and a second weight included in the target network parameter based on the first loss value and the second loss value, wherein the first weight is a weight corresponding to the first branch network, and the second weight is a weight corresponding to the second branch network.
In an exemplary embodiment, the apparatus may enable iteratively updating the target network parameter of the target network model based on the target loss value by: determining a network parameter update gradient based on the target loss value; and updating the network parameters of the first branch network, the second branch network and the classification model included in the target network model according to the network parameter updating gradient.
In an exemplary embodiment, the second determination module 54 may determine the target character corresponding to the target data based on the target feature by: determining a similarity score for each feature included in the target features and features of characters included in a feature dictionary table; determining the target character corresponding to the target data based on the similarity score.
It should be noted that, the above modules may be implemented by software or hardware, and for the latter, the following may be implemented, but not limited to: the modules are all positioned in the same processor; alternatively, the modules are respectively located in different processors in any combination.
Embodiments of the present invention also provide a computer-readable storage medium having a computer program stored thereon, wherein the computer program, when executed by a processor, implements the steps of the method as set forth in any of the above.
In an exemplary embodiment, the computer readable storage medium may include, but is not limited to: various media capable of storing computer programs, such as a usb disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a removable hard disk, a magnetic disk, or an optical disk.
Embodiments of the present invention also provide an electronic device comprising a memory having a computer program stored therein and a processor arranged to run the computer program to perform the steps of any of the above method embodiments.
In an exemplary embodiment, the electronic apparatus may further include a transmission device and an input/output device, wherein the transmission device is connected to the processor, and the input/output device is connected to the processor.
For specific examples in this embodiment, reference may be made to the examples described in the foregoing embodiments and exemplary implementations, and details of this embodiment are not repeated herein.
It will be apparent to those skilled in the art that the various modules or steps of the invention described above may be implemented using a general purpose computing device, they may be centralized on a single computing device or distributed across a network of computing devices, and they may be implemented using program code executable by the computing devices, such that they may be stored in a memory device and executed by the computing device, and in some cases, the steps shown or described may be performed in an order different than that described herein, or they may be separately fabricated into various integrated circuit modules, or multiple ones of them may be fabricated into a single integrated circuit module. Thus, the present invention is not limited to any specific combination of hardware and software.
The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, or improvement made within the principle of the present invention should be included in the protection scope of the present invention.

Claims (10)

1. A method for determining characters, comprising:
inputting target data into a target network model, and determining target characteristics of the target data;
determining a target character corresponding to the target data based on the target feature;
wherein the target network model is trained by: extracting first training data from a training data set according to a first sampling rule, and extracting second training data from the training data set according to a second sampling rule, wherein the training data set comprises training data of a first type and training data of a second type, the number of the training data of the first type is greater than that of the training data of the second type, the first sampling rule comprises that the probability of each training data being extracted is the same, the second sampling rule comprises that the probability of the training data of the second type being extracted is greater than that of the training data of the first type, and the first training data, the second training data and the target data are data collected by a collecting device; inputting the first training data into a first branch network included in the target network model, and determining a first feature of the first training data; inputting the second training data into a second branch network included in the target network model, and determining a second feature of the second training data; fusing the first feature and the second feature based on network parameters of a classification model included in the target network model to obtain a fused feature; determining a target loss value of the target network model based on the fusion characteristics, and iteratively updating target network parameters of the target network model based on the target loss value.
2. The method of claim 1, wherein fusing the first feature and the second feature based on classification network parameters of a classification model included in the target network model to obtain a fused feature comprises:
determining a first weight corresponding to the first branch network and a second weight corresponding to the second branch network;
multiplying the first feature by the first weight to obtain a third feature, and multiplying the second feature by the second weight to obtain a fourth feature;
and fusing the third feature and the fourth feature based on the classification network parameters to obtain the fused feature.
3. The method of claim 2, wherein fusing the third feature and the fourth feature based on the classification network parameters to obtain the fused feature comprises:
determining a first sub-classification network parameter and a second sub-classification network parameter included in the classification network parameter, wherein the first sub-classification network parameter is a parameter of a first sub-classification network, the first sub-classification network is connected with the first branch network, the second sub-classification network parameter is a parameter of a second sub-classification network, and the second sub-classification network is connected with the second branch network;
multiplying the third feature by the first sub-classification network parameter to obtain a fifth feature, and multiplying the fourth feature by the second sub-classification network parameter to obtain a sixth feature;
and connecting the fifth feature and the sixth feature to obtain the fused feature.
4. The method of claim 3, wherein determining a target loss value for the target network model based on the fused feature comprises:
determining a first loss value for the first branch network based on the fifth characteristic;
determining a second loss value for the second branch network based on the sixth characteristic;
determining a target loss value for the target network model based on the first loss value and the second loss value.
5. The method of claim 1, wherein iteratively updating the target network parameters of the target network model based on the target loss values comprises:
determining a first loss value corresponding to the first branch network and a second loss value corresponding to the second branch network, which are included in the target loss value;
updating a first weight and a second weight included in the target network parameter based on the first loss value and the second loss value, wherein the first weight is a weight corresponding to the first branch network, and the second weight is a weight corresponding to the second branch network.
6. The method of claim 1, wherein iteratively updating the target network parameters of the target network model based on the target loss values comprises:
determining a network parameter update gradient based on the target loss value;
and updating the network parameters of the first branch network, the second branch network and the classification model included in the target network model according to the network parameter updating gradient.
7. The method of claim 1, wherein determining the target character corresponding to the target data based on the target feature comprises:
determining a similarity score for each feature included in the target features and features of characters included in a feature dictionary table;
and determining the target character corresponding to the target data based on the similarity score.
8. An apparatus for determining a character, comprising:
the first determining module is used for inputting target data into a target network model and determining target characteristics of the target data;
the second determination module is used for determining a target character corresponding to the target data based on the target characteristic;
wherein the target network model is trained by: extracting first training data from a training data set according to a first sampling rule, and extracting second training data from the training data set according to a second sampling rule, wherein the training data set comprises training data of a first type and training data of a second type, the number of the training data of the first type is greater than that of the training data of the second type, the first sampling rule comprises that the probability of each training data being extracted is the same, the second sampling rule comprises that the probability of the training data of the second type being extracted is greater than that of the training data of the first type, and the first training data, the second training data and the target data are data collected by a collecting device; inputting the first training data into a first branch network included in a target network model, and determining a first feature of the first training data; inputting the second training data into a second branch network included in the target network model, and determining a second feature of the second training data; fusing the first feature and the second feature based on network parameters of a classification model included in the target network model to obtain a fused feature; determining a target loss value of the target network model based on the fusion characteristics, and iteratively updating target network parameters of the target network model based on the target loss value.
9. A computer-readable storage medium, in which a computer program is stored, wherein the computer program, when executed by a processor, performs the steps of the method according to any one of claims 1 to 7.
10. An electronic device comprising a memory and a processor, wherein the memory has a computer program stored therein, and the processor is configured to execute the computer program to perform the method of any of claims 1 to 7.
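The two sampling rules and the feature fusion recited in claim 8 can be sketched as follows. This is a minimal illustrative example, not the patented implementation: the dataset, the inverse-frequency weighting used for the second sampling rule, and the scalar fusion coefficient `alpha` (a stand-in for the parameter the claim derives from the classification model's network parameters) are all assumptions made for demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy long-tailed dataset: class 0 (first type) has many samples,
# class 1 (second type) has few, matching the imbalance in claim 8.
labels = np.array([0] * 90 + [1] * 10)

def uniform_sample(labels, n, rng):
    """First sampling rule: every training sample equally likely."""
    return rng.choice(len(labels), size=n, replace=True)

def class_balanced_sample(labels, n, rng):
    """Second sampling rule: second-type (rare) samples drawn with
    higher probability, here via inverse class frequency."""
    counts = np.bincount(labels)
    weights = 1.0 / counts[labels]          # one weight per sample
    probs = weights / weights.sum()
    return rng.choice(len(labels), size=n, replace=True, p=probs)

idx_first = uniform_sample(labels, 1000, rng)
idx_second = class_balanced_sample(labels, 1000, rng)

# Under the second rule the rare class appears far more often.
tail_frac_uniform = (labels[idx_first] == 1).mean()
tail_frac_balanced = (labels[idx_second] == 1).mean()

# Fusion of the two branch features, weighted by a coefficient that
# stands in for the classifier-derived network parameter in the claim.
f1 = rng.normal(size=8)   # feature from the first (uniform) branch
f2 = rng.normal(size=8)   # feature from the second (re-balanced) branch
alpha = 0.7
fused = alpha * f1 + (1 - alpha) * f2
```

With inverse-frequency weights each class receives equal total sampling probability, so the rare class rises from roughly 10% of draws under the first rule to roughly 50% under the second; the fused feature then carries information from both branches into the loss computation.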
CN202211400525.5A 2022-11-09 2022-11-09 Character determining method and device, storage medium and electronic device Active CN115457573B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211400525.5A CN115457573B (en) 2022-11-09 2022-11-09 Character determining method and device, storage medium and electronic device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211400525.5A CN115457573B (en) 2022-11-09 2022-11-09 Character determining method and device, storage medium and electronic device

Publications (2)

Publication Number Publication Date
CN115457573A 2022-12-09
CN115457573B 2023-04-28

Family

ID=84311574

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211400525.5A Active CN115457573B (en) 2022-11-09 2022-11-09 Character determining method and device, storage medium and electronic device

Country Status (1)

Country Link
CN (1) CN115457573B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190095730A1 (en) * 2017-09-25 2019-03-28 Beijing University Of Posts And Telecommunications End-To-End Lightweight Method And Apparatus For License Plate Recognition
CN111126224A (en) * 2019-12-17 2020-05-08 成都通甲优博科技有限责任公司 Vehicle detection method and classification recognition model training method
CN112232506A (en) * 2020-09-10 2021-01-15 北京迈格威科技有限公司 Network model training method, image target recognition method, device and electronic equipment
CN114627319A (en) * 2022-05-16 2022-06-14 杭州闪马智擎科技有限公司 Target data reporting method and device, storage medium and electronic device

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Yang Changdong; Yu Ye; Xu Longdao; Fu Yuanzi; Lu Qiang: "Fine-grained Vehicle Model Recognition Based on AT-PGGAN Data Augmentation" *

Also Published As

Publication number Publication date
CN115457573B (en) 2023-04-28

Similar Documents

Publication Publication Date Title
Suganuma et al. Attention-based adaptive selection of operations for image restoration in the presence of unknown combined distortions
CN113255694B (en) Training image feature extraction model and method and device for extracting image features
US8239336B2 (en) Data processing using restricted boltzmann machines
US20190294928A1 (en) Image processing method and apparatus, and computer-readable storage medium
CN111160191B (en) Video key frame extraction method, device and storage medium
CN109614933B (en) Motion segmentation method based on deterministic fitting
CN110321892B (en) Picture screening method and device and electronic equipment
CN111833372A (en) Foreground target extraction method and device
CN112036261A (en) Gesture recognition method and device, storage medium and electronic device
CN115829027A (en) Comparative learning-based federated learning sparse training method and system
CN111401193A (en) Method and device for obtaining expression recognition model and expression recognition method and device
CN117315237B (en) Method and device for determining target detection model and storage medium
CN112200862B (en) Training method of target detection model, target detection method and device
CN110457704A (en) Determination method, apparatus, storage medium and the electronic device of aiming field
CN107193979B (en) Method for searching homologous images
CN113223614A (en) Chromosome karyotype analysis method, system, terminal device and storage medium
CN115457573A (en) Character determination method and device, storage medium and electronic device
CN111191065A (en) Homologous image determining method and device
CN114170484B (en) Picture attribute prediction method and device, electronic equipment and storage medium
CN111738310A (en) Material classification method and device, electronic equipment and storage medium
CN110059742A (en) Safety protector wearing recognition methods and equipment based on deep learning
CN113822373B (en) Image classification model training method based on integration and knowledge distillation
CN116011550A (en) Model pruning method, image processing method and related devices
CN115830342A (en) Method and device for determining detection frame, storage medium and electronic device
CN113762019B (en) Training method of feature extraction network, face recognition method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant