CN114399766A - Optical character recognition model training method, device, equipment and medium - Google Patents


Info

Publication number
CN114399766A
CN114399766A (application CN202210056338.3A)
Authority
CN
China
Prior art keywords
data set
recognition model
character
data
character recognition
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202210056338.3A
Other languages
Chinese (zh)
Other versions
CN114399766B (en)
Inventor
吴天学
刘鹏
Current Assignee
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Priority date
Filing date
Publication date
Application filed by Ping An Technology Shenzhen Co Ltd filed Critical Ping An Technology Shenzhen Co Ltd
Priority to CN202210056338.3A priority Critical patent/CN114399766B/en
Publication of CN114399766A publication Critical patent/CN114399766A/en
Application granted granted Critical
Publication of CN114399766B publication Critical patent/CN114399766B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G06N3/045 Combinations of networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Character Discrimination (AREA)

Abstract

The invention relates to the field of artificial intelligence, and discloses an optical character recognition model training method, which comprises the following steps: screening an original picture set and its corresponding original data set from actual production for error data by using a search engine, and determining that the error data forms a negative sample data set and the non-error data forms a positive sample data set; recognizing a predicted character set of the positive sample data set, the negative sample data set and the original picture set by using an optical character recognition model; and calculating loss values from the predicted character set, the real character labeling set and the error character labeling set, and, if the loss values do not meet a preset condition, adjusting parameters of the model until the loss values meet the preset condition to obtain the trained optical character recognition model. The invention also relates to blockchain technology: the trained optical character recognition model can be stored in a blockchain node. The invention also provides an optical character recognition model training device, equipment and a medium. The invention can improve the efficiency and accuracy of optical character recognition model training.

Description

Optical character recognition model training method, device, equipment and medium
Technical Field
The invention relates to the field of artificial intelligence, in particular to an optical character recognition model training method and device, electronic equipment and a computer readable storage medium.
Background
With the research and development of artificial intelligence technology, ever higher requirements are placed on the recognition accuracy of optical character recognition models (such as OCR deep learning recognition models). Because a mature OCR deep learning recognition model needs dozens or even hundreds of iterations, some technology enterprises invest a large amount of manpower and material resources to obtain a mature OCR deep learning recognition model, so as to realize rapid development and iteration of the OCR deep learning recognition model and meet the demands of business growth.
However, when a traditional optical character recognition model is trained, there is a difference in distribution between the development environment training data and the production environment data, so a model with a good recognition effect in the development environment cannot always achieve an equally good recognition effect in the production environment, and the accuracy of the optical character recognition model is low; when the recognition effect is poor, test data is repeatedly constructed for testing, so the training efficiency of the optical character recognition model is low and the accuracy cannot be improved.
Disclosure of Invention
The invention provides an optical character recognition model training method, an optical character recognition model training device, electronic equipment and a computer readable storage medium, and mainly aims to improve the efficiency and accuracy of optical character recognition model training.
In order to achieve the above object, the present invention provides a training method for an optical character recognition model, comprising:
acquiring an original picture set and an original data set corresponding to the original picture set in actual production, and storing the original picture set and the original data set into a preset message queue channel;
when a preset search engine is idle, acquiring the original data set corresponding to the original picture set from the message queue channel by using the search engine, screening the original data set for error data, and determining that the screened error data forms a negative sample data set and the non-error data other than the error data forms a positive sample data set;
acquiring a real character label set corresponding to the positive sample data set and an error character label set corresponding to the negative sample data set, wherein the error character label set is dynamically updated in real time;
inputting the positive sample data set, the negative sample data set and the original picture set as training data sets into a preset optical character recognition model, and recognizing a predicted character set of the training data sets by using the optical character recognition model;
and obtaining loss values of the predicted character set, the real character label set and the error character label set through calculation, if the loss values do not meet preset conditions, adjusting parameters of the optical character recognition model until the loss values meet the preset conditions, and obtaining the trained optical character recognition model.
Optionally, the performing error data screening on the original data set, determining that the screened error data constitutes a negative sample data set, and non-error data other than the error data constitutes a positive sample data set, includes:
acquiring the sequence length of the original data in the original data set by using a preset search engine, and setting a sequence length index by using a preset screening statement in the preset search engine;
and comparing the sequence length with the sequence length index, forming a negative sample data set from the original data whose sequence length does not match the sequence length index, and forming a positive sample data set from the original data whose sequence length matches the sequence length index.
Optionally, after determining that the screened error data constitutes a negative sample data set and non-error data other than the error data constitutes a positive sample data set, the method further includes:
acquiring data fields of the positive sample data set and the negative sample data set, and identifying sensitive fields in the data fields;
and carrying out desensitization operation on the sensitive field by using a preset desensitization function.
Optionally, the recognizing the predicted character set of the training data set by using a preset optical character recognition model includes:
extracting a feature sequence of the training data set by using a convolutional layer in the preset optical character recognition model to obtain a character vector set;
predicting a character label set of the character vector set by using a recurrent layer in the optical character recognition model;
and integrating the character label set by using a transcription layer in the optical character recognition model to obtain a predicted character set.
Optionally, the predicting a character label set of the character vector set using a recurrent layer in the optical character recognition model includes:
calculating a state value of the character vector set by using an input gate in the recurrent layer;
calculating an activation value of the character vector set by using a forget gate in the recurrent layer;
calculating a state update value of the character vector set according to the state value and the activation value;
and calculating the character label set of the state update value by using an output gate in the recurrent layer to obtain the character label set of the character vector set.
Optionally, the integrating the character tag set by using a transcription layer in the optical character recognition model to obtain a predicted character set includes:
acquiring all path probabilities of the character label set by utilizing the transcription layer, and searching the maximum path probability corresponding to each character label from the path probabilities;
and combining the maximum path probabilities to obtain a predicted character set of the character label set.
Optionally, before storing the original picture set and the original data set in a preset message queue channel, the method further includes:
establishing a link between the original data set and the original picture set and the message middleware, and forming a message queue channel through the link;
storing the raw picture set and raw data set through the message queue channel.
In order to solve the above problem, the present invention further provides an optical character recognition model training apparatus, including:
the data set acquisition module is used for acquiring an original picture set and an original data set corresponding to the original picture set in actual production and storing the original picture set and the original data set into a preset message queue channel;
the data set screening module is used for acquiring an original data set corresponding to the original picture set from the message queue channel by using a preset search engine when the search engine is idle, screening error data of the original data set, and determining that screened error data form a negative sample data set and non-error data except the error data form a positive sample data set;
the data set marking module is used for acquiring a real character marking set corresponding to the positive sample data set and an error character marking set corresponding to the negative sample data set, wherein the error character marking set is dynamically updated in real time;
a training data set identification module, configured to input the positive sample data set, the negative sample data set, and the original image set as training data sets to a preset optical character recognition model, and identify a predicted character set of the training data set by using the optical character recognition model;
and the model training module is used for obtaining loss values of the predicted character set, the real character label set and the error character label set through calculation, if the loss values do not meet preset conditions, adjusting parameters of the optical character recognition model until the loss values meet the preset conditions, and obtaining the trained optical character recognition model.
In order to solve the above problem, the present invention also provides an electronic device, including:
a memory storing at least one computer program; and
a processor that executes the computer program stored in the memory to implement the optical character recognition model training method described above.
In order to solve the above problem, the present invention further provides a computer-readable storage medium, in which at least one computer program is stored, and the at least one computer program is executed by a processor in an electronic device to implement the optical character recognition model training method described above.
In the embodiment of the invention, the original picture set in actual production and the original data set corresponding to the original picture set are obtained first, so that the difference in data distribution between the development environment and the production environment can be avoided, and the accuracy of subsequent model training is improved. Secondly, when a preset search engine is idle, the search engine is used to obtain the original data set corresponding to the original picture set from the message queue channel and to identify error data in the original data set; the error data can be screened directly by the search engine instead of manually, which saves manpower and time, shortens the iteration cycle of subsequent model training, and improves the training efficiency of the subsequent model. Furthermore, by labeling the real characters corresponding to the positive sample data and the error characters corresponding to the negative sample data, subsequent model training is made convenient. Finally, the predicted characters of the training data are recognized by using the optical character recognition model, loss values of the predicted characters against the real characters and the error characters are calculated respectively, and if the loss values do not meet the preset condition, the parameters of the optical character recognition model are adjusted, further improving the accuracy of the model, until the loss values meet the preset condition and the trained optical character recognition model is obtained. Therefore, the optical character recognition model training method and device, the electronic equipment and the storage medium provided by the embodiments of the invention can improve the efficiency and accuracy of optical character recognition model training.
Drawings
FIG. 1 is a schematic flow chart illustrating a training method for an optical character recognition model according to an embodiment of the present invention;
FIG. 2 is a block diagram of an optical character recognition model training apparatus according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of the internal structure of an electronic device implementing the optical character recognition model training method according to an embodiment of the present invention.
The implementation, functional features and advantages of the objects of the present invention will be further explained with reference to the accompanying drawings.
Detailed Description
It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
The embodiment of the invention provides an optical character recognition model training method. The execution subject of the optical character recognition model training method includes, but is not limited to, at least one of the electronic devices that can be configured to execute the method provided by the embodiments of the present application, such as a server, a terminal, and the like. In other words, the optical character recognition model training method may be performed by software or hardware installed in a terminal device or a server device, and the software may be a blockchain platform. The server includes but is not limited to: a single server, a server cluster, a cloud server or a cloud server cluster, and the like.
Referring to FIG. 1, which is a schematic flow diagram of the optical character recognition model training method according to an embodiment of the present invention, the optical character recognition model training method comprises the following steps:
s1, acquiring an original picture set in actual production and an original data set corresponding to the original picture set, and storing the original picture set and the original data set into a preset message queue channel.
In the embodiment of the invention, the original picture set is unstructured data acquired from an actual production environment process by using a preset optical character recognition interface, the original data set is character information extracted from the original picture set by using the optical character recognition interface, and the character information is structured data.
The structured data is row data that can be stored in a database and expressed with a two-dimensional logical structure; the unstructured data refers to data that cannot be expressed with a two-dimensional logical structure, such as text, pictures, XML, HTML, audio and video.
Specifically, in this embodiment, a preset optical character recognition interface is used to send the recognized structured data and unstructured data to a message queue channel, and then the data is processed according to the user requirement.
In detail, before storing the original picture set and the original data set in a preset message queue channel, the method further includes:
establishing a link between the original data set and the original picture set and the message middleware, and forming a message queue channel through the link;
storing the raw picture set and raw data set through the message queue channel.
In the embodiment of the invention, the message queue channel is a channel which is formed by linking original data and message middleware and can receive, store and send information.
Preferably, the link may be a TCP link.
Preferably, the message middleware may be Kafka.
In another embodiment of the present invention, the identified structured data may be sent to the message queue channel by using a preset optical character recognition interface, the unstructured data is stored in the NAS disk, and then the data is processed according to the user requirement.
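As an illustration of the message queue channel described above, the sketch below packages one recognized record (picture reference plus extracted structured text) as a JSON message body and shows, in comments, how it could be handed to a Kafka producer over a TCP link. The topic name, record fields and broker address are assumptions for illustration, not details given by the source.

```python
import json


def serialize_ocr_record(picture_id, text):
    """Package one OCR result as a JSON message body for the queue."""
    return json.dumps(
        {"picture_id": picture_id, "text": text},
        ensure_ascii=False,
    ).encode("utf-8")


# Sending the record to a Kafka topic requires a running broker,
# so the producer calls are shown only as comments:
# from kafka import KafkaProducer
# producer = KafkaProducer(bootstrap_servers="localhost:9092")
# producer.send("ocr-raw-data", serialize_ocr_record("img-001", "XA-12345"))
```

The consumer side of the channel would deserialize the same JSON body before passing the data on to the search engine.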
And S2, when a preset search engine is idle, acquiring the original data set corresponding to the original picture set from the message queue channel by using the search engine, screening the original data set for error data, and determining that the screened error data forms a negative sample data set and the non-error data other than the error data forms a positive sample data set.
In the embodiment of the invention, the original picture set and the original data set can be stored asynchronously through the preset message queue channel. When the original picture set and the original data set are transmitted to the message queue channel, the message queue channel may leave them unprocessed at first, and the time for processing them is determined according to the requirements of users. The original data set processed in the message queue channel is transmitted to a preset search engine to obtain the original data set corresponding to the original picture set in actual production. A preset screening statement of the search engine can be used to screen error data from the original data set corresponding to the original picture set in the message queue channel, where the error data is erroneous character information in the original data set corresponding to the original picture set; the error data forms a negative sample data set, and the remaining non-error data forms a positive sample data set.
Preferably, the predetermined search engine may be ElasticSearch.
In detail, the screening of the error data from the original data set, determining that the screened error data constitutes a negative sample data set, and that non-error data other than the error data constitutes a positive sample data set, includes:
acquiring the sequence length of the original data in the original data set by using a preset search engine, and setting a sequence length index by using a preset screening statement in the preset search engine;
and comparing the sequence length with the sequence length index, forming a negative sample data set from the original data whose sequence length does not match the sequence length index, and forming a positive sample data set from the original data whose sequence length matches the sequence length index.
In the embodiment of the present invention, the sequence length may be a character length corresponding to the original data, and the sequence length index set by using the preset filtering statement is an index set for a fixed length of a character in the original data.
For example, if a certain original picture is a car license plate picture, and the sequence length contained in the original data corresponding to the original picture is seven characters (that is, the length of the characters contained in the original picture is seven), the screening statement sets the sequence length index to length = 7. If the sequence length of the original data is identified as 7, the original data is determined to be positive sample data; if the sequence length of the original data is identified as not 7, the original data is determined to be negative sample data.
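The screening step above can be sketched in Python. In production the comparison would run inside the search engine as a filter query, but the logic reduces to comparing each record's character length against the fixed sequence length index (7 for the license plate example); the record values here are illustrative.

```python
def screen_by_sequence_length(records, length_index):
    """Split OCR text records into positive and negative samples by
    comparing each record's character length with the sequence length
    index set by the screening statement."""
    positive, negative = [], []
    for record in records:
        if len(record) == length_index:
            positive.append(record)   # length matches the index
        else:
            negative.append(record)   # length mismatch: error data
    return positive, negative
```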
Further, after determining that the screened error data constitutes a negative sample data set and non-error data other than the error data constitutes a positive sample data set, the method further includes:
acquiring data fields of the positive sample data set and the negative sample data set, and identifying sensitive fields in the data fields;
and carrying out desensitization operation on the sensitive field by using a preset desensitization function.
In the embodiment of the invention, fields related to personal privacy information, such as a personal name, identity card information and a mobile phone number, are sensitive fields. After the sensitive fields are identified, data replacement or mask shielding can be performed on them to realize desensitization; for example, data replacement is performed on the middle four digits of a mobile phone number to obtain 13800001248, or the middle four digits of a mobile phone number are masked to obtain 138****1248.
For example, desensitization is performed on individual identification information, cell phone numbers, bank card information, etc. collected by institutions and businesses.
In the embodiment of the invention, the privacy information in the positive sample data and the negative sample data can be shielded or hidden by desensitizing the sensitive field by the desensitizing function, so that the safety of the privacy data of the user in the real production environment is protected.
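A minimal desensitization function in the spirit of the masking example can be written with a regular expression. The 11-digit mobile number format is an assumption taken from the example above; the actual desensitization function used in the embodiment is not specified by the source.

```python
import re

# Assumed format: 11-digit mobile numbers starting with 1.
PHONE_PATTERN = re.compile(r"\b(1\d{2})(\d{4})(\d{4})\b")


def desensitize(field):
    """Mask the middle four digits of any 11-digit mobile number in
    the field, e.g. 13812341248 -> 138****1248."""
    return PHONE_PATTERN.sub(r"\1****\3", field)
```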
For example, because the embodiment of the invention uses data from the actual production process to train the model, confidential data of an enterprise may be involved; a preset data use permission may therefore be applied, with use restrictions defined in the permission, so that developers cannot view or download the data.
In one embodiment of the invention, the preset data use permission can also ensure that developers can only use data but cannot check or download the data when performing model iteration and training, thereby avoiding the risk of data leakage.
S3, acquiring a real character labeling set corresponding to the positive sample data set and an error character labeling set corresponding to the negative sample data set, wherein the error character labeling set is dynamically updated in real time.
In the embodiment of the present invention, the positive sample data set and the negative sample data set may be transmitted to a preset labeling platform. The labeling platform is used to label the real characters of the positive sample data set and the corresponding position information of the positive sample data set in the original picture (i.e. the position of each real character in the original picture), and the real character labels of the positive sample data set and the position information of the real characters in the original picture are combined into a real character labeling set. Similarly, the labeling platform is used to label the characters of the negative sample data set and the corresponding position information of the negative sample data set in the original picture (i.e. the position of each error character in the original picture), and the character labels of the negative sample data set and the position information of the characters in the original picture are combined into an error character labeling set.
For example, for positive sample data whose license plate number is the seven-character XA·xxxxxx, the labeling platform interface can be called so that the labeling platform labels the real characters of the positive sample data XA·xxxxxx and the license plate position to which each labeled real character corresponds, and the real character labels of the positive sample data XA·xxxxxx together with the corresponding positions are combined into a real character labeling set. For negative sample data whose license plate number is XB·XXX, the labeling platform interface can likewise be called so that the labeling platform labels the characters of the negative sample data XB·XXX and the license plate position to which each labeled character corresponds, and the character labels of the negative sample data XB·XXX together with the corresponding positions are combined into an error character labeling set.
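The labeling step above can be pictured as combining each character label with its position in the original picture into one labeling set entry; the record layout below is an assumed sketch, not the labeling platform's actual data model.

```python
def build_annotation(characters, positions):
    """Combine per-character labels with their positions in the
    original picture into one labeling set entry."""
    if len(characters) != len(positions):
        raise ValueError("each character label needs a position")
    return [{"char": c, "position": p} for c, p in zip(characters, positions)]
```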
In the embodiment of the invention, real-time dynamic updating of the error character labeling set means that after the labeling platform finds wrong character labels, the negative sample data set corresponding to the wrong character labels can be updated in real time and input into the subsequent model as an updated training data set, thereby improving the accuracy of subsequent model training.
And S4, inputting the positive sample data set, the negative sample data set and the original picture set as training data sets into a preset optical character recognition model, and recognizing a predicted character set of the training data sets by using the optical character recognition model.
In an embodiment of the present invention, the preset optical character recognition model may be a deep learning model with a CRNN structure, where the CRNN structure is CNN + LSTM + CTC; that is, the optical character recognition model comprises a convolutional layer (CNN), a recurrent layer (LSTM), a transcription layer (CTC), and a loss function.
In detail, the recognizing the predicted characters of the training data set by using a preset optical character recognition model includes:
extracting a feature sequence of the training data set by using a convolutional layer in the preset optical character recognition model to obtain a character vector set;
predicting a character label set of the character vector set by using a recurrent layer in the optical character recognition model;
and integrating the character label set by using a transcription layer in the optical character recognition model to obtain a predicted character set.
In one embodiment of the invention, the convolutional layer comprises a convolutional sublayer and a pooling layer. Feature extraction can be performed on the training data set through the convolutional sublayer to obtain a feature map, and the feature sequence vectors in the feature map are extracted by using the pooling layer to obtain a character vector set.
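A toy sketch of how pooling can turn a feature map into a left-to-right feature sequence for the recurrent layer: a real CRNN pools learned CNN feature maps, whereas this example simply max-pools the columns of a small numeric grid.

```python
def feature_map_to_sequence(feature_map):
    """Turn a 2D feature map (rows x columns) into a left-to-right
    feature sequence by max-pooling each column, one value per
    horizontal position."""
    height = len(feature_map)
    width = len(feature_map[0])
    return [max(feature_map[r][c] for r in range(height)) for c in range(width)]
```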
In another embodiment of the present invention, the recurrent layer is mainly composed of LSTM, a variant of the RNN. Because the RNN suffers from the vanishing gradient problem and cannot capture enough context information, LSTM replaces the RNN to extract context information better. The recurrent layer includes an input gate, a forget gate and an output gate.
In detail, the predicting the character label set of the character vector set using a recurrent layer in the optical character recognition model includes:
calculating a state value of the character vector set by using an input gate in the recurrent layer;
calculating an activation value of the character vector set by using a forget gate in the recurrent layer;
calculating a state update value of the character vector set according to the state value and the activation value;
and calculating the character label set of the state update value by using an output gate in the recurrent layer to obtain the character label set of the character vector set.
In the embodiment of the invention, the input gate controls how much of the character vector set enters the cell; the forget gate controls how much of the character vector set flows from the previous moment to the current moment; the state update value combines the new input with the part of the previous state that the forget gate does not forget; and the output gate outputs the character label set of the character vector set.
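The gate computations described above can be sketched for a scalar toy LSTM cell. Real models use weight matrices over vector inputs; the scalar weights here are placeholder assumptions chosen only to show how the input gate, forget gate, state update and output gate interact.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x, h_prev, c_prev, w):
    """One step of a scalar toy LSTM cell. w maps each gate name to an
    (input weight, hidden weight, bias) triple."""
    i = sigmoid(w["i"][0] * x + w["i"][1] * h_prev + w["i"][2])   # input gate: state value
    f = sigmoid(w["f"][0] * x + w["f"][1] * h_prev + w["f"][2])   # forget gate: activation value
    g = math.tanh(w["g"][0] * x + w["g"][1] * h_prev + w["g"][2])  # candidate input
    c = f * c_prev + i * g                                         # state update value
    o = sigmoid(w["o"][0] * x + w["o"][1] * h_prev + w["o"][2])   # output gate
    h = o * math.tanh(c)                                           # output used for label prediction
    return h, c
```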
In the embodiment of the present invention, the transcription layer is mainly composed of CTC (Connectionist Temporal Classification), whose main function is to convert the character tag set predicted by the LSTM into the labeled predicted character set.
Further, the integrating the character tag set by using the transcription layer in the optical character recognition model to obtain a predicted character set includes:
acquiring all path probabilities of the character label set by utilizing the transcription layer, and searching the maximum path probability corresponding to each character label from the path probabilities;
and combining the maximum path probabilities to obtain the predicted character of the character label set.
In the embodiment of the invention, the predicted character can be obtained by the following formula:
y = B\left(\arg\max_{\pi} P(\pi \mid x)\right)
in the embodiment of the invention, P(π|x) is the probability of a path π over the character tags given the input x, B is the mapping that merges repeated tags and removes blanks from a path, π denotes the path with the maximum probability, and y is the predicted character sequence corresponding to the character tags.
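The path search and merging described above can be illustrated with a minimal best-path CTC decoder. This is a simplified sketch: per-timestep probabilities are assumed given, and label 0 denotes the blank:

```python
import numpy as np

def ctc_best_path(probs, blank=0):
    """Best-path CTC decoding: pick the most probable label per timestep
    (the maximum-probability path pi), then apply the mapping B, which
    merges repeated labels and removes blanks."""
    path = probs.argmax(axis=1)                 # pi: argmax label at each step
    decoded, prev = [], None
    for label in path:
        if label != blank and label != prev:    # B: drop blanks and repeats
            decoded.append(int(label))
        prev = label
    return decoded

# toy per-timestep probabilities over {blank, 'A', 'B'}
probs = np.array([[0.1, 0.8, 0.1],    # 'A'
                  [0.1, 0.8, 0.1],    # 'A' (repeat, merged)
                  [0.9, 0.05, 0.05],  # blank
                  [0.1, 0.1, 0.8]])   # 'B'
print(ctc_best_path(probs))           # [1, 2] -> characters 'A', 'B'
```

Best-path decoding is the cheapest approximation of the argmax over paths; beam search would consider more paths at higher cost.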
S5, obtaining loss values of the predicted character set, the real character label set and the error character label set through calculation, if the loss values do not meet preset conditions, adjusting parameters of the optical character recognition model until the loss values meet the preset conditions, and obtaining the trained optical character recognition model.
In an embodiment of the present invention, a first loss value of the predicted character set and the real character label set is calculated, a second loss value of the predicted character set and the error character label set is calculated, the first loss value and the second loss value are fused to obtain a loss value, and if the loss value does not satisfy a preset condition, a parameter of the optical character recognition model is adjusted until the loss value satisfies the preset condition, so as to obtain a trained optical character recognition model.
For example, the preset condition may be that the loss value is below a preset threshold of 0.1: when the loss value is greater than or equal to 0.1, the model parameters are adjusted until the loss value falls below 0.1, so as to obtain the trained optical character recognition model.
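The loss fusion and the stopping rule can be sketched as a simple training loop. This is purely illustrative: the weighted sum is one plausible fusion (the patent does not fix the exact formula), and the gradient update is simulated by a decay factor:

```python
def fused_loss(first_loss, second_loss, alpha=0.5):
    """Fuse the loss against the real character labels (first) and the
    loss against the error character labels (second). The weighted sum
    shown here is an assumption, not the patent's exact formula."""
    return alpha * first_loss + (1 - alpha) * second_loss

def train(first_loss, second_loss, threshold=0.1, decay=0.8, max_iters=100):
    """Adjust model parameters until the fused loss value falls below
    the preset threshold (the parameter update is simulated by decay)."""
    loss = fused_loss(first_loss, second_loss)
    iters = 0
    while loss >= threshold and iters < max_iters:
        first_loss *= decay           # stand-in for a real gradient update
        second_loss *= decay
        loss = fused_loss(first_loss, second_loss)
        iters += 1
    return loss, iters

loss, iters = train(first_loss=2.0, second_loss=1.0)
print(loss, iters)
```

The loop structure mirrors the condition in the text: adjust while the loss does not satisfy the preset condition, stop once it does.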
In the embodiment of the invention, the original picture set in actual production and the original data set corresponding to the original picture set are obtained firstly, so that the data distribution difference in a development environment and a production environment can be avoided, and the accuracy of subsequent model training is improved; secondly, when a preset search engine is idle, the search engine is utilized to obtain an original data set corresponding to the original picture set from the message queue channel, the search engine is utilized to carry out error data identification on the original data set, error data can be directly screened by the search engine instead of manual error data screening, manpower and time are saved, the iteration cycle of subsequent model training is shortened, and the training efficiency of the subsequent model is improved; furthermore, by marking the real characters corresponding to the positive sample data and marking the error characters corresponding to the negative sample data, the subsequent model can be conveniently trained; and finally, recognizing the predicted characters of the training data by using the optical character recognition model, respectively calculating loss values of the predicted characters, the real characters and the error characters, if the loss values do not meet preset conditions, adjusting parameters of the optical character recognition model, further improving the accuracy of the model until the loss values meet the preset conditions, and obtaining the trained optical character recognition model. Therefore, the training method of the optical character recognition model provided by the embodiment of the invention can improve the efficiency and accuracy of training the optical character recognition model.
FIG. 2 is a functional block diagram of the optical character recognition model training apparatus according to the present invention.
The optical character recognition model training apparatus 100 according to the present invention can be installed in an electronic device. According to the implemented functions, the optical character recognition model training apparatus may include a data set obtaining module 101, a data set screening module 102, a data set labeling module 103, a training data set recognition module 104, and a model training module 105, which may also be referred to as a unit, and refers to a series of computer program segments that can be executed by a processor of an electronic device and can perform fixed functions, and are stored in a memory of the electronic device.
In the present embodiment, the functions regarding the respective modules/units are as follows:
the data set obtaining module 101 is configured to obtain an original picture set and an original data set corresponding to the original picture set in actual production, and store the original picture set and the original data set in a preset message queue channel.
In the embodiment of the invention, the original picture set is unstructured data acquired from an actual production environment process by using a preset optical character recognition interface, the original data set is character information extracted from the original picture set by using the optical character recognition interface, and the character information is structured data.
The structured data is row data that can be stored in a database and expressed with a two-dimensional logical table structure; unstructured data refers to data that cannot be expressed with such a two-dimensional logical structure, such as text, pictures, XML, HTML, audio and video.
Specifically, in this embodiment, a preset optical character recognition interface is used to send the recognized structured data and unstructured data to a message queue channel, and then the data is processed according to the user requirement.
The data set obtaining module 101 may further be configured to:
establishing a link between the original data set and the original picture set and the message middleware, and forming a message queue channel through the link;
storing the raw picture set and raw data set through the message queue channel.
In the embodiment of the invention, the message queue channel is a channel which is formed by linking original data and message middleware and can receive, store and send information.
Preferably, the link may be a TCP link.
Preferably, the message middleware may be kafka.
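The channel's receive-store-send behaviour can be sketched with an in-memory stand-in. In production the middleware would be Kafka reached over a TCP link (e.g. via a Kafka client library); the class below, its method names, and the topic names are only an illustration of the asynchronous store-then-process pattern:

```python
from collections import deque

class MessageQueueChannel:
    """Minimal stand-in for a message queue channel: messages (original
    pictures and original data) are stored on arrival and only consumed
    later, when the user decides to process them."""
    def __init__(self):
        self._queue = deque()

    def send(self, topic, payload):
        self._queue.append((topic, payload))   # store without processing

    def poll(self, n=1):
        """Hand over up to n stored messages to the consumer."""
        return [self._queue.popleft() for _ in range(min(n, len(self._queue)))]

channel = MessageQueueChannel()
channel.send("raw-pictures", b"<picture bytes>")
channel.send("raw-data", {"plate": "XA12345"})   # hypothetical payload
# nothing is processed until the consumer polls:
messages = channel.poll(2)
print(messages[1])
```

The key property mirrored here is decoupling: the producer (OCR interface) and the consumer (search engine) run on their own schedules.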
In another embodiment of the present invention, the identified structured data may be sent to the message queue channel by using a preset optical character recognition interface, the unstructured data is stored in the NAS disk, and then the data is processed according to the user requirement.
The data set screening module 102 is configured to, when a preset search engine is idle, acquire an original data set corresponding to the original picture set from the message queue channel by using the search engine, perform error data screening on the original data set, determine that screened error data forms a negative sample data set, and determine that non-error data other than the error data forms a positive sample data set.
In the embodiment of the invention, the original picture set and the original data set can be stored asynchronously through the preset message queue channel: when they are transmitted to the channel, the channel may hold them without immediate processing, and the time at which they are processed is determined according to the requirements of users. The original data set processed in the message queue channel is transmitted to a preset search engine to obtain the original data set corresponding to the original picture set in actual production. The preset screening statements of the search engine can then be used to screen error data in that original data set, the error data being erroneous character information in the original data set corresponding to the original picture set; the error data forms a negative sample data set, and the non-error data other than the error data forms a positive sample data set.
Preferably, the predetermined search engine may be an ElasticSearch.
In detail, the data set screening module 102 performs error data screening on the original data set, and determines that the screened error data constitutes a negative sample data set and the non-error data other than the error data constitutes a positive sample data set, by performing the following operations:
acquiring the sequence length of the original data in the original data set by using a preset search engine, and setting a sequence length index by using a preset screening statement in the preset search engine;
and comparing the sequence length with the sequence length index: the original data whose sequence length does not match the sequence length index forms the negative sample data set, and the original data whose sequence length matches the sequence length index forms the positive sample data set.
In the embodiment of the present invention, the sequence length may be a character length corresponding to the original data, and the sequence length index set by using the preset filtering statement is an index set for a fixed length of a character in the original data.
For example, if a certain original picture is a car license plate picture, and the sequence length included in the original data corresponding to the original picture is seven digits (that is, the length of the characters included in the original picture is seven digits), the screening statement sets the sequence length index to be length 7, and if the sequence length in the original data is identified to be 7, the original data is determined to be positive sample data; and if the sequence length in the original data is not identified to be 7, determining that the original data is negative sample data.
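The length-based screening can be sketched as follows. The Elasticsearch query body shown is one plausible way to express a fixed-length filter and is an assumption, not the patent's exact screening statement; the Python function reproduces the same comparison locally on hypothetical recognized plates:

```python
# One plausible Elasticsearch filter for records whose recognized text
# is exactly 7 characters long (a sketch; field name is hypothetical):
length_filter = {
    "query": {
        "bool": {
            "filter": {
                "script": {
                    "script": "doc['text.keyword'].value.length() == 7"
                }
            }
        }
    }
}

def screen_by_length(records, length_index=7):
    """Split original data into positive samples (sequence length matches
    the sequence length index) and negative samples (it does not)."""
    positive = [r for r in records if len(r) == length_index]
    negative = [r for r in records if len(r) != length_index]
    return positive, negative

records = ["XA12345", "XB123", "XC67890"]   # hypothetical recognized plates
positive, negative = screen_by_length(records)
print(positive, negative)
```

Running the filter inside the search engine lets the screening happen where the data already sits, which is the manpower saving the text describes.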
The dataset screening module is further operable to:
acquiring data fields of the positive sample data set and the negative sample data set, and identifying sensitive fields in the data fields;
and carrying out desensitization operation on the sensitive field by using a preset desensitization function.
In the embodiment of the invention, fields related to personal privacy information, such as a person's name, identity card information and mobile phone number, are sensitive fields. After the sensitive fields are identified, data replacement or mask shielding can be performed on them to achieve desensitization; for example, the middle four digits of a mobile phone number may be replaced to obtain 13800001248, or masked to obtain 138****1248.
For example, desensitization is performed on individual identification information, cell phone numbers, bank card information, etc. collected by institutions and businesses.
In the embodiment of the invention, the privacy information in the positive sample data and the negative sample data can be shielded or hidden by desensitizing the sensitive field by the desensitizing function, so that the safety of the privacy data of the user in the real production environment is protected.
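The masking variant of desensitization can be sketched with a small function. This is a minimal illustration for 11-digit mobile numbers; the record field names are hypothetical:

```python
import re

def desensitize_phone(phone):
    """Mask the middle four digits of an 11-digit mobile number,
    e.g. 13812341248 -> 138****1248."""
    return re.sub(r"^(\d{3})\d{4}(\d{4})$", r"\1****\2", phone)

def desensitize_record(record, mask_fields=("name", "id_card")):
    """Apply masking to the sensitive fields of one data record
    (keep the first character, mask the rest)."""
    out = dict(record)
    if "phone" in out:
        out["phone"] = desensitize_phone(out["phone"])
    for field in mask_fields:
        if field in out:
            out[field] = out[field][0] + "*" * (len(out[field]) - 1)
    return out

print(desensitize_phone("13812341248"))   # 138****1248
```

A preset desensitization function like this can be applied to both the positive and negative sample sets before developers ever see them.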
For example, because the embodiment of the invention uses data from the actual production process to train the model, confidential enterprise data may be involved; a preset data use permission may therefore be applied to restrict usage, so that developers cannot view or download the data.
In one embodiment of the invention, the preset data use permission can also ensure that developers can only use data but cannot check or download the data when performing model iteration and training, thereby avoiding the risk of data leakage.
The data set labeling module 103 is configured to obtain a real character labeling set corresponding to the positive sample data set and an error character labeling set corresponding to the negative sample data set, where the error character labeling set is dynamically updated in real time.
In the embodiment of the present invention, the positive sample data set and the negative sample data set may be transmitted to a preset labeling platform, and a preset labeling platform is used to label a real character label of the positive sample data set and label corresponding position information of the positive sample data set in an original picture (i.e. position information of a real character in the original picture), and the real character label of the positive sample data set and the position information of the real character in the original picture are combined into a real character labeling set; similarly, a preset labeling platform is used for labeling the character label of the negative sample data set, and labeling the corresponding position information of the negative sample data set in the original picture (namely the position information of the error character in the original picture), and the character label of the negative sample data set and the position information of the character in the original picture are combined to form an error character labeling set.
For example, for a seven-character license plate number XA·xxxxxx as positive sample data, the labeling platform interface may be called so that the labeling platform labels the real characters of the positive sample data XA·xxxxxx together with the digit position each labeled character occupies in the license plate number, and the two are combined into the real character labeling set; for a license plate number XB·XXX as negative sample data, the labeling platform interface may be called so that the platform labels the characters of the negative sample data XB·XXX together with the digit position each labeled character occupies, and the two are combined into the error character labeling set.
In the embodiment of the invention, real-time dynamic updating of the error character labeling set means that whenever the labeling platform finds new erroneous character labels, the corresponding negative sample data set is updated in real time and fed into the subsequent model as an updated training data set, thereby improving the accuracy of subsequent model training.
The training data set identification module 104 is configured to input the positive sample data set, the negative sample data set, and the original image set as training data sets to a preset optical character recognition model, and identify a predicted character set of the training data set by using the optical character recognition model.
In an embodiment of the present invention, the preset optical character recognition model may be a deep learning model with a CRNN structure, where the CRNN structure is CNN + LSTM + CTC, and the optical character recognition model comprises: a convolutional layer (CNN), a loop layer (LSTM), a transcription layer (CTC), and a loss function.
In detail, the training data set recognition module 104 recognizes the predicted characters of the training data set by using a preset optical character recognition model by performing the following operations, including:
extracting a characteristic sequence of the training data set by using a convolutional layer in a preset optical character recognition model to obtain a character vector set;
predicting a character tag set of the character vector set using a loop layer in the optical character recognition model;
and integrating the character label set by utilizing a transcription layer in the optical character recognition model to obtain a predicted character set.
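The hand-off between the convolutional layer and the loop layer can be illustrated by the map-to-sequence step: a CNN feature map of shape (channels, height, width) is sliced column by column into the character vector sequence fed to the LSTM. This is a minimal numpy sketch with illustrative shapes, not the patent's actual architecture:

```python
import numpy as np

def map_to_sequence(feature_map):
    """Convert a CNN feature map (C, H, W) into W feature vectors of
    size C*H -- the character vector set consumed by the loop layer."""
    c, h, w = feature_map.shape
    # each image column becomes one timestep vector
    return feature_map.transpose(2, 0, 1).reshape(w, c * h)

# e.g. 512 channels, height pooled down to 1, 25 columns remaining
feature_map = np.zeros((512, 1, 25))
sequence = map_to_sequence(feature_map)
print(sequence.shape)   # (25, 512)
```

Because each timestep vector comes from one image column, the left-to-right order of characters in the picture is preserved in the sequence.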
In one embodiment of the invention, the convolutional layer comprises a convolutional sublayer and a pooling layer: feature extraction can be performed on the training data set through the convolutional sublayer to obtain a feature map, and the feature sequence vectors are extracted from the feature map using the pooling layer to obtain a character vector set.
In another embodiment of the present invention, the loop layer is mainly composed of LSTM, a variant of RNN. Because RNN suffers from the vanishing gradient problem and cannot capture enough context information, LSTM is used in place of RNN to better extract context information, wherein the loop layer includes an input gate, a forgetting gate and an output gate.
In detail, the predicting the character tag set of the character vector set using a loop layer in the optical character recognition model includes:
calculating a state value of the character vector set by using an input gate in the circulation layer;
calculating an activation value of the character vector set by using a forgetting gate in the circulation layer;
calculating a state update value of the character vector set according to the state value and the activation value;
and calculating the character label set of the state updating value by using an output gate in the circulation layer to obtain the character label set of the character vector set.
In the embodiment of the invention, the input gate controls how much of the incoming character vector information enters the cell state; the forgetting gate controls how much information flows from the previous moment to the current moment; the state update value is formed from the character vectors that the forgetting gate does not choose to forget; and the output gate outputs the character tag set of the character vector set.
In the embodiment of the present invention, the transcription layer is mainly composed of CTC (Connectionist Temporal Classification), whose main function is to convert the character tag set predicted by the LSTM into the labeled predicted character set.
Further, the integrating the character tag set by using the transcription layer in the optical character recognition model to obtain a predicted character set includes:
acquiring all path probabilities of the character label set by utilizing the transcription layer, and searching the maximum path probability corresponding to each character label from the path probabilities;
and combining the maximum path probabilities to obtain the predicted character of the character label set.
In the embodiment of the invention, the predicted character can be obtained by the following formula:
y = B\left(\arg\max_{\pi} P(\pi \mid x)\right)
in the embodiment of the invention, P(π|x) is the probability of a path π over the character tags given the input x, B is the mapping that merges repeated tags and removes blanks from a path, π denotes the path with the maximum probability, and y is the predicted character sequence corresponding to the character tags.
The model training module 105 is configured to obtain loss values of the predicted character set, the real character tagging set, and the error character tagging set through calculation, and adjust parameters of the optical character recognition model if the loss values do not satisfy preset conditions until the loss values satisfy the preset conditions, so as to obtain a trained optical character recognition model.
In an embodiment of the present invention, a first loss value of the predicted character set and the real character label set is calculated, a second loss value of the predicted character set and the error character label set is calculated, the first loss value and the second loss value are fused to obtain a loss value, and if the loss value does not satisfy a preset condition, a parameter of the optical character recognition model is adjusted until the loss value satisfies the preset condition, so as to obtain a trained optical character recognition model.
For example, the preset condition may be that the loss value is below a preset threshold of 0.1: when the loss value is greater than or equal to 0.1, the model parameters are adjusted until the loss value falls below 0.1, so as to obtain the trained optical character recognition model.
In the embodiment of the invention, the original picture set in actual production and the original data set corresponding to the original picture set are obtained firstly, so that the data distribution difference in a development environment and a production environment can be avoided, and the accuracy of subsequent model training is improved; secondly, when a preset search engine is idle, the search engine is utilized to obtain an original data set corresponding to the original picture set from the message queue channel, the search engine is utilized to carry out error data identification on the original data set, error data can be directly screened by the search engine instead of manual error data screening, manpower and time are saved, the iteration cycle of subsequent model training is shortened, and the training efficiency of the subsequent model is improved; furthermore, by marking the real characters corresponding to the positive sample data and marking the error characters corresponding to the negative sample data, the subsequent model can be conveniently trained; and finally, recognizing the predicted characters of the training data by using the optical character recognition model, respectively calculating loss values of the predicted characters, the real characters and the error characters, if the loss values do not meet preset conditions, adjusting parameters of the optical character recognition model, further improving the accuracy of the model until the loss values meet the preset conditions, and obtaining the trained optical character recognition model. Therefore, the optical character recognition model training device provided by the embodiment of the invention can improve the efficiency and accuracy of optical character recognition model training.
Fig. 3 is a schematic structural diagram of an electronic device implementing the optical character recognition model training method according to the present invention.
The electronic device may include a processor 10, a memory 11, a communication bus 12, and a communication interface 13, and may further include a computer program, such as an optical character recognition model training program, stored in the memory 11 and executable on the processor 10.
The memory 11 includes at least one type of readable storage medium, which includes flash memory, removable hard disk, multimedia card, card type memory (e.g., SD or DX memory, etc.), magnetic memory, local disk, optical disk, etc. The memory 11 may in some embodiments be an internal storage unit of the electronic device, for example a hard disk of the electronic device. The memory 11 may also be an external storage device of the electronic device in other embodiments, such as a plug-in mobile hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, or a Flash memory Card (Flash Card) provided on the electronic device. Further, the memory 11 may also include both an internal storage unit and an external storage device of the electronic device. The memory 11 may be used not only to store application software installed in the electronic device and various types of data, such as the code of the optical character recognition model training program, but also to temporarily store data that has been output or is to be output.
The processor 10 may be composed of an integrated circuit in some embodiments, for example, a single packaged integrated circuit, or may be composed of a plurality of integrated circuits packaged with the same or different functions, including one or more Central Processing Units (CPUs), microprocessors, digital Processing chips, graphics processors, and combinations of various control chips. The processor 10 is a Control Unit (Control Unit) of the electronic device, connects various components of the electronic device by using various interfaces and lines, and executes various functions and processes data of the electronic device by running or executing programs or modules (e.g., an optical character recognition model training program, etc.) stored in the memory 11 and calling data stored in the memory 11.
The communication bus 12 may be a Peripheral Component Interconnect (PCI) bus or an Extended Industry Standard Architecture (EISA) bus. The bus may be divided into an address bus, a data bus, a control bus, etc. The communication bus 12 is arranged to enable connection and communication between the memory 11 and the at least one processor 10, etc. For ease of illustration, only one thick line is shown, but this does not mean that there is only one bus or one type of bus.
Fig. 3 shows only an electronic device having components, and those skilled in the art will appreciate that the structure shown in fig. 3 does not constitute a limitation of the electronic device, and may include fewer or more components than those shown, or some components may be combined, or a different arrangement of components.
For example, although not shown, the electronic device may further include a power supply (such as a battery) for supplying power to each component, and preferably, the power supply may be logically connected to the at least one processor 10 through a power management device, so that functions of charge management, discharge management, power consumption management and the like are realized through the power management device. The power supply may also include any component of one or more dc or ac power sources, recharging devices, power failure detection circuitry, power converters or inverters, power status indicators, and the like. The electronic device may further include various sensors, a bluetooth module, a Wi-Fi module, and the like, which are not described herein again.
Optionally, the communication interface 13 may include a wired interface and/or a wireless interface (e.g., WI-FI interface, bluetooth interface, etc.), which is generally used to establish a communication connection between the electronic device and other electronic devices.
Optionally, the communication interface 13 may further include a user interface, which may be a Display (Display), an input unit (such as a Keyboard (Keyboard)), and optionally, a standard wired interface, or a wireless interface. Alternatively, in some embodiments, the display may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an OLED (Organic Light-Emitting Diode) touch device, or the like. The display, which may also be referred to as a display screen or display unit, is suitable, among other things, for displaying information processed in the electronic device and for displaying a visualized user interface.
It is to be understood that the described embodiments are for purposes of illustration only and that the scope of the appended claims is not limited to such structures.
The optical character recognition model training program stored in the memory 11 of the electronic device is a combination of a plurality of computer programs, and when running in the processor 10, can realize:
acquiring an original picture set and an original data set corresponding to the original picture set in actual production, and storing the original picture set and the original data set into a preset message queue channel;
when a preset search engine is idle, acquiring an original data set corresponding to the original picture set from the message queue channel by using the search engine, screening error data of the original data set, and determining that the screened error data form a negative sample data set and non-error data except the error data form a positive sample data set;
acquiring a real character label set corresponding to the positive sample data set and an error character label set corresponding to the negative sample data set, wherein the error character label set is dynamically updated in real time;
inputting the positive sample data set, the negative sample data set and the original picture set as training data sets into a preset optical character recognition model, and recognizing a predicted character set of the training data sets by using the optical character recognition model;
and obtaining loss values of the predicted character set, the real character label set and the error character label set through calculation, if the loss values do not meet preset conditions, adjusting parameters of the optical character recognition model until the loss values meet the preset conditions, and obtaining the trained optical character recognition model.
Specifically, the processor 10 may refer to the description of the relevant steps in the embodiment corresponding to fig. 1 for a specific implementation method of the computer program, which is not described herein again.
Further, the electronic device integrated module/unit, if implemented in the form of a software functional unit and sold or used as a separate product, may be stored in a computer readable medium. The computer readable medium may be non-volatile or volatile. The computer-readable medium may include: any entity or device capable of carrying the computer program code, recording medium, U disk, removable hard disk, magnetic disk, optical disk, computer Memory, Read-Only Memory (ROM).
Embodiments of the present invention may also provide a computer-readable storage medium, where the computer-readable storage medium stores a computer program, and when the computer program is executed by a processor of an electronic device, the computer program may implement:
acquiring an original picture set and an original data set corresponding to the original picture set in actual production, and storing the original picture set and the original data set into a preset message queue channel;
when a preset search engine is idle, acquiring an original data set corresponding to the original picture set from the message queue channel by using the search engine, screening error data of the original data set, and determining that the screened error data form a negative sample data set and non-error data except the error data form a positive sample data set;
acquiring a real character label set corresponding to the positive sample data set and an error character label set corresponding to the negative sample data set, wherein the error character label set is dynamically updated in real time;
inputting the positive sample data set, the negative sample data set and the original picture set as training data sets into a preset optical character recognition model, and recognizing a predicted character set of the training data sets by using the optical character recognition model;
and obtaining loss values of the predicted character set, the real character label set and the error character label set through calculation, if the loss values do not meet preset conditions, adjusting parameters of the optical character recognition model until the loss values meet the preset conditions, and obtaining the trained optical character recognition model.
Further, the computer-readable storage medium may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function, and the like; the storage data area may store data created according to the use of the blockchain node, and the like.
In the embodiments provided by the present invention, it should be understood that the disclosed media, devices, apparatuses and methods may be implemented in other manners. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the modules is only one logical functional division, and other divisions may be realized in practice.
The modules described as separate parts may or may not be physically separate, and parts displayed as modules may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment.
In addition, functional modules in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, or in a form of hardware plus a software functional module.
It will be evident to those skilled in the art that the invention is not limited to the details of the foregoing illustrative embodiments, and that the present invention may be embodied in other specific forms without departing from the spirit or essential attributes thereof.
The present embodiments are therefore to be considered in all respects as illustrative and not restrictive, the scope of the invention being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein. Any reference signs in the claims shall not be construed as limiting the claim concerned.
The blockchain is a novel application mode of computer technologies such as distributed data storage, peer-to-peer transmission, consensus mechanisms, and encryption algorithms. A blockchain is essentially a decentralized database: a series of data blocks linked by cryptographic methods, each containing the information of a batch of network transactions, used to verify the validity (anti-counterfeiting) of the information and to generate the next block. The blockchain may include a blockchain underlying platform, a platform product service layer, an application service layer, and the like.
Furthermore, it is obvious that the word "comprising" does not exclude other elements or steps, and the singular does not exclude the plural. A plurality of units or means recited in the system claims may also be implemented by one unit or means through software or hardware. Terms such as first and second are used to denote names and do not denote any particular order.
Finally, it should be noted that the above embodiments are only for illustrating the technical solutions of the present invention and not for limiting, and although the present invention is described in detail with reference to the preferred embodiments, it should be understood by those skilled in the art that modifications or equivalent substitutions may be made on the technical solutions of the present invention without departing from the spirit and scope of the technical solutions of the present invention.

Claims (10)

1. A method for training an optical character recognition model, the method comprising:
acquiring an original picture set and an original data set corresponding to the original picture set in actual production, and storing the original picture set and the original data set into a preset message queue channel;
when a preset search engine is idle, acquiring an original data set corresponding to the original picture set from the message queue channel by using the search engine, screening error data of the original data set, and determining that the screened error data form a negative sample data set and non-error data except the error data form a positive sample data set;
acquiring a real character label set corresponding to the positive sample data set and an error character label set corresponding to the negative sample data set, wherein the error character label set is dynamically updated in real time;
inputting the positive sample data set, the negative sample data set and the original picture set as training data sets into a preset optical character recognition model, and recognizing a predicted character set of the training data sets by using the optical character recognition model;
and obtaining loss values of the predicted character set, the real character label set and the error character label set through calculation, if the loss values do not meet preset conditions, adjusting parameters of the optical character recognition model until the loss values meet the preset conditions, and obtaining the trained optical character recognition model.
2. The method for training an optical character recognition model according to claim 1, wherein the performing error data filtering on the original data set, determining that the filtered error data constitutes a negative sample data set, and non-error data other than the error data constitutes a positive sample data set, comprises:
acquiring the sequence length of the original data in the original data set by using a preset search engine, and setting a sequence length index by using a preset screening statement in the preset search engine;
and comparing the sequence length with the sequence length index, forming a negative sample data set from the original data whose sequence length is inconsistent with the sequence length index, and forming a positive sample data set from the original data whose sequence length is consistent with the sequence length index.
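A minimal, non-limiting sketch of this sequence-length screening (the record layout and field names are hypothetical):

```python
def screen_by_sequence_length(records, expected_length):
    """Compare each record's label length with the preset sequence-length
    index: matches form the positive sample set, mismatches the negative set."""
    positive = [r for r in records if len(r["label"]) == expected_length]
    negative = [r for r in records if len(r["label"]) != expected_length]
    return positive, negative

# Example: licence-plate-style labels expected to be 6 characters long.
records = [{"id": 1, "label": "ABC123"}, {"id": 2, "label": "AB12"}]
pos, neg = screen_by_sequence_length(records, expected_length=6)
```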
3. The method for training an optical character recognition model according to claim 1, wherein after determining that the filtered erroneous data constitutes a negative sample data set and non-erroneous data other than the erroneous data constitutes a positive sample data set, the method further comprises:
acquiring data fields of the positive sample data set and the negative sample data set, and identifying sensitive fields in the data fields;
and carrying out desensitization operation on the sensitive field by using a preset desensitization function.
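A non-limiting sketch of such a desensitization function (the first-and-last-character masking rule shown here is one possible choice, not one mandated by the claim):

```python
def desensitize(value: str) -> str:
    """Hypothetical desensitization rule: keep the first and last character,
    mask everything in between with '*'."""
    if len(value) <= 2:
        return "*" * len(value)
    return value[0] + "*" * (len(value) - 2) + value[-1]

def desensitize_fields(record: dict, sensitive_fields: set) -> dict:
    """Apply the desensitization function only to fields flagged as sensitive."""
    return {k: desensitize(v) if k in sensitive_fields else v
            for k, v in record.items()}

masked = desensitize_fields({"id_no": "440301199001010011", "city": "SZ"},
                            {"id_no"})
```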
4. The method for training an optical character recognition model according to claim 1, wherein the recognizing the predicted character set of the training data set by using a preset optical character recognition model comprises:
extracting a characteristic sequence of the training data set by using a convolutional layer in a preset optical character recognition model to obtain a character vector set;
predicting a character tag set of the character vector set using a loop layer in the optical character recognition model;
and integrating the character label set by utilizing a transcription layer in the optical character recognition model to obtain a predicted character set.
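The three-stage pipeline of claim 4 can be illustrated schematically as follows (the toy layer implementations are hypothetical stand-ins for the convolutional, recurrent, and transcription layers):

```python
def crnn_pipeline(image, conv, recurrent, transcribe):
    """Schematic CRNN forward pass: the convolutional layer yields a feature
    (character-vector) sequence, the recurrent layer predicts per-timestep
    labels, and the transcription layer collapses them into the final string."""
    features = conv(image)        # feature sequence, one vector per timestep
    labels = recurrent(features)  # a label prediction per timestep
    return transcribe(labels)     # integrated, final predicted characters

# Toy stand-ins for the three layers, operating on a 1-D "image" of ints.
conv = lambda img: [[v] for v in img]                 # one feature vector per column
recurrent = lambda feats: [f[0] % 10 for f in feats]  # a digit label per timestep
transcribe = lambda labels: "".join(str(d) for d in labels)

result = crnn_pipeline([17, 23, 35], conv, recurrent, transcribe)
```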
5. The method of training an optical character recognition model of claim 4, wherein predicting the set of character labels for the set of character vectors using a loop layer in the optical character recognition model comprises:
calculating a state value of the character vector set by using an input gate in the circulation layer;
calculating an activation value of the character vector set by using a forgetting gate in the circulation layer;
calculating a state update value of the character vector set according to the state value and the activation value;
and calculating the character label set of the state updating value by using an output gate in the circulation layer to obtain the character label set of the character vector set.
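A scalar, non-limiting sketch of one such recurrent (LSTM-style) timestep, mirroring the gate computations recited above (the weight names are hypothetical):

```python
import math

def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))

def lstm_step(x, h_prev, c_prev, w):
    """One scalar LSTM timestep: the input gate yields the state value, the
    forget gate the activation value, the two combine into the state update
    value, and the output gate produces the emitted activation."""
    i = sigmoid(w["wi"] * x + w["ui"] * h_prev)    # input gate  -> state value
    f = sigmoid(w["wf"] * x + w["uf"] * h_prev)    # forget gate -> activation value
    g = math.tanh(w["wg"] * x + w["ug"] * h_prev)  # candidate cell state
    c = f * c_prev + i * g                         # state update value
    o = sigmoid(w["wo"] * x + w["uo"] * h_prev)    # output gate
    h = o * math.tanh(c)                           # hidden output for this timestep
    return h, c

weights = {k: 0.0 for k in ("wi", "ui", "wf", "uf", "wg", "ug", "wo", "uo")}
h, c = lstm_step(1.0, 0.0, 2.0, weights)  # zero weights: all gates 0.5, g = 0
```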
6. The method for training an optical character recognition model according to claim 4, wherein the integrating the character tag set by using a transcription layer in the optical character recognition model to obtain a predicted character set comprises:
acquiring all path probabilities of the character label set by utilizing the transcription layer, and searching the maximum path probability corresponding to each character label from the path probabilities;
and combining the maximum path probabilities to obtain a predicted character set of the character label set.
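The maximum-path-probability decoding of claim 6 resembles best-path (greedy) CTC decoding, sketched here in non-limiting form:

```python
def greedy_ctc_decode(timestep_probs, alphabet, blank=0):
    """Best-path decoding: pick the maximum-probability label at each timestep,
    merge consecutive repeats, and drop the CTC blank symbol (index 0)."""
    best = [max(range(len(p)), key=p.__getitem__) for p in timestep_probs]
    out, prev = [], blank
    for idx in best:
        if idx != prev and idx != blank:
            out.append(alphabet[idx - 1])   # alphabet is indexed after the blank
        prev = idx
    return "".join(out)

# Four timesteps over the alphabet {blank, 'A', 'B'}.
probs = [[0.1, 0.8, 0.1], [0.1, 0.8, 0.1], [0.9, 0.05, 0.05], [0.1, 0.1, 0.8]]
decoded = greedy_ctc_decode(probs, "AB")
```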
7. The method for training an optical character recognition model according to claim 1, wherein before storing the original image set and the original data set in a predetermined message queue channel, the method further comprises:
establishing a link between the original picture set and original data set and preset message middleware, and forming the message queue channel through the link;
storing the raw picture set and raw data set through the message queue channel.
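A minimal in-process stand-in for such a message queue channel (a real deployment would use message middleware such as a broker; this class is purely illustrative):

```python
import queue

class MessageQueueChannel:
    """In-process stand-in for the message-middleware link: producers publish
    (picture, data) pairs; the search engine drains the channel when idle."""
    def __init__(self):
        self._q = queue.Queue()

    def publish(self, picture, data):
        self._q.put((picture, data))

    def drain(self):
        """Remove and return all queued items in FIFO order."""
        items = []
        while not self._q.empty():
            items.append(self._q.get())
        return items

channel = MessageQueueChannel()
channel.publish("img_001.png", {"label": "ABC123"})
channel.publish("img_002.png", {"label": "XY789"})
```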
8. An optical character recognition model training apparatus, characterized in that the apparatus comprises:
the data set acquisition module is used for acquiring an original picture set and an original data set corresponding to the original picture set in actual production and storing the original picture set and the original data set into a preset message queue channel;
the data set screening module is used for acquiring an original data set corresponding to the original picture set from the message queue channel by using a preset search engine when the search engine is idle, screening error data of the original data set, and determining that screened error data form a negative sample data set and non-error data except the error data form a positive sample data set;
the data set labeling module is used for acquiring a real character label set corresponding to the positive sample data set and an error character label set corresponding to the negative sample data set, wherein the error character label set is dynamically updated in real time;
a training data set identification module, configured to input the positive sample data set, the negative sample data set, and the original image set as training data sets to a preset optical character recognition model, and identify a predicted character set of the training data set by using the optical character recognition model;
and the model training module is used for obtaining loss values of the predicted character set, the real character label set and the error character label set through calculation, if the loss values do not meet preset conditions, adjusting parameters of the optical character recognition model until the loss values meet the preset conditions, and obtaining the trained optical character recognition model.
9. An electronic device, characterized in that the electronic device comprises:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores computer program instructions executable by the at least one processor to cause the at least one processor to perform the method of training an optical character recognition model according to any one of claims 1 to 7.
10. A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, carries out the method for training an optical character recognition model according to any one of claims 1 to 7.
CN202210056338.3A 2022-01-18 2022-01-18 Optical character recognition model training method, device, equipment and medium Active CN114399766B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210056338.3A CN114399766B (en) 2022-01-18 2022-01-18 Optical character recognition model training method, device, equipment and medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210056338.3A CN114399766B (en) 2022-01-18 2022-01-18 Optical character recognition model training method, device, equipment and medium

Publications (2)

Publication Number Publication Date
CN114399766A true CN114399766A (en) 2022-04-26
CN114399766B CN114399766B (en) 2024-05-10

Family

ID=81230820

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210056338.3A Active CN114399766B (en) 2022-01-18 2022-01-18 Optical character recognition model training method, device, equipment and medium

Country Status (1)

Country Link
CN (1) CN114399766B (en)

Citations (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160140451A1 (en) * 2014-11-17 2016-05-19 Yahoo! Inc. System and method for large-scale multi-label learning using incomplete label assignments
CN107679531A (en) * 2017-06-23 2018-02-09 平安科技(深圳)有限公司 Licence plate recognition method, device, equipment and storage medium based on deep learning
CN108875722A (en) * 2017-12-27 2018-11-23 北京旷视科技有限公司 Character recognition and identification model training method, device and system and storage medium
US20190251369A1 (en) * 2018-02-11 2019-08-15 Ilya Popov License plate detection and recognition system
CN110866524A (en) * 2019-11-15 2020-03-06 北京字节跳动网络技术有限公司 License plate detection method, device, equipment and storage medium
CN112052850A (en) * 2020-09-03 2020-12-08 平安科技(深圳)有限公司 License plate recognition method and device, electronic equipment and storage medium
CN112560453A (en) * 2020-12-18 2021-03-26 平安银行股份有限公司 Voice information verification method and device, electronic equipment and medium
WO2021073266A1 (en) * 2019-10-18 2021-04-22 平安科技(深圳)有限公司 Image detection-based test question checking method and related device
CN112733911A (en) * 2020-12-31 2021-04-30 平安科技(深圳)有限公司 Entity recognition model training method, device, equipment and storage medium
US20210142093A1 (en) * 2019-11-08 2021-05-13 Tricentis Gmbh Method and system for single pass optical character recognition
CN112836748A (en) * 2021-02-02 2021-05-25 太原科技大学 Casting identification character recognition method based on CRNN-CTC
WO2021139342A1 (en) * 2020-07-27 2021-07-15 平安科技(深圳)有限公司 Training method and apparatus for ocr recognition model, and computer device
WO2021174839A1 (en) * 2020-03-06 2021-09-10 平安科技(深圳)有限公司 Data compression method and apparatus, and computer-readable storage medium
CN113392814A (en) * 2021-08-16 2021-09-14 冠传网络科技(南京)有限公司 Method and device for updating character recognition model and storage medium
CN113743415A (en) * 2021-08-05 2021-12-03 杭州远传新业科技有限公司 Method, system, electronic device and medium for identifying and correcting image text
CN113822428A (en) * 2021-08-06 2021-12-21 中国工商银行股份有限公司 Neural network training method and device and image segmentation method


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
刘正琼等: "基于字符编码与卷积神经网络的汉字识别", 电子测量与仪器学报, vol. 34, no. 02, 29 February 2020 (2020-02-29), pages 143 - 149 *
王逸铭等: "基于神经网络模型的扫描电镜图像字符识别方法", 制造业自动化, vol. 42, no. 07, 31 July 2020 (2020-07-31), pages 18 - 20 *

Also Published As

Publication number Publication date
CN114399766B (en) 2024-05-10

Similar Documents

Publication Publication Date Title
CN112052850A (en) License plate recognition method and device, electronic equipment and storage medium
CN112396005A (en) Biological characteristic image recognition method and device, electronic equipment and readable storage medium
CN113157927A (en) Text classification method and device, electronic equipment and readable storage medium
CN113961473A (en) Data testing method and device, electronic equipment and computer readable storage medium
CN113298159A (en) Target detection method and device, electronic equipment and storage medium
CN114491047A (en) Multi-label text classification method and device, electronic equipment and storage medium
CN113268665A (en) Information recommendation method, device and equipment based on random forest and storage medium
CN115205225A (en) Training method, device and equipment of medical image recognition model and storage medium
CN112541688B (en) Service data verification method and device, electronic equipment and computer storage medium
CN113627160A (en) Text error correction method and device, electronic equipment and storage medium
CN113658002A (en) Decision tree-based transaction result generation method and device, electronic equipment and medium
CN113434542A (en) Data relation identification method and device, electronic equipment and storage medium
CN113313211A (en) Text classification method and device, electronic equipment and storage medium
CN112269875A (en) Text classification method and device, electronic equipment and storage medium
CN112733551A (en) Text analysis method and device, electronic equipment and readable storage medium
CN112486957A (en) Database migration detection method, device, equipment and storage medium
CN115496166A (en) Multitasking method and device, electronic equipment and storage medium
CN113221888B (en) License plate number management system test method and device, electronic equipment and storage medium
CN114996386A (en) Business role identification method, device, equipment and storage medium
CN112580505B (en) Method and device for identifying network point switch door state, electronic equipment and storage medium
CN115544566A (en) Log desensitization method, device, equipment and storage medium
CN115203364A (en) Software fault feedback processing method, device, equipment and readable storage medium
CN115146064A (en) Intention recognition model optimization method, device, equipment and storage medium
CN114399766B (en) Optical character recognition model training method, device, equipment and medium
CN113515591A (en) Text bad information identification method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant