CN114417987A - Model training method, data identification method, device and equipment - Google Patents

Model training method, data identification method, device and equipment

Info

Publication number
CN114417987A
Authority
CN
China
Prior art keywords
model
training samples
training
character
trained
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210028772.0A
Other languages
Chinese (zh)
Inventor
王可
孟昌华
王维强
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alipay Hangzhou Information Technology Co Ltd
Original Assignee
Alipay Hangzhou Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alipay Hangzhou Information Technology Co Ltd filed Critical Alipay Hangzhou Information Technology Co Ltd
Priority to CN202210028772.0A
Publication of CN114417987A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/211 Selection of the most significant subset of features
    • G06F18/2113 Selection of the most significant subset of features by ranking or filtering the set of features, e.g. using a measure of variance or of feature cross-correlation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/62 Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218 Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Databases & Information Systems (AREA)
  • Evolutionary Biology (AREA)
  • Molecular Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Mathematical Physics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Bioethics (AREA)
  • Computer Hardware Design (AREA)
  • Computer Security & Cryptography (AREA)
  • Machine Translation (AREA)

Abstract

The embodiments of the specification provide a model training method, a data identification method, a device, and equipment. The method comprises the following steps: obtaining a plurality of training samples; inputting the plurality of training samples into a first model, and determining a first prediction probability that the character at each character position in the character sequence corresponding to each training sample is a preset character; inputting the plurality of training samples into a second model, and determining a second prediction probability that the character at each character position in the character sequence corresponding to each training sample is a preset character; selecting a first number of training samples from the plurality of training samples based on the first prediction probability and the label information of the training samples; selecting a second number of training samples from the plurality of training samples based on the second prediction probability and the label information of the training samples; and training the second model based on the first number of training samples and training the first model based on the second number of training samples, until the trained first model and/or the trained second model satisfies the corresponding convergence condition.

Description

Model training method, data identification method, device and equipment
Technical Field
The present disclosure relates to the field of computer technologies, and in particular to model training methods, data identification methods, devices, and equipment.
Background
Supervised learning often requires a large number of accurately labeled samples. In practice, however, labeling samples consumes considerable manpower, material, and financial resources, and the quality of the labels is affected to some extent by human subjectivity, so the labeled samples actually obtained may contain a certain proportion of label noise. For example, if the correct label of a verification code is 657I, the final English letter I may be mislabeled as the digit 1 during annotation, so that the verification code is labeled 6571. If samples carrying such label noise are used to train a neural network, the network may learn the information contained in the label noise, which interferes with the performance of the trained network and reduces its recognition accuracy. A technical solution that effectively improves model training accuracy and model performance is therefore needed.
Disclosure of Invention
An object of the embodiments of the present specification is to provide a model training method, a data identification method, a device, and equipment, so as to provide a technical solution that effectively improves model training accuracy and model performance.
To solve the above technical problem, the embodiments of the present specification are implemented as follows:
In a first aspect, an embodiment of the present specification provides a model training method, including: obtaining a plurality of training samples, each comprising a character sequence formed by a plurality of characters; inputting the plurality of training samples into a first model, predicting the character at each character position in the character sequence corresponding to each training sample, and determining a first prediction probability that the character at each character position is a preset character; inputting the plurality of training samples into a second model, likewise predicting the character at each character position and determining a second prediction probability that the character at each character position is a preset character, wherein the first model and the second model are derived from the same reference model; selecting a first number of training samples from the plurality of training samples based on the first prediction probability and the label information of the training samples; selecting a second number of training samples from the plurality of training samples based on the second prediction probability and the label information of the training samples; training the second model based on the first number of training samples, and training the first model based on the second number of training samples; and if the trained first model and/or the trained second model does not satisfy the corresponding convergence condition, re-executing the steps of obtaining a plurality of training samples and training the first model and the second model until the trained first model and/or the trained second model satisfies the corresponding convergence condition.
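The cross-selection and cross-update loop of this first aspect can be sketched in a few lines of PyTorch. The sketch below is illustrative only: the architecture (a single linear head standing in for any character-sequence recognizer), the toy batch, the keep_ratio governing how many small-loss samples are kept, and all other names are assumptions, not details fixed by the patent.

```python
import torch
import torch.nn as nn

SEQ_LEN, NUM_CLASSES, FEAT_DIM = 4, 36, 32  # e.g. 4-character code over [0-9A-Z]

class CharSeqModel(nn.Module):
    """Predicts, for each character position, a probability over preset characters."""
    def __init__(self):
        super().__init__()
        self.head = nn.Linear(FEAT_DIM, SEQ_LEN * NUM_CLASSES)

    def forward(self, x):                              # x: (batch, FEAT_DIM)
        return self.head(x).view(-1, SEQ_LEN, NUM_CLASSES)

def per_sample_loss(model, x, labels):
    """Mean cross-entropy over the character positions of each sample."""
    logits = model(x)                                  # (B, SEQ_LEN, NUM_CLASSES)
    ce = nn.functional.cross_entropy(
        logits.reshape(-1, NUM_CLASSES), labels.reshape(-1), reduction="none")
    return ce.view(-1, SEQ_LEN).mean(dim=1)            # (B,)

def select_small_loss(loss, keep_ratio):
    """Keep samples whose predictions best agree with their labels (likely clean)."""
    k = max(1, int(keep_ratio * loss.numel()))
    return torch.topk(loss, k, largest=False).indices

model_a, model_b = CharSeqModel(), CharSeqModel()      # derived from one reference model
opt_a = torch.optim.Adam(model_a.parameters(), lr=1e-3)
opt_b = torch.optim.Adam(model_b.parameters(), lr=1e-3)

for step in range(100):                                # stand-in for the convergence check
    x = torch.randn(64, FEAT_DIM)                      # toy batch of sample features
    y = torch.randint(0, NUM_CLASSES, (64, SEQ_LEN))   # per-position labels (possibly noisy)

    with torch.no_grad():                              # each model rates the whole batch
        idx_a = select_small_loss(per_sample_loss(model_a, x, y), keep_ratio=0.7)
        idx_b = select_small_loss(per_sample_loss(model_b, x, y), keep_ratio=0.7)

    # Cross-update: the first model's selection trains the second model, and vice versa.
    opt_b.zero_grad()
    per_sample_loss(model_b, x[idx_a], y[idx_a]).mean().backward()
    opt_b.step()

    opt_a.zero_grad()
    per_sample_loss(model_a, x[idx_b], y[idx_b]).mean().backward()
    opt_a.step()
```

The per-sample loss compares each model's per-position prediction probabilities with the label information, so the samples a model keeps are exactly those whose labels it currently finds most plausible; cross-feeding the selections is what limits each model's exposure to label noise.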
In a second aspect, an embodiment of the present specification provides a model training method, including: obtaining a plurality of training samples, each comprising a character sequence formed by a plurality of characters; inputting the plurality of training samples into a first model, predicting the character at each character position in the character sequence corresponding to each training sample, and determining a first prediction probability that the character at each character position is a preset character; selecting a first number of training samples from the plurality of training samples based on the first prediction probability and the label information of the training samples; and training the first model based on the first number of training samples, and if the trained first model does not satisfy the corresponding convergence condition, re-executing the steps of obtaining a plurality of training samples and training the first model until the trained first model satisfies the corresponding convergence condition.
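The second aspect is the single-model special case of the same mechanism: the model selects its own small-loss samples and is trained only on those. A minimal sketch under the same assumptions, reusing CharSeqModel, per_sample_loss, and select_small_loss from the sketch above:

```python
# Single-model variant (second aspect): self-selection instead of cross-selection.
model = CharSeqModel()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

for step in range(100):                                # stand-in for the convergence check
    x = torch.randn(64, FEAT_DIM)
    y = torch.randint(0, NUM_CLASSES, (64, SEQ_LEN))
    with torch.no_grad():
        idx = select_small_loss(per_sample_loss(model, x, y), keep_ratio=0.7)
    opt.zero_grad()
    per_sample_loss(model, x[idx], y[idx]).mean().backward()  # train on own selection
    opt.step()
```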
In a third aspect, an embodiment of the present specification provides a data identification method, including: acquiring data to be recognized, the data comprising a character sequence formed by a plurality of characters; and inputting the data to be recognized into a first model or a second model, and outputting the character sequence corresponding to the data, wherein the first model and the second model are each pre-trained on training samples. The training process of the first model and the second model comprises: obtaining a plurality of training samples, each comprising a character sequence formed by a plurality of characters; inputting the plurality of training samples into the first model, predicting the character at each character position in the character sequence corresponding to each training sample, and determining a first prediction probability that the character at each character position is a preset character; inputting the plurality of training samples into the second model and likewise determining a second prediction probability that the character at each character position is a preset character, wherein the first model and the second model are derived from the same reference model; selecting a first number of training samples from the plurality of training samples based on the first prediction probability and the label information of the training samples; selecting a second number of training samples from the plurality of training samples based on the second prediction probability and the label information of the training samples; training the second model based on the first number of training samples, and training the first model based on the second number of training samples; and if the trained first model and/or the trained second model does not satisfy the corresponding convergence condition, re-executing the steps of obtaining a plurality of training samples and training the first model and the second model until the trained first model and/or the trained second model satisfies the corresponding convergence condition. Alternatively, the training process comprises: obtaining a plurality of training samples, each comprising a character sequence formed by a plurality of characters; inputting the plurality of training samples into the first model or the second model, predicting the character at each character position in the character sequence corresponding to each training sample, and determining a first prediction probability that the character at each character position is a preset character; selecting a first number of training samples from the plurality of training samples based on the first prediction probability and the label information of the training samples; and training the first model or the second model based on the first number of training samples, and if the trained first model or second model does not satisfy the corresponding convergence condition, re-executing the steps of obtaining a plurality of training samples and training the first model or the second model until the trained first model or second model satisfies the corresponding convergence condition.
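For the identification step itself, a trained model is applied to the data to be recognized and the highest-probability preset character is taken at each character position. A minimal decoding sketch, again reusing the definitions from the first sketch; the CHARSET of preset characters is an assumed example:

```python
# Decode per-position prediction probabilities into a character sequence.
# Reuses CharSeqModel/model_a and FEAT_DIM from the training sketch above.
CHARSET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"  # assumed set of preset characters

def recognize(model, x):
    """Returns the decoded character sequence for each sample in x."""
    with torch.no_grad():
        probs = model(x).softmax(dim=-1)   # (batch, position, preset character)
        best = probs.argmax(dim=-1)        # most probable character per position
    return ["".join(CHARSET[c] for c in row) for row in best.tolist()]

# Usage with the cross-trained model; real inputs would be features of the
# data to be recognized, e.g. a verification-code image embedding.
print(recognize(model_a, torch.randn(2, FEAT_DIM)))
```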
In a fourth aspect, an embodiment of the present specification provides a model training method applied to a blockchain system, including: receiving model training rule information sent by a first device, generating a first smart contract based on the model training rule information, and deploying the first smart contract in the blockchain system; and, when a model training request sent by the first device is acquired, executing the following processing based on the first smart contract: obtaining a plurality of training samples, each comprising a character sequence formed by a plurality of characters; inputting the plurality of training samples into a first model pre-trained based on the first smart contract, predicting the character at each character position in the character sequence corresponding to each training sample, and determining a first prediction probability that the character at each character position is a preset character; inputting the plurality of training samples into a second model pre-trained based on the first smart contract and likewise determining a second prediction probability that the character at each character position is a preset character, wherein the first model and the second model are derived from the same reference model; selecting a first number of training samples from the plurality of training samples based on the first prediction probability and the label information of the training samples; selecting a second number of training samples from the plurality of training samples based on the second prediction probability and the label information of the training samples; training the second model based on the first number of training samples, and training the first model based on the second number of training samples; and if the trained first model and/or the trained second model does not satisfy the corresponding convergence condition, re-executing the steps of obtaining a plurality of training samples and training the first model and the second model until the trained first model and/or the trained second model satisfies the corresponding convergence condition.
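The patent does not fix a concrete blockchain or contract interface, so the sketch below only illustrates the deploy-then-dispatch shape of the fourth aspect: rule information arrives, a contract embodying it is generated and deployed, and a later training request is executed against that contract. Every name here (BlockchainSystem, make_training_contract, the rule fields) is a hypothetical stand-in.

```python
# Hypothetical stand-ins for the blockchain system and smart contracts; the
# first smart contract would wrap the cross-training loop sketched under the
# first aspect.
class BlockchainSystem:
    def __init__(self):
        self.contracts = {}

    def deploy(self, name, contract):
        self.contracts[name] = contract          # deploy a generated contract

    def request(self, name, payload):
        return self.contracts[name](payload)     # execute contract logic on request

def make_training_contract(rule_info):
    """Generate a contract from the received model training rule information."""
    def contract(request):
        # Here the first aspect's cross-training would run, parameterized by
        # rule_info, until the convergence condition in the rules is satisfied.
        return {"status": "training started", "rules": rule_info, "request": request}
    return contract

chain = BlockchainSystem()
chain.deploy("first_smart_contract", make_training_contract({"keep_ratio": 0.7}))
print(chain.request("first_smart_contract", {"action": "train"}))
```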
In a fifth aspect, an embodiment of the present specification provides a model training method applied to a blockchain system, including: receiving second model training rule information sent by the first device, generating a second smart contract based on the second model training rule information, and deploying the second smart contract in the blockchain system; and, when a model training request sent by the first device is acquired, executing the following processing based on the second smart contract: obtaining a plurality of training samples, each comprising a character sequence formed by a plurality of characters; inputting the plurality of training samples into a first model pre-trained based on the second smart contract, predicting the character at each character position in the character sequence corresponding to each training sample, and determining a first prediction probability that the character at each character position is a preset character; selecting a first number of training samples from the plurality of training samples based on the first prediction probability and the label information of the training samples; and training the first model based on the first number of training samples, and if the trained first model does not satisfy the corresponding convergence condition, re-executing the steps of obtaining a plurality of training samples and training the first model until the trained first model satisfies the corresponding convergence condition.
In a sixth aspect, an embodiment of the present specification provides a data identification method applied to a blockchain system, including: receiving data identification rule information sent by a second device, generating a third smart contract based on the data identification rule information, and deploying the third smart contract in the blockchain system; and, when a data identification request sent by the second device is acquired, executing the following processing based on the third smart contract: acquiring data to be recognized, the data comprising a character sequence formed by a plurality of characters; and inputting the data to be recognized into a first model or a second model, and outputting the character sequence corresponding to the data, wherein the first model and the second model are each pre-trained on training samples. The training process of the first model and the second model comprises: obtaining a plurality of training samples, each comprising a character sequence formed by a plurality of characters; inputting the plurality of training samples into the first model, predicting the character at each character position in the character sequence corresponding to each training sample, and determining a first prediction probability that the character at each character position is a preset character; inputting the plurality of training samples into the second model and likewise determining a second prediction probability that the character at each character position is a preset character, wherein the first model and the second model are derived from the same reference model; selecting a first number of training samples from the plurality of training samples based on the first prediction probability and the label information of the training samples; selecting a second number of training samples from the plurality of training samples based on the second prediction probability and the label information of the training samples; training the second model based on the first number of training samples, and training the first model based on the second number of training samples; and if the trained first model and/or the trained second model does not satisfy the corresponding convergence condition, re-executing the steps of obtaining a plurality of training samples and training the first model and the second model until the trained first model and/or the trained second model satisfies the corresponding convergence condition. Alternatively, the training process comprises: obtaining a plurality of training samples, each comprising a character sequence formed by a plurality of characters; inputting the plurality of training samples into the first model or the second model, predicting the character at each character position in the character sequence corresponding to each training sample, and determining a first prediction probability that the character at each character position is a preset character; selecting a first number of training samples from the plurality of training samples based on the first prediction probability and the label information of the training samples; and training the first model or the second model based on the first number of training samples, and if the trained first model or second model does not satisfy the corresponding convergence condition, re-executing the steps of obtaining a plurality of training samples and training the first model or the second model until the trained first model or second model satisfies the corresponding convergence condition.
In a seventh aspect, an embodiment of the present specification provides a model training apparatus, including: a first acquisition module that obtains a plurality of training samples, each comprising a character sequence formed by a plurality of characters; a first processing module that inputs the plurality of training samples into a first model, predicts the character at each character position in the character sequence corresponding to each training sample, and determines a first prediction probability that the character at each character position is a preset character, and that inputs the plurality of training samples into a second model and likewise determines a second prediction probability that the character at each character position is a preset character, wherein the first model and the second model are derived from the same reference model; a first selection module that selects a first number of training samples from the plurality of training samples based on the first prediction probability and the label information of the training samples, and selects a second number of training samples from the plurality of training samples based on the second prediction probability and the label information of the training samples; and a first training module that trains the second model based on the first number of training samples and trains the first model based on the second number of training samples, and that, if the trained first model and/or the trained second model does not satisfy the corresponding convergence condition, re-executes the steps of obtaining a plurality of training samples and training the first model and the second model until the trained first model and/or the trained second model satisfies the corresponding convergence condition.
In an eighth aspect, an embodiment of the present specification provides a model training apparatus, including: a second acquisition module that obtains a plurality of training samples, each comprising a character sequence formed by a plurality of characters; a second processing module that inputs the plurality of training samples into a first model, predicts the character at each character position in the character sequence corresponding to each training sample, and determines a first prediction probability that the character at each character position is a preset character; a second selection module that selects a first number of training samples from the plurality of training samples based on the first prediction probability and the label information of the training samples; and a second training module that trains the first model based on the first number of training samples and, if the trained first model does not satisfy the corresponding convergence condition, re-executes the steps of obtaining a plurality of training samples and training the first model until the trained first model satisfies the corresponding convergence condition.
In a ninth aspect, an embodiment of the present specification provides a data identification apparatus, including: a third acquisition module that acquires data to be recognized, the data comprising a character sequence formed by a plurality of characters; and an output module that inputs the data to be recognized into a first model or a second model and outputs the character sequence corresponding to the data, wherein the first model and the second model are each pre-trained on training samples. The training process of the first model and the second model comprises: obtaining a plurality of training samples, each comprising a character sequence formed by a plurality of characters; inputting the plurality of training samples into the first model, predicting the character at each character position in the character sequence corresponding to each training sample, and determining a first prediction probability that the character at each character position is a preset character; inputting the plurality of training samples into the second model and likewise determining a second prediction probability that the character at each character position is a preset character, wherein the first model and the second model are derived from the same reference model; selecting a first number of training samples from the plurality of training samples based on the first prediction probability and the label information of the training samples; selecting a second number of training samples from the plurality of training samples based on the second prediction probability and the label information of the training samples; training the second model based on the first number of training samples, and training the first model based on the second number of training samples; and if the trained first model and/or the trained second model does not satisfy the corresponding convergence condition, re-executing the steps of obtaining a plurality of training samples and training the first model and the second model until the trained first model and/or the trained second model satisfies the corresponding convergence condition. Alternatively, the training process comprises: obtaining a plurality of training samples, each comprising a character sequence formed by a plurality of characters; inputting the plurality of training samples into the first model or the second model, predicting the character at each character position in the character sequence corresponding to each training sample, and determining a first prediction probability that the character at each character position is a preset character; selecting a first number of training samples from the plurality of training samples based on the first prediction probability and the label information of the training samples; and training the first model or the second model based on the first number of training samples, and if the trained first model or second model does not satisfy the corresponding convergence condition, re-executing the steps of obtaining a plurality of training samples and training the first model or the second model until the trained first model or second model satisfies the corresponding convergence condition.
In a tenth aspect, an embodiment of the present specification provides a model training apparatus, the apparatus being an apparatus in a blockchain system and including: a first receiving module that receives model training rule information sent by a first device, generates a first smart contract based on the model training rule information, and deploys the first smart contract in the blockchain system; and a third processing module that, when a model training request sent by the first device is acquired, executes the following processing based on the first smart contract: obtaining a plurality of training samples, each comprising a character sequence formed by a plurality of characters; inputting the plurality of training samples into a first model pre-trained based on the first smart contract, predicting the character at each character position in the character sequence corresponding to each training sample, and determining a first prediction probability that the character at each character position is a preset character; inputting the plurality of training samples into a second model pre-trained based on the first smart contract and likewise determining a second prediction probability that the character at each character position is a preset character, wherein the first model and the second model are derived from the same reference model; selecting a first number of training samples from the plurality of training samples based on the first prediction probability and the label information of the training samples; selecting a second number of training samples from the plurality of training samples based on the second prediction probability and the label information of the training samples; training the second model based on the first number of training samples, and training the first model based on the second number of training samples; and if the trained first model and/or the trained second model does not satisfy the corresponding convergence condition, re-executing the steps of obtaining a plurality of training samples and training the first model and the second model until the trained first model and/or the trained second model satisfies the corresponding convergence condition.
In an eleventh aspect, an embodiment of the present specification provides a model training apparatus, the apparatus being an apparatus in a blockchain system and including: a second receiving module that receives second model training rule information sent by the first device, generates a second smart contract based on the second model training rule information, and deploys the second smart contract in the blockchain system; and a fourth processing module that, when a model training request sent by the first device is acquired, executes the following processing based on the second smart contract: obtaining a plurality of training samples, each comprising a character sequence formed by a plurality of characters; inputting the plurality of training samples into a first model pre-trained based on the second smart contract, predicting the character at each character position in the character sequence corresponding to each training sample, and determining a first prediction probability that the character at each character position is a preset character; selecting a first number of training samples from the plurality of training samples based on the first prediction probability and the label information of the training samples; and training the first model based on the first number of training samples, and if the trained first model does not satisfy the corresponding convergence condition, re-executing the steps of obtaining a plurality of training samples and training the first model until the trained first model satisfies the corresponding convergence condition.
In a twelfth aspect, an embodiment of the present specification provides a data identification apparatus, the apparatus being an apparatus in a blockchain system and including: a third receiving module that receives data identification rule information sent by a second device, generates a third smart contract based on the data identification rule information, and deploys the third smart contract in the blockchain system; and a fifth processing module that, when a data identification request sent by the second device is acquired, executes the following processing based on the third smart contract: acquiring data to be recognized, the data comprising a character sequence formed by a plurality of characters; and inputting the data to be recognized into a first model or a second model, and outputting the character sequence corresponding to the data, wherein the first model and the second model are each pre-trained on training samples. The training process of the first model and the second model comprises: obtaining a plurality of training samples, each comprising a character sequence formed by a plurality of characters; inputting the plurality of training samples into the first model, predicting the character at each character position in the character sequence corresponding to each training sample, and determining a first prediction probability that the character at each character position is a preset character; inputting the plurality of training samples into the second model and likewise determining a second prediction probability that the character at each character position is a preset character, wherein the first model and the second model are derived from the same reference model; selecting a first number of training samples from the plurality of training samples based on the first prediction probability and the label information of the training samples; selecting a second number of training samples from the plurality of training samples based on the second prediction probability and the label information of the training samples; training the second model based on the first number of training samples, and training the first model based on the second number of training samples; and if the trained first model and/or the trained second model does not satisfy the corresponding convergence condition, re-executing the steps of obtaining a plurality of training samples and training the first model and the second model until the trained first model and/or the trained second model satisfies the corresponding convergence condition. Alternatively, the training process comprises: obtaining a plurality of training samples, each comprising a character sequence formed by a plurality of characters; inputting the plurality of training samples into the first model or the second model, predicting the character at each character position in the character sequence corresponding to each training sample, and determining a first prediction probability that the character at each character position is a preset character; selecting a first number of training samples from the plurality of training samples based on the first prediction probability and the label information of the training samples; and training the first model or the second model based on the first number of training samples, and if the trained first model or second model does not satisfy the corresponding convergence condition, re-executing the steps of obtaining a plurality of training samples and training the first model or the second model until the trained first model or second model satisfies the corresponding convergence condition.
In a thirteenth aspect, an embodiment of the present specification provides a model training device, including: a processor; and a memory arranged to store computer-executable instructions that, when executed, cause the processor to perform the following process: obtaining a plurality of training samples, each comprising a character sequence formed by a plurality of characters; inputting the plurality of training samples into a first model, predicting the character at each character position in the character sequence corresponding to each training sample, and determining a first prediction probability that the character at each character position is a preset character; inputting the plurality of training samples into a second model and likewise determining a second prediction probability that the character at each character position is a preset character, wherein the first model and the second model are derived from the same reference model; selecting a first number of training samples from the plurality of training samples based on the first prediction probability and the label information of the training samples; selecting a second number of training samples from the plurality of training samples based on the second prediction probability and the label information of the training samples; training the second model based on the first number of training samples, and training the first model based on the second number of training samples; and if the trained first model and/or the trained second model does not satisfy the corresponding convergence condition, re-executing the steps of obtaining a plurality of training samples and training the first model and the second model until the trained first model and/or the trained second model satisfies the corresponding convergence condition.
In a fourteenth aspect, an embodiment of the present specification provides a model training device, including: a processor; and a memory arranged to store computer-executable instructions that, when executed, cause the processor to perform the following process: obtaining a plurality of training samples, each comprising a character sequence formed by a plurality of characters; inputting the plurality of training samples into a first model, predicting the character at each character position in the character sequence corresponding to each training sample, and determining a first prediction probability that the character at each character position is a preset character; selecting a first number of training samples from the plurality of training samples based on the first prediction probability and the label information of the training samples; and training the first model based on the first number of training samples, and if the trained first model does not satisfy the corresponding convergence condition, re-executing the steps of obtaining a plurality of training samples and training the first model until the trained first model satisfies the corresponding convergence condition.
In a fifteenth aspect, an embodiment of the present specification provides a data identification device, including: a processor; and a memory arranged to store computer-executable instructions that, when executed, cause the processor to perform the following process: acquiring data to be recognized, the data comprising a character sequence formed by a plurality of characters; and inputting the data to be recognized into a first model or a second model, and outputting the character sequence corresponding to the data, wherein the first model and the second model are each pre-trained on training samples. The training process of the first model and the second model comprises: obtaining a plurality of training samples, each comprising a character sequence formed by a plurality of characters; inputting the plurality of training samples into the first model, predicting the character at each character position in the character sequence corresponding to each training sample, and determining a first prediction probability that the character at each character position is a preset character; inputting the plurality of training samples into the second model and likewise determining a second prediction probability that the character at each character position is a preset character, wherein the first model and the second model are derived from the same reference model; selecting a first number of training samples from the plurality of training samples based on the first prediction probability and the label information of the training samples; selecting a second number of training samples from the plurality of training samples based on the second prediction probability and the label information of the training samples; training the second model based on the first number of training samples, and training the first model based on the second number of training samples; and if the trained first model and/or the trained second model does not satisfy the corresponding convergence condition, re-executing the steps of obtaining a plurality of training samples and training the first model and the second model until the trained first model and/or the trained second model satisfies the corresponding convergence condition. Alternatively, the training process comprises: obtaining a plurality of training samples, each comprising a character sequence formed by a plurality of characters; inputting the plurality of training samples into the first model or the second model, predicting the character at each character position in the character sequence corresponding to each training sample, and determining a first prediction probability that the character at each character position is a preset character; selecting a first number of training samples from the plurality of training samples based on the first prediction probability and the label information of the training samples; and training the first model or the second model based on the first number of training samples, and if the trained first model or second model does not satisfy the corresponding convergence condition, re-executing the steps of obtaining a plurality of training samples and training the first model or the second model until the trained first model or second model satisfies the corresponding convergence condition.
In a sixteenth aspect, an embodiment of the present specification provides a model training device, the device being a device in a blockchain system and including: a processor; and a memory arranged to store computer-executable instructions that, when executed, cause the processor to perform the following process: receiving model training rule information sent by a first device, generating a first smart contract based on the model training rule information, and deploying the first smart contract in the blockchain system; and, when a model training request sent by the first device is acquired, executing the following processing based on the first smart contract: obtaining a plurality of training samples, each comprising a character sequence formed by a plurality of characters; inputting the plurality of training samples into a first model pre-trained based on the first smart contract, predicting the character at each character position in the character sequence corresponding to each training sample, and determining a first prediction probability that the character at each character position is a preset character; inputting the plurality of training samples into a second model pre-trained based on the first smart contract and likewise determining a second prediction probability that the character at each character position is a preset character, wherein the first model and the second model are derived from the same reference model; selecting a first number of training samples from the plurality of training samples based on the first prediction probability and the label information of the training samples; selecting a second number of training samples from the plurality of training samples based on the second prediction probability and the label information of the training samples; training the second model based on the first number of training samples, and training the first model based on the second number of training samples; and if the trained first model and/or the trained second model does not satisfy the corresponding convergence condition, re-executing the steps of obtaining a plurality of training samples and training the first model and the second model until the trained first model and/or the trained second model satisfies the corresponding convergence condition.
In a seventeenth aspect, an embodiment of the present specification provides a model training device, the device being a device in a blockchain system and including: a processor; and a memory arranged to store computer-executable instructions that, when executed, cause the processor to perform the following process: receiving second model training rule information sent by the first device, generating a second smart contract based on the second model training rule information, and deploying the second smart contract in the blockchain system; and, when a model training request sent by the first device is acquired, executing the following processing based on the second smart contract: obtaining a plurality of training samples, each comprising a character sequence formed by a plurality of characters; inputting the plurality of training samples into a first model pre-trained based on the second smart contract, predicting the character at each character position in the character sequence corresponding to each training sample, and determining a first prediction probability that the character at each character position is a preset character; selecting a first number of training samples from the plurality of training samples based on the first prediction probability and the label information of the training samples; and training the first model based on the first number of training samples, and if the trained first model does not satisfy the corresponding convergence condition, re-executing the steps of obtaining a plurality of training samples and training the first model until the trained first model satisfies the corresponding convergence condition.
In an eighteenth aspect, an embodiment of the present specification provides a data identification device, the device being a device in a blockchain system and including: a processor; and a memory arranged to store computer-executable instructions that, when executed, cause the processor to perform the following process: receiving data identification rule information sent by a second device, generating a third smart contract based on the data identification rule information, and deploying the third smart contract in the blockchain system; and, when a data identification request sent by the second device is acquired, executing the following processing based on the third smart contract: acquiring data to be recognized, the data comprising a character sequence formed by a plurality of characters; and inputting the data to be recognized into a first model or a second model, and outputting the character sequence corresponding to the data, wherein the first model and the second model are each pre-trained on training samples. The training process of the first model and the second model comprises: obtaining a plurality of training samples, each comprising a character sequence formed by a plurality of characters; inputting the plurality of training samples into the first model, predicting the character at each character position in the character sequence corresponding to each training sample, and determining a first prediction probability that the character at each character position is a preset character; inputting the plurality of training samples into the second model and likewise determining a second prediction probability that the character at each character position is a preset character, wherein the first model and the second model are derived from the same reference model; selecting a first number of training samples from the plurality of training samples based on the first prediction probability and the label information of the training samples; selecting a second number of training samples from the plurality of training samples based on the second prediction probability and the label information of the training samples; training the second model based on the first number of training samples, and training the first model based on the second number of training samples; and if the trained first model and/or the trained second model does not satisfy the corresponding convergence condition, re-executing the steps of obtaining a plurality of training samples and training the first model and the second model until the trained first model and/or the trained second model satisfies the corresponding convergence condition. Alternatively, the training process comprises: obtaining a plurality of training samples, each comprising a character sequence formed by a plurality of characters; inputting the plurality of training samples into the first model or the second model, predicting the character at each character position in the character sequence corresponding to each training sample, and determining a first prediction probability that the character at each character position is a preset character; selecting a first number of training samples from the plurality of training samples based on the first prediction probability and the label information of the training samples; and training the first model or the second model based on the first number of training samples, and if the trained first model or second model does not satisfy the corresponding convergence condition, re-executing the steps of obtaining a plurality of training samples and training the first model or the second model until the trained first model or second model satisfies the corresponding convergence condition.
In a nineteenth aspect, an embodiment of the present specification provides a storage medium for storing computer-executable instructions that, when executed, implement the following process: obtaining a plurality of training samples, each comprising a character sequence formed by a plurality of characters; inputting the plurality of training samples into a first model, predicting the character at each character position in the character sequence corresponding to each training sample, and determining a first prediction probability that the character at each character position is a preset character; inputting the plurality of training samples into a second model and likewise determining a second prediction probability that the character at each character position is a preset character, wherein the first model and the second model are derived from the same reference model; selecting a first number of training samples from the plurality of training samples based on the first prediction probability and the label information of the training samples; selecting a second number of training samples from the plurality of training samples based on the second prediction probability and the label information of the training samples; training the second model based on the first number of training samples, and training the first model based on the second number of training samples; and if the trained first model and/or the trained second model does not satisfy the corresponding convergence condition, re-executing the steps of obtaining a plurality of training samples and training the first model and the second model until the trained first model and/or the trained second model satisfies the corresponding convergence condition.
In a twentieth aspect, an embodiment of the present specification provides a storage medium for storing computer-executable instructions that, when executed, implement the following process: obtaining a plurality of training samples, each comprising a character sequence formed by a plurality of characters; inputting the plurality of training samples into a first model, predicting the character at each character position in the character sequence corresponding to each training sample, and determining a first prediction probability that the character at each character position is a preset character; selecting a first number of training samples from the plurality of training samples based on the first prediction probability and the label information of the training samples; and training the first model based on the first number of training samples, and if the trained first model does not satisfy the corresponding convergence condition, re-executing the steps of obtaining a plurality of training samples and training the first model until the trained first model satisfies the corresponding convergence condition.
In a twenty-first aspect, an embodiment of the present specification provides a storage medium for storing computer-executable instructions that, when executed, implement the following process: acquiring data to be recognized, the data comprising a character sequence formed by a plurality of characters; and inputting the data to be recognized into a first model or a second model, and outputting the character sequence corresponding to the data, wherein the first model and the second model are each pre-trained on training samples. The training process of the first model and the second model comprises: obtaining a plurality of training samples, each comprising a character sequence formed by a plurality of characters; inputting the plurality of training samples into the first model, predicting the character at each character position in the character sequence corresponding to each training sample, and determining a first prediction probability that the character at each character position is a preset character; inputting the plurality of training samples into the second model and likewise determining a second prediction probability that the character at each character position is a preset character, wherein the first model and the second model are derived from the same reference model; selecting a first number of training samples from the plurality of training samples based on the first prediction probability and the label information of the training samples; selecting a second number of training samples from the plurality of training samples based on the second prediction probability and the label information of the training samples; training the second model based on the first number of training samples, and training the first model based on the second number of training samples; and if the trained first model and/or the trained second model does not satisfy the corresponding convergence condition, re-executing the steps of obtaining a plurality of training samples and training the first model and the second model until the trained first model and/or the trained second model satisfies the corresponding convergence condition. Alternatively, the training process comprises: obtaining a plurality of training samples, each comprising a character sequence formed by a plurality of characters; inputting the plurality of training samples into the first model or the second model, predicting the character at each character position in the character sequence corresponding to each training sample, and determining a first prediction probability that the character at each character position is a preset character; selecting a first number of training samples from the plurality of training samples based on the first prediction probability and the label information of the training samples; and training the first model or the second model based on the first number of training samples, and if the trained first model or second model does not satisfy the corresponding convergence condition, re-executing the steps of obtaining a plurality of training samples and training the first model or the second model until the trained first model or second model satisfies the corresponding convergence condition.
In a twenty-second aspect, embodiments of the present specification provide a storage medium for storing computer-executable instructions, which when executed implement the following: receiving model training rule information sent by first equipment, generating a first intelligent contract based on the model training rule information, and deploying the first intelligent contract in the block chain system. When a model training request sent by the first device is acquired, executing the following processing based on the first intelligent contract: obtaining a plurality of training samples, wherein the training samples comprise a character sequence formed by a plurality of characters. Inputting the training samples into a first model pre-trained based on the first intelligent contract, predicting characters of each character bit in the character sequence corresponding to the training samples, and determining a first prediction probability that the characters of each character bit in the character sequence corresponding to the training samples are preset characters. Inputting the training samples into a second model pre-trained based on the first intelligent contract, predicting the character of each character position in the character sequence corresponding to the training samples, and determining a second prediction probability that the character of each character position in the character sequence corresponding to the training samples is a preset character, wherein the first model and the second model are determined by the same reference model. Selecting a first number of training samples from the plurality of training samples based on the first prediction probability and the label information of the training samples. And selecting a second number of training samples from the plurality of training samples based on the second prediction probability and the label information of the training samples. Training the second model based on the first number of training samples, and training the first model based on the second number of training samples. And if the trained first model and/or the trained second model do not meet the corresponding convergence condition, re-executing the steps of obtaining a plurality of training samples and training the first model and the second model until the trained first model and/or the trained second model meet the corresponding convergence condition.
In a twenty-third aspect, embodiments of the present specification provide a storage medium for storing computer-executable instructions, which when executed implement the following flow: receiving second model training rule information sent by the first equipment, generating a second intelligent contract based on the second model training rule information, and deploying the second intelligent contract in the blockchain system. When a model training request sent by the first device is acquired, executing the following processing based on the second intelligent contract: obtaining a plurality of training samples, wherein the training samples comprise a character sequence formed by a plurality of characters. Inputting the training samples into a first model pre-trained based on the second intelligent contract, predicting characters of each character bit in the character sequence corresponding to the training samples, and determining a first prediction probability that the characters of each character bit in the character sequence corresponding to the training samples are preset characters. Selecting a first number of training samples from the plurality of training samples based on the first prediction probability and the label information of the training samples. And training the first model based on the first number of training samples, and if the trained first model does not meet the corresponding convergence condition, re-executing the steps of obtaining a plurality of training samples and training the first model until the trained first model meets the corresponding convergence condition.
In a twenty-fourth aspect, embodiments of the present specification provide a storage medium for storing computer-executable instructions, which when executed implement the following flow: receiving data identification rule information sent by second equipment, generating a third intelligent contract based on the data identification rule information, and deploying the third intelligent contract in the blockchain system. When a data identification request sent by the second device is acquired, executing the following processing based on the third intelligent contract: acquiring data to be recognized, wherein the data comprises a character sequence formed by a plurality of characters. And inputting the data to be recognized into a first model or a second model, and outputting a character sequence corresponding to the data, wherein the first model is a model pre-trained through a training sample, and the second model is a model pre-trained through the training sample. The training process of the first model and the second model comprises the following steps: obtaining a plurality of training samples, wherein the training samples comprise a character sequence formed by a plurality of characters. Inputting the training samples into a first model, predicting the character of each character bit in the character sequence corresponding to the training samples, and determining a first prediction probability that the character of each character bit in the character sequence corresponding to the training samples is a preset character. And inputting the training samples into a second model, predicting the character of each character position in the character sequence corresponding to the training samples, and determining a second prediction probability that the character of each character position in the character sequence corresponding to the training samples is a preset character, wherein the first model and the second model are determined by the same reference model. Selecting a first number of training samples from the plurality of training samples based on the first prediction probability and the label information of the training samples. And selecting a second number of training samples from the plurality of training samples based on the second prediction probability and the label information of the training samples. Training the second model based on the first number of training samples, and training the first model based on the second number of training samples. And if the trained first model and/or the trained second model do not meet the corresponding convergence condition, re-executing the steps of obtaining a plurality of training samples and training the first model and the second model until the trained first model and/or the trained second model meet the corresponding convergence condition. Or obtaining a plurality of training samples, wherein the training samples comprise a character sequence formed by a plurality of characters. Inputting the training samples into a first model or a second model, predicting characters of each character bit in the character sequence corresponding to the training samples, and determining a first prediction probability that the characters of each character bit in the character sequence corresponding to the training samples are preset characters. Selecting a first number of training samples from the plurality of training samples based on the first prediction probability and the label information of the training samples. 
And training the first model or the second model based on the first number of training samples, and if the trained first model or second model does not meet the corresponding convergence condition, re-executing the steps of obtaining a plurality of training samples and training the first model or second model until the trained first model or second model meets the corresponding convergence condition.
Drawings
In order to more clearly illustrate the embodiments of the present specification or the technical solutions in the prior art, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description are only some of the embodiments described in the present specification; for those skilled in the art, other drawings can be obtained from these drawings without any creative effort.
FIG. 1A is a first schematic flow chart of a model training method provided in an embodiment of the present disclosure;
FIG. 1B is a first schematic diagram of a model training process provided in an embodiment of the present disclosure;
FIG. 2A is a second schematic flow chart of a model training method provided in the embodiments of the present disclosure;
FIG. 2B is a second schematic diagram of a model training process provided in the embodiments of the present disclosure;
FIG. 3A is a third schematic flow chart of a data identification method provided in an embodiment of the present disclosure;
FIG. 3B is a third schematic diagram of a data identification process provided in the embodiments of the present disclosure;
FIG. 4A is a fourth flowchart illustrating a model training method provided in an embodiment of the present disclosure;
FIG. 4B is a fourth schematic diagram of a model training process provided in the embodiments of the present disclosure;
FIG. 5A is a fifth schematic flow chart of a model training method provided in an embodiment of the present disclosure;
FIG. 5B is a fifth schematic diagram of a model training process provided in the embodiments of the present disclosure;
FIG. 6A is a sixth schematic flow chart of a data identification method provided in an embodiment of the present disclosure;
FIG. 6B is a sixth schematic diagram of a data identification process provided in the embodiments of the present disclosure;
FIG. 7 is a schematic diagram illustrating a first module of a model training apparatus according to an embodiment of the present disclosure;
FIG. 8 is a schematic diagram illustrating a second module of a model training apparatus according to an embodiment of the present disclosure;
FIG. 9 is a schematic diagram illustrating a first module of a data recognition device according to an embodiment of the present disclosure;
FIG. 10 is a schematic diagram illustrating a third module of a model training apparatus according to an embodiment of the present disclosure;
FIG. 11 is a schematic diagram illustrating a fourth module of a model training apparatus according to an embodiment of the present disclosure;
FIG. 12 is a schematic diagram illustrating a second module of a data recognition device according to an embodiment of the present disclosure;
FIG. 13 is a schematic structural diagram of a model training apparatus provided in an embodiment of this specification.
Detailed Description
The embodiments of the specification provide a model training method, a data identification method, and corresponding apparatuses and devices.
In order to make those skilled in the art better understand the technical solutions in the present specification, the technical solutions in the embodiments of the present specification will be clearly and completely described below with reference to the drawings in the embodiments. It is apparent that the described embodiments are only a part of the embodiments of the present specification, not all of them. All other embodiments obtained by a person skilled in the art based on the embodiments in the present specification without any inventive step shall fall within the scope of protection of the present specification.
As shown in FIG. 1A and FIG. 1B, an embodiment of the present specification provides a model training method. The execution subject of the method may be a server, where the server may be an independent server or a server cluster composed of multiple servers, and the server may train a first model to be trained based on a plurality of acquired training samples.
The method may specifically comprise the steps of:
in step S102, a plurality of training samples are obtained, where the training samples include a character sequence composed of a plurality of characters.
As an example, the training sample may be an image sample, an audio sample, or the like. The training sample may include noise data unrelated to the character sequence, where the noise data may include one or more of Chinese characters, letters, numbers, symbols, graphics, and lines. The training sample further includes a character sequence formed by a plurality of characters; the character sequence may be a verification code, or another character sequence used for identity verification, and the characters forming it may include one or more of Chinese characters, letters, numbers, symbols, and figures. The specific representation form of the training sample, the specific content of the character sequence, and the specific representation form of the characters forming the character sequence are not specifically limited in this specification.
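For concreteness, the following minimal Python sketch shows one way such a training sample might be represented in code; the class, field names, and image shape are illustrative assumptions for this example, not part of the specification.

```python
# Illustrative only: a verification-code-style training sample whose label
# text may itself carry label noise.
from dataclasses import dataclass
import numpy as np

@dataclass
class TrainingSample:
    image: np.ndarray  # e.g. an H x W x C captcha image, possibly containing noise lines
    label: str         # annotated character sequence, e.g. "657I"

sample = TrainingSample(image=np.zeros((60, 160, 3), dtype=np.uint8), label="657I")
```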
In practice, supervised learning often requires a large number of accurately labeled samples. However, labeling samples consumes considerable manpower, material, and financial resources, and the quality of the labels is affected to some extent by human subjectivity, so the labeled samples actually obtained may contain a certain proportion of label noise. If a neural network is trained on samples carrying label noise, it may learn the information in that noise, which disturbs the performance of the trained network and reduces its recognition accuracy. A technical solution that effectively improves model training accuracy and model performance is therefore needed.
In an optional implementation manner, the execution subject may obtain a plurality of training samples through a preset interface, or the execution subject may obtain a plurality of training samples and label information corresponding to the training samples through a preset interface.
Some label noise may exist in the training sample data used to train a neural network. For example, if the correct label information of a certain verification code is 657I, the last English letter I may well be incorrectly labeled as the number 1 during annotation, so that the verification code is labeled as 6571; the label 6571 can then be regarded as label noise, or a noise label. If a large amount of such label noise is contained in the training samples, the performance of the trained neural network is disturbed and its recognition accuracy is affected. Therefore, after the execution subject obtains a plurality of training samples through the processing of step S102, the training samples may be screened: training samples without label noise are selected from the obtained training samples, and a model is trained with the screened samples, thereby improving model training accuracy and model performance. For the specific process, refer to the specific implementation of steps S104 to S108.
In step S104, inputting a plurality of training samples into the first model, predicting the character of each character bit in the character sequence corresponding to the training samples, and determining a first prediction probability that the character of each character bit in the character sequence corresponding to the training samples is a preset character; and inputting a plurality of training samples into a second model, predicting the character of each character position in the character sequence corresponding to the training samples, and determining a second prediction probability that the character of each character position in the character sequence corresponding to the training samples is a preset character, wherein the first model and the second model are determined by the same reference model.
As an example, the first model and the second model may be two models determined based on a co-training mechanism, and the initial model parameters of the first model and the second model may be the same. If the characters are digits, the preset character may be any digit from 0 to 9; if the characters are letters, the preset character may be any letter from a to z. The reference model may be a model constructed based on one or more different preset neural network algorithms; the specific contents of the preset characters and the reference model are not specifically limited in the embodiments of the present specification.
In an alternative implementation manner, take a training sample that is a verification code containing a sequence of 4 digits. Assuming the character bits are determined in left-to-right order, they may be recorded as the first, second, third, and fourth character bits. When the training sample is input into the first model and the character of each character bit in the corresponding character sequence is predicted, the model yields, for each of the four character bits in turn, 10 first prediction probabilities, one for each preset character from 0 to 9. Similarly, when the training sample is input into the second model, the second prediction probabilities corresponding to each character bit can be obtained in turn.
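As a hedged illustration of this per-character-bit prediction, the sketch below builds a small PyTorch model with one classification head per character bit, each emitting a probability distribution over the 10 preset digit characters. The tiny convolutional backbone and all layer sizes are assumptions made for the example, not the model of this specification.

```python
import torch
import torch.nn as nn

class MultiPositionClassifier(nn.Module):
    """One softmax head per character bit: output shape (batch, positions, classes)."""
    def __init__(self, num_positions: int = 4, num_classes: int = 10):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d((4, 4)), nn.Flatten(),
        )
        self.heads = nn.ModuleList(
            [nn.Linear(32 * 4 * 4, num_classes) for _ in range(num_positions)]
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        feat = self.backbone(x)
        # Per-character-bit prediction probabilities (first or second, depending
        # on which of the two models this instance plays).
        return torch.stack([head(feat).softmax(dim=-1) for head in self.heads], dim=1)
```

Since the first and second models are determined by the same reference model with the same initial parameters, one way to set them up is to instantiate two copies and align their weights, e.g. `second_model.load_state_dict(first_model.state_dict())`.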
In step S106, a first number of training samples are selected from the plurality of training samples based on the first prediction probability and the label information of the training samples; and selecting a second number of training samples from the plurality of training samples based on the second prediction probability and the label information of the training samples.
In an optional implementation manner, a real character corresponding to each character bit in a character sequence corresponding to a training sample may be determined based on label information of the training sample, then, a first prediction probability corresponding to a preset character when the preset character is a real character may be determined from the first prediction probabilities corresponding to the character bits, then, training samples, of which the determined first prediction probabilities are all greater than a first preset threshold value, are selected from the plurality of training samples, and the selected training samples are used as a first number of training samples. Similarly, a second number of training samples may be selected from the plurality of training samples in the same manner.
In an optional implementation manner, take a training sample that is a verification code containing 4 digits and whose true label is 6571, and assume the first preset threshold is 60%. If the first prediction probability that the first character bit is 6 is 90%, the first prediction probability that the second character bit is 5 is 80%, the first prediction probability that the third character bit is 7 is 60%, and the first prediction probability that the fourth character bit is 1 is 70%, then the first prediction probabilities at all four character bits reach the preset threshold, so the training sample can be determined not to be a noise sample and may be selected for model training of the second model. Conversely, if the first prediction probability of the labelled character at any of the 4 character bits is less than 60%, the training sample may be regarded as a noise sample and eliminated, and it will not participate in model training.
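A minimal sketch of this screening rule, assuming the (batch, positions, classes) probability layout from the previous sketch and integer-encoded labels; the 0.6 threshold mirrors the example above.

```python
import torch

def select_clean_samples(pred_probs: torch.Tensor, labels: torch.Tensor,
                         threshold: float = 0.6) -> torch.Tensor:
    """pred_probs: (batch, positions, classes); labels: (batch, positions) class ids.
    Keep a sample only if, at every character bit, the probability the model
    assigns to the labelled character reaches the threshold."""
    label_probs = pred_probs.gather(-1, labels.unsqueeze(-1)).squeeze(-1)  # (batch, positions)
    return (label_probs >= threshold).all(dim=1)  # boolean keep-mask over the batch
```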
In step S108, training the second model based on the first number of training samples, and training the first model based on the second number of training samples; and if the trained first model and/or the trained second model do not meet the corresponding convergence condition, re-executing the steps of obtaining a plurality of training samples and training the first model and the second model until the trained first model and/or the trained second model meet the corresponding convergence condition.
In an optional implementation manner, the executing entity may input the first number of training samples into the second model to obtain a first number of prediction labels corresponding to the training samples, may then determine loss information corresponding to the second model based on loss information between the prediction labels corresponding to the training samples and their corresponding real labels, and train the second model based on the loss information, and if the loss information of the trained second model does not satisfy the corresponding convergence condition, re-execute the steps of obtaining a plurality of training samples and training the second model until the trained second model satisfies the corresponding convergence condition. Similarly, the executing entity may input the second number of training samples into the first model to obtain a second number of prediction labels corresponding to the training samples, then may determine loss information corresponding to the first model based on loss information between the prediction labels corresponding to the training samples and the corresponding real labels, train the first model based on the loss information, and if the loss information of the trained first model does not satisfy the corresponding convergence condition, re-execute the steps of obtaining a plurality of training samples and training the first model until the trained first model satisfies the corresponding convergence condition.
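The cross-update described here resembles a co-teaching scheme, in keeping with the co-training mechanism mentioned above. The following sketch puts the pieces together under the assumptions of the earlier sketches (softmax outputs, `select_clean_samples`); the loss and optimizer choices are assumptions, not the specification's exact procedure.

```python
import torch
import torch.nn.functional as F

def co_teaching_epoch(model_a, model_b, opt_a, opt_b, loader, threshold=0.6):
    """Samples that look clean to model A train model B, and vice versa."""
    for images, labels in loader:  # labels: (batch, positions) long tensor
        with torch.no_grad():
            keep_a = select_clean_samples(model_a(images), labels, threshold)
            keep_b = select_clean_samples(model_b(images), labels, threshold)

        if keep_a.any():  # train the second model on the first model's selection
            probs_b = model_b(images[keep_a])
            loss_b = F.nll_loss(probs_b.log().flatten(0, 1), labels[keep_a].flatten())
            opt_b.zero_grad(); loss_b.backward(); opt_b.step()

        if keep_b.any():  # train the first model on the second model's selection
            probs_a = model_a(images[keep_b])
            loss_a = F.nll_loss(probs_a.log().flatten(0, 1), labels[keep_b].flatten())
            opt_a.zero_grad(); loss_a.backward(); opt_a.step()
```

An outer loop would repeat such epochs until the convergence condition of each model is met, matching the re-execution described above.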
As can be seen from the technical solutions provided in the embodiments of the present specification, the same training samples are respectively input into the first model and the second model, two identical models determined by the same reference model. The first model selects a first number of training samples from the input samples, and the second model selects a second number; the second model is then trained based on the first number of training samples and the first model based on the second number, until the trained first model and/or the trained second model satisfy the corresponding convergence condition. Because the first number of training samples and the second number of training samples are both samples selected from the plurality of training samples that are unlikely to carry noise, training the first model and the second model on these selected samples effectively improves model training accuracy and model performance. In addition, because the first number of training samples selected by the first model is used for training the second model, and the second number of training samples selected by the second model is used for training the first model, each model is trained on samples selected by the other model. This further improves training accuracy and model performance, and effectively avoids the problem that, when a model is trained on training samples it selected itself, noise samples may remain in the selection and reduce training accuracy and model performance.
Further, step S106 may be implemented in various manners. An alternative processing manner is provided below; see the specific processing procedures of steps A2 to A6.
In step A2, the first character of each character bit in the character sequence corresponding to the training sample is determined based on the label information of the training sample.
As an example, the first character may be the real character corresponding to the character bit. Taking the above training sample, a verification code containing a sequence of 4 digits with a first, second, third, and fourth character bit, and assuming the label information corresponding to the training sample is 6571: for the first character bit, the first character may be the digit 6; for the second character bit, the digit 5; for the third character bit, the digit 7; and for the fourth character bit, the digit 1.
In step A4, a confidence of the character sequence corresponding to the training sample is determined based on the first character of each character bit in the character sequence and the first prediction probability, where the confidence is used to characterize the accuracy of the label information of the training sample.
In an optional implementation manner, the execution subject may determine, from the first prediction probabilities that each character bit is a preset character, the first prediction probability corresponding to the first character, then sum the first prediction probabilities that the character of each character bit is the first character to obtain a first result corresponding to the training sample, and determine the first result as the confidence of the character sequence corresponding to the training sample. As an example, take the training sample that is a verification code containing 4 digits with true label 6571. Assuming the execution subject determines that the first prediction probability that the first character of the first character bit is 6 is 90%, that the first character of the second character bit is 5 is 80%, that the first character of the third character bit is 7 is 60%, and that the first character of the fourth character bit is 1 is 70%, summing these first prediction probabilities gives the first result, that is, 300%.
Or, in another alternative implementation, the execution subject may determine a first prediction probability corresponding to the first character based on the first character and a first prediction probability that the character bit is a preset character, and then may determine an average probability value of the first prediction probabilities that the character bits are the first character based on the first prediction probability that the character of each character bit corresponding to the training sample is the first character, and determine the average probability value as a confidence of the character sequence corresponding to the training sample.
In an alternative implementation manner, after determining the confidence of the character sequence corresponding to each training sample through the processing in step A4, the execution subject may select, based on the confidence, training samples whose confidence is greater than a preset confidence threshold from the plurality of training samples, and determine the number of selected training samples as the first number.
In this way, since the confidence characterizes the accuracy of the label information of the training sample, a greater confidence indicates more accurate label information and a lower likelihood that the label of the training sample is a noise label. Using training samples with higher confidence for model training can therefore effectively avoid interference from noise data and effectively improve training accuracy and model performance. Further, the specific processing procedure of step A4 can vary; an alternative processing manner is provided below, see the specific processing procedures of steps A42 to A44.
In step A42, for a target character bit in the character sequence corresponding to the training sample, a character prediction probability that the target character bit is the first character is obtained from the first prediction probabilities, where the target character bit is any character bit in the character sequence corresponding to the training sample.
In step A44, the product of the character prediction probabilities of the character bits in the character sequence corresponding to the training sample is used as the confidence of the character sequence corresponding to the training sample.
Based on the example in step A4, the product of the character prediction probabilities of the plurality of character bits corresponding to the training sample is 90% × 80% × 60% × 70% = 30.24%, and 30.24% can be taken as the confidence of the character sequence corresponding to the training sample.
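A sketch of this product-of-probabilities confidence, under the same tensor layout assumed earlier; the sum and mean variants described above differ only in the reduction used.

```python
import torch

def sequence_confidence(pred_probs: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
    """Confidence of a label string = product over character bits of the probability
    assigned to the labelled character, e.g. 0.9 * 0.8 * 0.6 * 0.7 = 0.3024."""
    label_probs = pred_probs.gather(-1, labels.unsqueeze(-1)).squeeze(-1)  # (batch, positions)
    return label_probs.prod(dim=1)  # use .sum(dim=1) or .mean(dim=1) for the other variants

# Step A62 then keeps samples whose confidence exceeds a preset threshold, e.g.:
# keep = sequence_confidence(probs, labels) > confidence_threshold
```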
The embodiment of the present specification does not specifically limit the above-mentioned specific implementation manner for determining the confidence of the character sequence corresponding to the training sample.
Further, step A6 may be implemented in various manners. An alternative processing manner is provided below; see the specific processing procedure of step A62.
In step A62, training samples whose confidence is greater than a preset confidence threshold are determined from the plurality of training samples, and the first number of training samples are selected from them.
It should be noted that, in the embodiment of the present disclosure, a processing manner of selecting the second number of training samples from the plurality of training samples is the same as a processing manner of selecting the first number of training samples from the plurality of training samples in the above embodiment, and in order to avoid repetition, details are not repeated here.
As can be seen from the technical solutions provided in the embodiments of the present specification, the same training samples are respectively input into the first model and the second model, two identical models determined by the same reference model. The first model selects a first number of training samples from the input samples and the second model selects a second number; the second model is then trained on the first number of training samples and the first model on the second number, until the trained first model and/or the trained second model satisfy the corresponding convergence condition. Because both selections consist of samples that are unlikely to carry noise, training the two models on these selected samples effectively improves model training accuracy and model performance. Moreover, since each model is trained on samples selected by the other model, the accuracy and performance of model training are further improved, and the problem is effectively avoided that, when a model is trained on samples it selected itself, noise samples may remain in the selection and reduce training accuracy and model performance.
Based on the same technical concept, a model training method is further provided in the embodiment of the present description, fig. 2A is a second schematic flow chart of the model training method provided in the embodiment of the present description, and fig. 2B is a second schematic diagram of the model training process provided in the embodiment of the present description, an execution subject of the model training method may be a server, where the server may be an independent server or a server cluster composed of a plurality of servers, and the server may obtain a plurality of training samples from a terminal device and train a first model based on the obtained plurality of samples, where the method specifically includes the following steps:
in step S202, a plurality of training samples are obtained, wherein the training samples include a character sequence composed of a plurality of characters.
The specific implementation process of step S202 can refer to the specific processing process of step S102 in the foregoing embodiment.
In step S204, a plurality of training samples are input into the first model, a character of each character bit in the character sequence corresponding to the training sample is predicted, and a first prediction probability that the character of each character bit in the character sequence corresponding to the training sample is a preset character is determined.
The specific implementation process of step S204 can refer to the specific processing process of step S104 in the foregoing embodiment.
In step S206, a first number of training samples are selected from the plurality of training samples based on the first prediction probability and the label information of the training samples.
The specific implementation process of step S206 can refer to the specific processing process of step S106 in the foregoing embodiment.
In step S208, the first model is trained based on the first number of training samples, and if the trained first model does not satisfy the corresponding convergence condition, the steps of obtaining a plurality of training samples and training the first model are executed again until the trained first model satisfies the corresponding convergence condition.
In an alternative implementation manner, after the executing entity determines a first number of training samples through the processing in step S206, the first number of training samples may be input into the first model to obtain a first number of prediction labels corresponding to the training samples, then, loss information corresponding to the first model may be determined based on loss information between the prediction label corresponding to each training sample and its corresponding real label, and the first model is trained based on the loss information, and if the loss information of the trained first model does not satisfy the corresponding convergence condition, the steps of obtaining a plurality of training samples and training the first model are re-executed until the trained first model satisfies the corresponding convergence condition.
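A condensed sketch of this single-model loop follows. The epoch-loss-change stopping rule stands in for the specification's unspecified convergence condition, and `select_clean_samples` is reused from the earlier sketch; both are assumptions for illustration.

```python
import torch
import torch.nn.functional as F

def train_until_converged(model, opt, loader, threshold=0.6, eps=1e-3, max_epochs=50):
    """The model screens its own batch, trains on the kept samples, and repeats
    until the mean epoch loss changes by less than eps (assumed criterion)."""
    prev = float("inf")
    for _ in range(max_epochs):
        total, batches = 0.0, 0
        for images, labels in loader:
            with torch.no_grad():
                keep = select_clean_samples(model(images), labels, threshold)
            if not keep.any():
                continue
            probs = model(images[keep])
            loss = F.nll_loss(probs.log().flatten(0, 1), labels[keep].flatten())
            opt.zero_grad(); loss.backward(); opt.step()
            total, batches = total + loss.item(), batches + 1
        epoch_loss = total / max(batches, 1)
        if abs(prev - epoch_loss) < eps:
            break
        prev = epoch_loss
```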
As can be seen from the above technical solutions provided by the embodiments of the present specification, a plurality of training samples are input into the first model, the character of each character bit in the character sequence corresponding to each training sample is predicted, and a first prediction probability that the character of each character bit is a preset character is determined. A first number of training samples are then selected from the plurality of training samples based on the first prediction probability and the label information of the training samples, and the first model is trained on them until the trained first model satisfies the corresponding convergence condition. Because the first number of training samples are samples selected from the plurality of training samples that are unlikely to carry noise, training the first model on these selected noise-free samples effectively improves model training accuracy and model performance.
Based on the same technical concept, a data identification method is further provided in the embodiments of the present description. FIG. 3A is a third schematic flow chart, showing the data identification method provided in the embodiments of the present description, and FIG. 3B is a third schematic diagram, showing the data identification process. The execution subject of the data identification method may be a server, where the server may be an independent server or a server cluster composed of a plurality of servers; the server may perform recognition based on acquired data to be recognized and output a character sequence corresponding to the data. The method specifically includes the following steps:
in step S302, data to be recognized is acquired, the data including a character sequence made up of a plurality of characters.
As an example, the data to be identified may include at least one of an image and text.
In step S304, data to be recognized is input into a first model or a second model, and a character sequence corresponding to the data is output, where the first model is a model pre-trained by a training sample, and the second model is a model pre-trained by the training sample; the training process of the first model and the second model comprises the following steps: obtaining a plurality of training samples, wherein the training samples comprise a character sequence formed by a plurality of characters; inputting a plurality of training samples into a first model, predicting characters of each character position in a character sequence corresponding to the training samples, and determining a first prediction probability that the characters of each character position in the character sequence corresponding to the training samples are preset characters; inputting a plurality of training samples into a second model, predicting the character of each character position in the character sequence corresponding to the training samples, and determining a second prediction probability that the character of each character position in the character sequence corresponding to the training samples is a preset character, wherein the first model and the second model are determined by the same reference model; selecting a first number of training samples from the plurality of training samples based on the first prediction probability and label information of the training samples; selecting a second number of training samples from the plurality of training samples based on the second prediction probability and the label information of the training samples; training the second model based on the first number of training samples, and training the first model based on the second number of training samples; and if the trained first model and/or the trained second model do not meet the corresponding convergence condition, re-executing the steps of obtaining a plurality of training samples and training the first model and the second model until the trained first model and/or the trained second model meet the corresponding convergence condition. Or obtaining a plurality of training samples, wherein the training samples comprise a character sequence formed by a plurality of characters; inputting a plurality of training samples into a first model or a second model, predicting characters of each character position in a character sequence corresponding to the training samples, and determining a first prediction probability that the characters of each character position in the character sequence corresponding to the training samples are preset characters; selecting a first number of training samples from the plurality of training samples based on the first prediction probability and label information of the training samples; and training the first model or the second model based on the first number of training samples, and if the trained first model or second model does not meet the corresponding convergence condition, re-executing the steps of obtaining a plurality of training samples and training the first model or second model until the trained first model or second model meets the corresponding convergence condition.
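Inference then reduces to taking, at each character bit, the preset character with the highest predicted probability. A minimal decoding sketch, assuming a digit-only character set and the (batch, positions, classes) layout used in the earlier sketches:

```python
CHARSET = "0123456789"  # assumed digit-only verification codes; extend for letters etc.

def recognize(model, image):
    """image: a (C, H, W) tensor; returns the decoded character sequence."""
    probs = model(image.unsqueeze(0))       # (1, positions, classes)
    ids = probs.argmax(dim=-1).squeeze(0)   # most probable class per character bit
    return "".join(CHARSET[i] for i in ids.tolist())
```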
Further, before executing the processing procedure of step S302, the method may further include the following specific processing of steps D2 to D8.
In step D2, a plurality of verification samples are obtained, wherein the verification samples include a character sequence composed of a plurality of characters.
In step D4, inputting a plurality of verification samples into a first model, and obtaining a first verification result corresponding to the verification samples; and inputting a plurality of verification samples into a second model to obtain a second verification result corresponding to the verification samples.
In step D6, a first accuracy corresponding to the first model and a second accuracy corresponding to the second model are determined based on the first verification result and the second verification result.
In step D8, a target model into which the data to be recognized is to be input is determined based on the first accuracy and the second accuracy, where the target model is the model corresponding to the higher accuracy.
In an optional implementation manner, it is assumed that 100 verification samples are respectively input into the first model and the second model, so as to obtain a first verification result corresponding to the first model and a second verification result corresponding to the second model. Assuming that the first accuracy obtained based on the first verification result is 90% and the second accuracy obtained based on the second verification result is 85%, the first model may be used as the target model.
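A sketch of this routing step, computing exact-match accuracy (every character bit correct) for each model on the verification samples and returning the better one; the tie-break toward the first model is an assumption.

```python
import torch

def pick_target_model(first_model, second_model, val_loader):
    def accuracy(model) -> float:
        correct = total = 0
        with torch.no_grad():
            for images, labels in val_loader:
                preds = model(images).argmax(dim=-1)          # (batch, positions)
                correct += (preds == labels).all(dim=1).sum().item()
                total += labels.size(0)
        return correct / max(total, 1)
    return first_model if accuracy(first_model) >= accuracy(second_model) else second_model
```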
Further, step S304 may be implemented in various manners. An alternative processing manner is provided below; see the specific processing procedure of step E2.
In step E2, data to be recognized is input to the target model, and a character sequence corresponding to the data is output.
In this way, the character sequence corresponding to the data to be recognized is output in a mode of inputting the data to be recognized into the target model with higher accuracy, so that the accuracy of recognizing the data to be recognized can be effectively improved.
As can be seen from the above technical solutions provided in the embodiments of the present specification, in this method the data to be recognized is acquired and input into the first model or the second model, and the character sequence corresponding to the data is output. The first model and the second model are obtained by inputting the same training samples into two identical models determined by the same reference model, selecting a first number of training samples through the first model and a second number through the second model, then training the second model on the first number of training samples and the first model on the second number, until the trained first model and/or the trained second model satisfy the corresponding convergence condition. During this training, the first number of training samples and the second number of training samples are samples selected from the plurality of training samples that are unlikely to carry noise, so the first model and the second model are trained on selected noise-free samples, which effectively improves model training accuracy and model performance. Therefore, the method provided in the embodiments of the present specification for recognizing the data to be recognized with the trained first model or second model effectively improves the accuracy of recognizing the data to be recognized.
On the basis of the same technical concept, a model training method is further provided in the embodiments of the present description. FIG. 4A is a fourth schematic flow chart of the model training method provided in the embodiments of the present description, and FIG. 4B is a fourth schematic diagram of the model training process. The execution subject of the method may be a server, where the server may be an independent server or a server cluster composed of a plurality of servers; the server may obtain a plurality of training samples from a terminal device based on a first intelligent contract pre-deployed in a blockchain system and train a first model based on the obtained samples. The method specifically includes the following steps:
in step S402, model training rule information sent by the first device is received, a first intelligent contract is generated based on the model training rule information, and the first intelligent contract is deployed in the blockchain system.
In an implementation, after receiving the model training rule information sent by the first device, the blockchain system may generate a first intelligent contract based on the model training rule information, and deploy the first intelligent contract in the blockchain system.
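The contract flow can be pictured with the following framework-agnostic sketch; every class and method name here is invented for illustration and does not correspond to any real blockchain SDK.

```python
# Illustrative only: "generating" a contract here binds the received training
# rules to a handler; "deploying" registers it so later requests can trigger it.
class BlockchainSystem:
    def __init__(self):
        self._contracts = {}

    def deploy_contract(self, contract_id, rule_info, handler):
        self._contracts[contract_id] = (rule_info, handler)

    def on_request(self, contract_id, request):
        rule_info, handler = self._contracts[contract_id]
        return handler(rule_info, request)

# chain = BlockchainSystem()
# chain.deploy_contract("first_smart_contract", model_training_rules, run_co_teaching)
# chain.on_request("first_smart_contract", training_request)
```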
In step S404, when a model training request sent by the first device is acquired, the following processing is executed based on the first smart contract:
a plurality of training samples are obtained, wherein the training samples comprise a character sequence formed by a plurality of characters.
The specific processing procedure of the above step can be referred to the specific processing procedure of step S102 in the foregoing embodiments.
Inputting a plurality of training samples into a first model pre-trained based on a first intelligent contract, predicting characters of each character bit in a character sequence corresponding to the training samples, and determining a first prediction probability that the characters of each character bit in the character sequence corresponding to the training samples are preset characters; and inputting a plurality of training samples into a second model pre-trained based on a first intelligent contract, predicting the character of each character bit in the character sequence corresponding to the training samples, and determining a second prediction probability that the character of each character bit in the character sequence corresponding to the training samples is a preset character, wherein the first model and the second model are determined by the same reference model.
In an alternative implementation, the obtained plurality of training samples may be input into a first model pre-trained based on the first smart contract. Other specific processing procedures of the above steps can be referred to the specific processing procedure of step S104 in the foregoing embodiments.
Selecting a first number of training samples from the plurality of training samples based on the first prediction probability and label information of the training samples; and selecting a second number of training samples from the plurality of training samples based on the second prediction probability and the label information of the training samples.
The specific processing procedure of the above steps can be referred to the specific processing procedure of step S106 in the foregoing embodiments.
Training the second model based on the first number of training samples, and training the first model based on the second number of training samples; and if the trained first model and/or the trained second model do not meet the corresponding convergence condition, re-executing the steps of obtaining a plurality of training samples and training the first model and the second model until the trained first model and/or the trained second model meet the corresponding convergence condition.
The specific processing procedure of the above steps can be referred to the specific processing procedure of step S108 in the embodiments of the foregoing specification.
As can be seen from the technical solutions provided in the embodiments of the present specification, the same training samples are respectively input into the first model and the second model, two identical models determined by the same reference model. The first model selects a first number of training samples from the input samples, and the second model selects a second number; the second model is then trained on the first number of training samples and the first model on the second number, until the trained first model and/or the trained second model satisfy the corresponding convergence condition. Because both selections consist of samples that are unlikely to carry noise, training the two models on these selected samples effectively improves model training accuracy and model performance. In addition, using the first number of training samples selected by the first model to train the second model, and the second number selected by the second model to train the first model, further improves training accuracy and model performance, and effectively avoids the situation in which a model trained on samples it selected itself retains noise samples that reduce training accuracy and model performance.
On the basis of the same technical concept, a model training method is further provided in the embodiments of the present description. FIG. 5A is a fifth schematic flow chart of the model training method provided in the embodiments of the present description, and FIG. 5B is a fifth schematic diagram of the model training process. The execution subject of the method may be a server, where the server may be an independent server or a server cluster composed of a plurality of servers; the server may obtain a plurality of training samples from a terminal device based on a second intelligent contract pre-deployed in a blockchain system and train a first model based on the obtained samples. The method specifically includes the following steps:
in step S502, second model training rule information sent by the first device is received, a second intelligent contract is generated based on the second model training rule information, and the second intelligent contract is deployed in the blockchain system.
In an implementation, after receiving the second model training rule information sent by the first device, the blockchain system may generate a second intelligent contract based on the second model training rule information, and deploy the second intelligent contract in the blockchain system.
In step S504, when the model training request sent by the first device is acquired, the following processing is executed based on the second smart contract:
a plurality of training samples are obtained, wherein the training samples comprise a character sequence formed by a plurality of characters.
The specific processing procedure of the above steps can be referred to the specific processing procedure of step S202 in the foregoing embodiments.
Inputting a plurality of training samples into a first model pre-trained based on a second intelligent contract, predicting characters of each character bit in a character sequence corresponding to the training samples, and determining a first prediction probability that the characters of each character bit in the character sequence corresponding to the training samples are preset characters.
In an alternative implementation, the obtained training samples may be input into a first model pre-trained based on the second smart contract. Other specific processing procedures of the above steps can be referred to the specific processing procedure of step S204 in the embodiments of the foregoing specification.
A first number of training samples are selected from the plurality of training samples based on the first prediction probability and the label information of the training samples.
The specific processing procedure of the above steps can be referred to the specific processing procedure of step S206 in the embodiments of the foregoing specification.
And training the first model based on the first number of training samples, and if the trained first model does not meet the corresponding convergence condition, re-executing the steps of obtaining a plurality of training samples and training the first model until the trained first model meets the corresponding convergence condition.
The specific processing procedure of the above steps can be referred to the specific processing procedure of step S208 in the foregoing description embodiment.
As can be seen from the above technical solutions provided by the embodiments of the present specification, a plurality of training samples are input into a first model pre-trained based on the second smart contract, the character of each character bit in the character sequence corresponding to each training sample is predicted, and a first prediction probability that the character of each character bit is a preset character is determined. A first number of training samples are then selected from the plurality of training samples based on the first prediction probability and the label information of the training samples, and the first model is trained on them until the trained first model satisfies the corresponding convergence condition. Because the first number of training samples are samples selected from the plurality of training samples that are unlikely to carry noise, training the first model on these selected noise-free samples effectively improves model training accuracy and model performance.
Based on the same technical concept, a data recognition method is provided in the embodiments of the present disclosure. FIG. 6A is a sixth schematic flow chart of the data recognition method provided in the embodiments of the present disclosure, and FIG. 6B is a sixth schematic diagram of the data recognition process. The execution subject of the data recognition method may be a server, where the server may be an independent server or a server cluster composed of a plurality of servers; the server may obtain data to be recognized from a terminal device based on a third intelligent contract pre-deployed in a blockchain system, recognize the data, and output a character sequence corresponding to the data. The method may specifically include the following steps:
In step S602, data identification rule information sent by the second device is received, a third smart contract is generated based on the data identification rule information, and the third smart contract is deployed in the blockchain system.
In an implementation, after receiving the data identification rule information sent by the second device, the blockchain system may generate a third smart contract based on the data identification rule information and deploy the third smart contract in the blockchain system.
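The specification does not name a concrete chain client, so the following sketch models contract generation and deployment against a toy in-memory stand-in; `BlockchainSystem`, `SmartContract`, and `on_rule_info` are illustrative names, not a real blockchain API.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict

@dataclass
class SmartContract:
    """Hypothetical pairing of rule information with executable logic."""
    rule_info: dict
    handler: Callable[[dict], object]

@dataclass
class BlockchainSystem:
    """Toy in-memory stand-in; real deployment would go through an
    actual blockchain client, which the patent does not specify."""
    contracts: Dict[str, SmartContract] = field(default_factory=dict)

    def deploy(self, name: str, contract: SmartContract) -> None:
        self.contracts[name] = contract

    def invoke(self, name: str, request: dict):
        return self.contracts[name].handler(request)

def on_rule_info(chain: BlockchainSystem, rule_info: dict) -> None:
    """Generate the third smart contract from the received rule
    information and deploy it; the handler defers to `recognize`
    (sketched under step S604 below) and expects the request to carry
    `model`, `data`, and `id_to_char` entries."""
    contract = SmartContract(rule_info, handler=lambda req: recognize(**req))
    chain.deploy("third_smart_contract", contract)
```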
In step S604, when the data identification request sent by the second device is acquired, the following processing is executed based on the third smart contract:
and acquiring data to be recognized, wherein the data comprises a character sequence formed by a plurality of characters.
For the specific processing of the above step, reference may be made to the processing of step S302 in the foregoing embodiments.
Inputting data to be recognized into a first model or a second model, and outputting a character sequence corresponding to the data, wherein the first model is a model pre-trained through a training sample, and the second model is a model pre-trained through the training sample; the training process of the first model and the second model comprises the following steps: obtaining a plurality of training samples, wherein the training samples comprise a character sequence formed by a plurality of characters; inputting the plurality of training samples into a first model, predicting the character at each character position in the character sequence corresponding to the training samples, and determining a first prediction probability that the character at each character position in the character sequence corresponding to the training samples is a preset character; inputting the plurality of training samples into a second model, predicting the character at each character position in the character sequence corresponding to the training samples, and determining a second prediction probability that the character at each character position in the character sequence corresponding to the training samples is a preset character, wherein the first model and the second model are determined by the same reference model; selecting a first number of training samples from the plurality of training samples based on the first prediction probability and the label information of the training samples; selecting a second number of training samples from the plurality of training samples based on the second prediction probability and the label information of the training samples; training the second model based on the first number of training samples, and training the first model based on the second number of training samples; and if the trained first model and/or the trained second model do not meet the corresponding convergence condition, re-executing the steps of obtaining a plurality of training samples and training the first model and the second model until the trained first model and/or the trained second model meet the corresponding convergence condition. Or obtaining a plurality of training samples, wherein the training samples comprise a character sequence formed by a plurality of characters; inputting the plurality of training samples into a first model or a second model, predicting the character at each character position in the character sequence corresponding to the training samples, and determining a first prediction probability that the character at each character position in the character sequence corresponding to the training samples is a preset character; selecting a first number of training samples from the plurality of training samples based on the first prediction probability and the label information of the training samples; and training the first model or the second model based on the first number of training samples, and if the trained first model or the trained second model does not meet the corresponding convergence condition, re-executing the steps of obtaining a plurality of training samples and training the first model or the second model until the trained first model or the trained second model meets the corresponding convergence condition.
For the specific processing of the above step, reference may be made to the processing of step S304 in the foregoing embodiments.
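As a concrete reading of step S304, this minimal sketch decodes a character sequence from the data to be recognized; the model interface (logits of shape [seq_len, vocab_size]) and the `id_to_char` vocabulary lookup are assumptions, not details given in the specification.

```python
import torch

@torch.no_grad()
def recognize(model, data, id_to_char):
    """Output the character sequence corresponding to the input data,
    using the trained first (or second) model."""
    logits = model(data.unsqueeze(0)).squeeze(0)   # [seq_len, vocab_size]
    char_ids = logits.argmax(dim=-1)               # most probable character per position
    return "".join(id_to_char[int(i)] for i in char_ids)
```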
As can be seen from the above technical solutions provided in the embodiments of the present specification, in the method, data to be recognized is acquired, the data to be recognized is input into the first model or the second model, and a character sequence corresponding to the data is output. The first model and the second model are two identical models determined by the same reference model; during their training, the same plurality of training samples are input into each model, a first number of training samples are selected from the plurality of input training samples through the first model, a second number of training samples are selected from the plurality of input training samples through the second model, the second model is then trained based on the first number of training samples and the first model is trained based on the second number of training samples, and this is repeated until the trained first model and/or the trained second model satisfy the corresponding convergence condition. Because the first number of training samples and the second number of training samples are samples selected from the plurality of training samples as being unlikely to contain noise, training the first model and the second model on these selected samples effectively improves model training accuracy and model performance; accordingly, recognizing the data to be recognized with the trained first model or the trained second model, as provided in the embodiments of the present specification, effectively improves the accuracy of recognizing the data to be recognized.
On the basis of the same technical concept, a model training apparatus is further provided in the embodiment of the present disclosure, fig. 7 is a schematic diagram illustrating a first module of the model training apparatus provided in the embodiment of the present disclosure, where the model training apparatus is configured to execute the model training method described in fig. 1A or fig. 1B, and as shown in fig. 7, the apparatus includes:
the device comprises a first acquisition module, a second acquisition module and a third acquisition module, wherein the first acquisition module acquires a plurality of training samples, and the training samples comprise character sequences formed by a plurality of characters;
the first processing module is used for inputting the training samples into a first model, predicting the character of each character bit in the character sequence corresponding to the training samples and determining a first prediction probability that the character of each character bit in the character sequence corresponding to the training samples is a preset character; inputting the training samples into a second model, predicting the character of each character position in the character sequence corresponding to the training samples, and determining a second prediction probability that the character of each character position in the character sequence corresponding to the training samples is a preset character, wherein the first model and the second model are determined by the same reference model;
a first selection module, configured to select a first number of training samples from the plurality of training samples based on the first prediction probability and the label information of the training samples; selecting a second number of training samples from the plurality of training samples based on the second prediction probability and the label information of the training samples;
a first training module to train the second model based on the first number of training samples and to train the first model based on the second number of training samples; and if the trained first model and/or the trained second model do not meet the corresponding convergence condition, re-executing the steps of obtaining a plurality of training samples and training the first model and the second model until the trained first model and/or the trained second model meet the corresponding convergence condition.
Optionally, the first selecting module includes:
the first determining unit is used for determining a first character of the character bit of each character bit in the character sequence corresponding to the training sample based on the label information of the training sample;
a second determining unit, configured to determine a confidence level of the character sequence corresponding to the training sample based on the first character of the character bit of each character bit in the character sequence corresponding to the training sample and the first prediction probability, where the confidence level is used to characterize an accuracy degree of the label information of the training sample;
and the first selection unit is used for selecting a first number of training samples from the plurality of training samples based on the confidence degrees of the character sequences corresponding to the training samples.
Optionally, the first selecting unit determines, from the plurality of training samples, the training samples whose confidence level is greater than a preset confidence level threshold, and selects the first number of training samples from the training samples whose confidence level is greater than the preset confidence level threshold.
Optionally, the second determining unit includes:
a first selecting subunit, configured to, for a target character position in the character sequence corresponding to the training sample, obtain, from the first prediction probability, a character prediction probability that the character at the target character position is the first character, where the target character position is any character position in the character sequence corresponding to the training sample;
And the first determining subunit is used for taking the product of the character prediction probabilities of the plurality of character positions in the character sequence corresponding to the training sample as the confidence level of the character sequence corresponding to the training sample.
As can be seen from the technical solutions provided in the embodiments of the present specification, the same plurality of training samples are respectively input into the first model and the second model, which are two identical models determined by the same reference model; a first number of training samples are selected from the input training samples through the first model, and a second number of training samples are selected from the input training samples through the second model; the second model is then trained based on the first number of training samples and the first model is trained based on the second number of training samples, until the trained first model and/or the trained second model satisfy the corresponding convergence condition. Because the first number of training samples and the second number of training samples are both samples selected from the plurality of training samples as being unlikely to contain noise, and because each model is trained on the samples selected by the other model rather than on the samples it selected itself, the situation in which noisy samples chosen by a model would reinforce that same model's errors and thereby reduce training accuracy and model performance is effectively avoided, which further improves model training accuracy and model performance.
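To make the cross-training concrete, here is one illustrative PyTorch update in which each model selects its high-confidence samples and the peer model is trained on them; the threshold-based selection rule is an assumption about a detail the text leaves open.

```python
import torch
import torch.nn.functional as F

def co_train_step(model_a, model_b, opt_a, opt_b, batch, label_ids, threshold):
    """One cross-update: model A's picks train model B and vice versa,
    so neither model learns from its own (possibly noise-biased) picks."""
    with torch.no_grad():
        # Per-position probability of the labeled character under each model.
        p_a = F.softmax(model_a(batch), dim=-1).gather(
            -1, label_ids.unsqueeze(-1)).squeeze(-1)
        p_b = F.softmax(model_b(batch), dim=-1).gather(
            -1, label_ids.unsqueeze(-1)).squeeze(-1)
        keep_a = p_a.prod(dim=-1) > threshold   # the "first number" of samples
        keep_b = p_b.prod(dim=-1) > threshold   # the "second number" of samples

    # B trains on A's selection, A trains on B's selection.
    for model, opt, keep in ((model_b, opt_b, keep_a), (model_a, opt_a, keep_b)):
        if keep.any():
            logits = model(batch[keep])          # [n, seq_len, vocab]
            loss = F.cross_entropy(logits.flatten(0, 1),
                                   label_ids[keep].flatten())
            opt.zero_grad()
            loss.backward()
            opt.step()
```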
The model training device provided in the embodiments of the present specification can implement each process in the embodiments corresponding to the above model training method, and is not described here again to avoid repetition.
It should be noted that the model training apparatus provided in the embodiment of the present disclosure and the model training method provided in the embodiment of the present disclosure are based on the same inventive concept, and therefore, specific implementation of the embodiment may refer to implementation of the model training method, and repeated details are not described again.
On the basis of the same technical concept, a model training apparatus is further provided in the embodiment of the present disclosure, fig. 8 is a schematic diagram illustrating a first module of the model training apparatus provided in the embodiment of the present disclosure, where the model training apparatus is configured to execute the model training method described in fig. 2A or fig. 2B, and as shown in fig. 8, the apparatus includes:
the second acquisition module is used for acquiring a plurality of training samples, wherein the training samples comprise a character sequence formed by a plurality of characters;
the second processing module is used for inputting the training samples into a first model, predicting the character of each character bit in the character sequence corresponding to the training samples and determining a first prediction probability that the character of each character bit in the character sequence corresponding to the training samples is a preset character;
a second selection module, configured to select a first number of training samples from the plurality of training samples based on the first prediction probability and the label information of the training samples;
and the second training module is used for training the first model based on the first number of training samples, and if the trained first model does not meet the corresponding convergence condition, the steps of obtaining a plurality of training samples and training the first model are executed again until the trained first model meets the corresponding convergence condition.
As can be seen from the above technical solutions provided in the embodiments of the present specification, a plurality of training samples are input into the first model, the character at each character position in the character sequence corresponding to each training sample is predicted, and a first prediction probability that the character at each character position is a preset character is determined; a first number of training samples are then selected from the plurality of training samples based on the first prediction probability and the label information of the training samples, and the first model is trained based on the first number of training samples until the trained first model satisfies the corresponding convergence condition. Because the first number of training samples are samples selected from the plurality of training samples as being unlikely to contain noise, training the first model on these selected samples effectively improves model training accuracy and model performance.
The model training device provided in the embodiments of the present specification can implement each process in the embodiments corresponding to the above model training method, and is not described here again to avoid repetition.
It should be noted that the model training apparatus provided in the embodiment of the present disclosure and the model training method provided in the embodiment of the present disclosure are based on the same inventive concept, and therefore, specific implementation of the embodiment may refer to implementation of the model training method, and repeated details are not described again.
Based on the same technical concept, a data recognition apparatus is further provided in the embodiments of the present disclosure, fig. 9 is a schematic diagram of a first module of the data recognition apparatus provided in the embodiments of the present disclosure, where the data recognition apparatus is configured to execute the data recognition method described in fig. 3A or fig. 3B, and as shown in fig. 9, the apparatus includes:
the third acquisition module is used for acquiring data to be identified, wherein the data comprises a character sequence formed by a plurality of characters;
the output module is used for inputting the data to be recognized into a first model or a second model and outputting a character sequence corresponding to the data, wherein the first model is a model pre-trained through a training sample, and the second model is a model pre-trained through the training sample; the training process of the first model and the second model comprises the following steps:
obtaining a plurality of training samples, wherein the training samples comprise a character sequence formed by a plurality of characters; inputting the training samples into a first model, predicting the character at each character position in the character sequence corresponding to the training samples, and determining a first prediction probability that the character at each character position in the character sequence corresponding to the training samples is a preset character; inputting the training samples into a second model, predicting the character at each character position in the character sequence corresponding to the training samples, and determining a second prediction probability that the character at each character position in the character sequence corresponding to the training samples is a preset character, wherein the first model and the second model are determined by the same reference model; selecting a first number of training samples from the plurality of training samples based on the first prediction probability and the label information of the training samples; selecting a second number of training samples from the plurality of training samples based on the second prediction probability and the label information of the training samples; training the second model based on the first number of training samples, and training the first model based on the second number of training samples; and if the trained first model and/or the trained second model do not meet the corresponding convergence condition, re-executing the steps of obtaining a plurality of training samples and training the first model and the second model until the trained first model and/or the trained second model meet the corresponding convergence condition;
or obtaining a plurality of training samples, wherein the training samples comprise a character sequence formed by a plurality of characters; inputting the training samples into a first model or a second model, predicting the character at each character position in the character sequence corresponding to the training samples, and determining a first prediction probability that the character at each character position in the character sequence corresponding to the training samples is a preset character; selecting a first number of training samples from the plurality of training samples based on the first prediction probability and the label information of the training samples; and training the first model or the second model based on the first number of training samples, and if the trained first model or the trained second model does not meet the corresponding convergence condition, re-executing the steps of obtaining a plurality of training samples and training the first model or the second model until the trained first model or the trained second model meets the corresponding convergence condition.
Optionally, the data at least includes one or more of an image and text.
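Since the data may be an image or text, the following sketch shows one way either could be turned into a model input; the grayscale format, the fixed size, and the vocabulary are entirely assumed for illustration.

```python
import torch

# Assumed character vocabulary; purely illustrative.
CHAR_VOCAB = "0123456789abcdefghijklmnopqrstuvwxyz"
CHAR_TO_ID = {c: i for i, c in enumerate(CHAR_VOCAB)}

def text_to_tensor(text: str) -> torch.Tensor:
    """Encode text-type data as a sequence of character ids."""
    return torch.tensor([CHAR_TO_ID[c] for c in text.lower() if c in CHAR_TO_ID])

def image_to_tensor(pixels, height: int = 32, width: int = 128) -> torch.Tensor:
    """Encode image-type data (nested lists of grayscale pixel values in
    [0, 255]) as a normalized, channel-first float tensor of an assumed
    fixed size; resizing is out of scope for this sketch."""
    img = torch.tensor(pixels, dtype=torch.float32) / 255.0
    assert img.shape == (height, width)
    return img.unsqueeze(0)          # [1, H, W]
```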
As can be seen from the above technical solutions provided in the embodiments of the present specification, in the method, data to be recognized is acquired, the data to be recognized is input into the first model or the second model, and a character sequence corresponding to the data is output. The first model and the second model are two identical models determined by the same reference model; during their training, the same plurality of training samples are input into each model, a first number of training samples are selected from the plurality of input training samples through the first model, a second number of training samples are selected from the plurality of input training samples through the second model, the second model is then trained based on the first number of training samples and the first model is trained based on the second number of training samples, and this is repeated until the trained first model and/or the trained second model satisfy the corresponding convergence condition. Because the first number of training samples and the second number of training samples are samples selected from the plurality of training samples as being unlikely to contain noise, training the first model and the second model on these selected samples effectively improves model training accuracy and model performance; accordingly, recognizing the data to be recognized with the trained first model or the trained second model, as provided in the embodiments of the present specification, effectively improves the accuracy of recognizing the data to be recognized.
The data recognition apparatus provided in the embodiments of the present specification can implement each process in the embodiments corresponding to the above data recognition method, and is not described here again to avoid repetition.
It should be noted that the data recognition apparatus provided in the embodiment of the present disclosure and the data recognition method provided in the embodiment of the present disclosure are based on the same inventive concept, and therefore, specific implementation of the embodiment may refer to implementation of the foregoing data recognition method, and repeated details are not described again.
On the basis of the same technical concept, a model training apparatus is further provided in the embodiment of the present disclosure, fig. 10 is a schematic diagram illustrating a first module of the model training apparatus provided in the embodiment of the present disclosure, where the model training apparatus is configured to execute the model training method described in fig. 4A or fig. 4B, and as shown in fig. 10, the apparatus includes:
the first receiving module is used for receiving model training rule information sent by first equipment, generating a first intelligent contract based on the model training rule information and deploying the first intelligent contract in the block chain system;
the third processing module, when acquiring the model training request sent by the first device, executes the following processing based on the first intelligent contract:
obtaining a plurality of training samples, wherein the training samples comprise a character sequence formed by a plurality of characters;
inputting the training samples into a first model pre-trained based on the first smart contract, predicting the character at each character position in the character sequence corresponding to the training samples, and determining a first prediction probability that the character at each character position in the character sequence corresponding to the training samples is a preset character; inputting the training samples into a second model pre-trained based on the first smart contract, predicting the character at each character position in the character sequence corresponding to the training samples, and determining a second prediction probability that the character at each character position in the character sequence corresponding to the training samples is a preset character, wherein the first model and the second model are determined by the same reference model;
selecting a first number of training samples from the plurality of training samples based on the first prediction probability and label information of the training samples; selecting a second number of training samples from the plurality of training samples based on the second prediction probability and the label information of the training samples;
training the second model based on the first number of training samples and training the first model based on the second number of training samples; and if the trained first model and/or the trained second model do not meet the corresponding convergence condition, re-executing the steps of obtaining a plurality of training samples and training the first model and the second model until the trained first model and/or the trained second model meet the corresponding convergence condition.
As can be seen from the technical solutions provided in the embodiments of the present specification, the same plurality of training samples are respectively input into the first model and the second model, which are two identical models determined by the same reference model; a first number of training samples are selected from the input training samples through the first model, and a second number of training samples are selected from the input training samples through the second model; the second model is then trained based on the first number of training samples and the first model is trained based on the second number of training samples, until the trained first model and/or the trained second model satisfy the corresponding convergence condition. Because the first number of training samples and the second number of training samples are both samples selected from the plurality of training samples as being unlikely to contain noise, and because each model is trained on the samples selected by the other model rather than on the samples it selected itself, the situation in which noisy samples chosen by a model would reinforce that same model's errors and thereby reduce training accuracy and model performance is effectively avoided, which further improves model training accuracy and model performance.
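Tying the blockchain-hosted variant together, here is a toy handler that serves a model training request through the deployed first smart contract, reusing the `BlockchainSystem`/`SmartContract` stand-ins and the `co_train_step` sketch above; all of the wiring here is an illustrative assumption.

```python
def on_model_training_rule_info(chain, rule_info, model_a, model_b,
                                opt_a, opt_b, loader, threshold):
    """Generate the first smart contract from the received model training
    rule information and deploy it; its handler runs one round of
    cross-training whenever a model training request arrives."""
    def handle(request: dict):
        for batch, label_ids in loader:          # obtain training samples
            co_train_step(model_a, model_b, opt_a, opt_b,
                          batch, label_ids, threshold)
        return "training round executed"

    chain.deploy("first_smart_contract", SmartContract(rule_info, handle))
```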
The model training device provided in the embodiments of the present specification can implement each process in the embodiments corresponding to the above model training method, and is not described here again to avoid repetition.
It should be noted that the model training apparatus provided in the embodiment of the present disclosure and the model training method provided in the embodiment of the present disclosure are based on the same inventive concept, and therefore, specific implementation of the embodiment may refer to implementation of the model training method, and repeated details are not described again.
On the basis of the same technical concept, a model training apparatus is further provided in the embodiment of the present disclosure, fig. 11 is a schematic diagram of a first module of the model training apparatus provided in the embodiment of the present disclosure, and the model training apparatus is configured to execute the model training method described in fig. 5A or fig. 5B; as shown in fig. 11, the apparatus includes:
the second receiving module is used for receiving second model training rule information sent by the first equipment, generating a second intelligent contract based on the second model training rule information and deploying the second intelligent contract in the block chain system;
the fourth processing module, when acquiring the model training request sent by the first device, executes the following processing based on the second intelligent contract:
obtaining a plurality of training samples, wherein the training samples comprise a character sequence formed by a plurality of characters;
inputting the training samples into a first model pre-trained based on the second smart contract, predicting the character at each character position in the character sequence corresponding to the training samples, and determining a first prediction probability that the character at each character position in the character sequence corresponding to the training samples is a preset character;
selecting a first number of training samples from the plurality of training samples based on the first prediction probability and label information of the training samples;
and training the first model based on the first number of training samples, and if the trained first model does not meet the corresponding convergence condition, re-executing the steps of obtaining a plurality of training samples and training the first model until the trained first model meets the corresponding convergence condition.
As can be seen from the above technical solutions provided in the embodiments of the present specification, a plurality of training samples are input into a first model pre-trained based on the second smart contract, the character at each character position in the character sequence corresponding to each training sample is predicted, and a first prediction probability that the character at each character position is a preset character is determined; a first number of training samples are then selected from the plurality of training samples based on the first prediction probability and the label information of the training samples, and the first model is trained based on the first number of training samples until the trained first model satisfies the corresponding convergence condition. Because the first number of training samples are samples selected from the plurality of training samples as being unlikely to contain noise, training the first model on these selected samples effectively improves model training accuracy and model performance.
The model training device provided in the embodiments of the present specification can implement each process in the embodiments corresponding to the above model training method, and is not described here again to avoid repetition.
It should be noted that the model training apparatus provided in the embodiment of the present disclosure and the model training method provided in the embodiment of the present disclosure are based on the same inventive concept, and therefore, specific implementation of the embodiment may refer to implementation of the model training method, and repeated details are not described again.
Based on the same technical concept, a data recognition apparatus is further provided in the embodiment of the present disclosure, fig. 12 is a schematic diagram of a first module of the data recognition apparatus provided in the embodiment of the present disclosure, where the data recognition apparatus is configured to execute the data recognition method described in fig. 6A or fig. 6B, and as shown in fig. 12, the apparatus includes:
the third receiving module is used for receiving data identification rule information sent by second equipment, generating a third intelligent contract based on the data identification rule information and deploying the third intelligent contract in the block chain system;
a fifth processing module, configured to, when acquiring the data identification request sent by the second device, execute the following processing based on the third intelligent contract:
acquiring data to be recognized, wherein the data comprises a character sequence formed by a plurality of characters;
inputting the data to be recognized into a first model or a second model, and outputting a character sequence corresponding to the data, wherein the first model is a model pre-trained through a training sample, and the second model is a model pre-trained through the training sample; the training process of the first model and the second model comprises the following steps:
obtaining a plurality of training samples, wherein the training samples comprise a character sequence formed by a plurality of characters; inputting the training samples into a first model, predicting the character at each character position in the character sequence corresponding to the training samples, and determining a first prediction probability that the character at each character position in the character sequence corresponding to the training samples is a preset character; inputting the training samples into a second model, predicting the character at each character position in the character sequence corresponding to the training samples, and determining a second prediction probability that the character at each character position in the character sequence corresponding to the training samples is a preset character, wherein the first model and the second model are determined by the same reference model; selecting a first number of training samples from the plurality of training samples based on the first prediction probability and the label information of the training samples; selecting a second number of training samples from the plurality of training samples based on the second prediction probability and the label information of the training samples; training the second model based on the first number of training samples, and training the first model based on the second number of training samples; and if the trained first model and/or the trained second model do not meet the corresponding convergence condition, re-executing the steps of obtaining a plurality of training samples and training the first model and the second model until the trained first model and/or the trained second model meet the corresponding convergence condition;
or obtaining a plurality of training samples, wherein the training samples comprise a character sequence formed by a plurality of characters; inputting the training samples into a first model or a second model, predicting the character at each character position in the character sequence corresponding to the training samples, and determining a first prediction probability that the character at each character position in the character sequence corresponding to the training samples is a preset character; selecting a first number of training samples from the plurality of training samples based on the first prediction probability and the label information of the training samples; and training the first model or the second model based on the first number of training samples, and if the trained first model or the trained second model does not meet the corresponding convergence condition, re-executing the steps of obtaining a plurality of training samples and training the first model or the second model until the trained first model or the trained second model meets the corresponding convergence condition.
As can be seen from the above technical solutions provided in the embodiments of the present specification, in the method, data to be recognized is acquired, the data to be recognized is input into the first model or the second model, and a character sequence corresponding to the data is output. The first model and the second model are two identical models determined by the same reference model; during their training, the same plurality of training samples are input into each model, a first number of training samples are selected from the plurality of input training samples through the first model, a second number of training samples are selected from the plurality of input training samples through the second model, the second model is then trained based on the first number of training samples and the first model is trained based on the second number of training samples, and this is repeated until the trained first model and/or the trained second model satisfy the corresponding convergence condition. Because the first number of training samples and the second number of training samples are samples selected from the plurality of training samples as being unlikely to contain noise, training the first model and the second model on these selected samples effectively improves model training accuracy and model performance; accordingly, recognizing the data to be recognized with the trained first model or the trained second model, as provided in the embodiments of the present specification, effectively improves the accuracy of recognizing the data to be recognized.
The data recognition apparatus provided in the embodiments of the present specification can implement each process in the embodiments corresponding to the above data recognition method, and is not described here again to avoid repetition.
It should be noted that the data recognition apparatus provided in the embodiment of the present disclosure and the data recognition method provided in the embodiment of the present disclosure are based on the same inventive concept, and therefore, specific implementation of the embodiment may refer to implementation of the foregoing data recognition method, and repeated details are not described again.
Based on the same technical concept, an embodiment of the present specification further provides a model training device. Fig. 13 is a schematic diagram of a hardware structure of the model training device provided in the embodiment of the present specification, and the model training device is configured to execute the model training methods described in fig. 1A, fig. 1B, fig. 2A, fig. 2B, fig. 4A, fig. 4B, fig. 5A, and fig. 5B, or may also be configured to execute the data recognition methods described in fig. 3A, fig. 3B, fig. 6A, and fig. 6B.
The model training device may vary significantly depending on configuration or performance, and may include one or more processors 1301 and a memory 1302, where the memory 1302 may store one or more application programs or data. The memory 1302 may be transient storage or persistent storage. The application program stored in the memory 1302 may include one or more modules (not shown in the figure), and each module may include a series of computer-executable instructions for the model training device. Still further, the processor 1301 may be configured to communicate with the memory 1302 and execute the series of computer-executable instructions in the memory 1302 on the model training device. The model training device may also include one or more power supplies 1303, one or more wired or wireless network interfaces 1304, one or more input/output interfaces 1305, and one or more keyboards 1306.
Specifically, in this embodiment, the model training device includes a memory and one or more programs, where the one or more programs are stored in the memory, the one or more programs may include one or more modules, each module may include a series of computer-executable instructions for the model training device, and the one or more programs configured to be executed by the one or more processors include computer-executable instructions for performing the following:
obtaining a plurality of training samples, wherein the training samples comprise a character sequence formed by a plurality of characters;
inputting the training samples into a first model, predicting the character at each character position in the character sequence corresponding to the training samples, and determining a first prediction probability that the character at each character position in the character sequence corresponding to the training samples is a preset character; inputting the training samples into a second model, predicting the character at each character position in the character sequence corresponding to the training samples, and determining a second prediction probability that the character at each character position in the character sequence corresponding to the training samples is a preset character, wherein the first model and the second model are determined by the same reference model;
selecting a first number of training samples from the plurality of training samples based on the first prediction probability and label information of the training samples; selecting a second number of training samples from the plurality of training samples based on the second prediction probability and the label information of the training samples;
training the second model based on the first number of training samples and training the first model based on the second number of training samples; and if the trained first model and/or the trained second model do not meet the corresponding convergence condition, re-executing the steps of obtaining a plurality of training samples and training the first model and the second model until the trained first model and/or the trained second model meet the corresponding convergence condition.
Alternatively, the model training device described above may also be configured to execute the following computer-executable instructions:
obtaining a plurality of training samples, wherein the training samples comprise a character sequence formed by a plurality of characters;
inputting the training samples into a first model, predicting the character at each character position in the character sequence corresponding to the training samples, and determining a first prediction probability that the character at each character position in the character sequence corresponding to the training samples is a preset character;
selecting a first number of training samples from the plurality of training samples based on the first prediction probability and label information of the training samples;
and training the first model based on the first number of training samples, and if the trained first model does not meet the corresponding convergence condition, re-executing the steps of obtaining a plurality of training samples and training the first model until the trained first model meets the corresponding convergence condition.
Alternatively, the model training apparatus may also be an apparatus in a blockchain system, configured to perform the following computer-executable instructions:
receiving model training rule information sent by a first device, generating a first smart contract based on the model training rule information, and deploying the first smart contract in the blockchain system;
when a model training request sent by the first device is acquired, executing the following processing based on the first smart contract:
obtaining a plurality of training samples, wherein the training samples comprise a character sequence formed by a plurality of characters;
inputting the training samples into a first model pre-trained based on the first smart contract, predicting the character at each character position in the character sequence corresponding to the training samples, and determining a first prediction probability that the character at each character position in the character sequence corresponding to the training samples is a preset character; inputting the training samples into a second model pre-trained based on the first smart contract, predicting the character at each character position in the character sequence corresponding to the training samples, and determining a second prediction probability that the character at each character position in the character sequence corresponding to the training samples is a preset character, wherein the first model and the second model are determined by the same reference model;
selecting a first number of training samples from the plurality of training samples based on the first prediction probability and label information of the training samples; selecting a second number of training samples from the plurality of training samples based on the second prediction probability and the label information of the training samples;
training the second model based on the first number of training samples and training the first model based on the second number of training samples; and if the trained first model and/or the trained second model do not meet the corresponding convergence condition, re-executing the steps of obtaining a plurality of training samples and training the first model and the second model until the trained first model and/or the trained second model meet the corresponding convergence condition.
Alternatively, the model training apparatus may also be an apparatus in a blockchain system, configured to perform the following computer-executable instructions:
receiving second model training rule information sent by the first device, generating a second smart contract based on the second model training rule information, and deploying the second smart contract in the blockchain system;
when a model training request sent by the first device is acquired, executing the following processing based on the second smart contract:
obtaining a plurality of training samples, wherein the training samples comprise a character sequence formed by a plurality of characters;
inputting the training samples into a first model pre-trained based on the second smart contract, predicting the character at each character position in the character sequence corresponding to the training samples, and determining a first prediction probability that the character at each character position in the character sequence corresponding to the training samples is a preset character;
selecting a first number of training samples from the plurality of training samples based on the first prediction probability and label information of the training samples;
and training the first model based on the first number of training samples, and if the trained first model does not meet the corresponding convergence condition, re-executing the steps of obtaining a plurality of training samples and training the first model until the trained first model meets the corresponding convergence condition.
Alternatively, the device may be a data recognition device, and the data recognition device may be further configured to perform the following computer-executable instructions:
acquiring data to be recognized, wherein the data comprises a character sequence formed by a plurality of characters;
inputting the data to be recognized into a first model or a second model, and outputting a character sequence corresponding to the data, wherein the first model is a model pre-trained through a training sample, and the second model is a model pre-trained through the training sample; the training process of the first model and the second model comprises the following steps:
obtaining a plurality of training samples, wherein the training samples comprise a character sequence formed by a plurality of characters; inputting the training samples into a first model, predicting the character at each character position in the character sequence corresponding to the training samples, and determining a first prediction probability that the character at each character position in the character sequence corresponding to the training samples is a preset character; inputting the training samples into a second model, predicting the character at each character position in the character sequence corresponding to the training samples, and determining a second prediction probability that the character at each character position in the character sequence corresponding to the training samples is a preset character, wherein the first model and the second model are determined by the same reference model; selecting a first number of training samples from the plurality of training samples based on the first prediction probability and the label information of the training samples; selecting a second number of training samples from the plurality of training samples based on the second prediction probability and the label information of the training samples; training the second model based on the first number of training samples, and training the first model based on the second number of training samples; and if the trained first model and/or the trained second model do not meet the corresponding convergence condition, re-executing the steps of obtaining a plurality of training samples and training the first model and the second model until the trained first model and/or the trained second model meet the corresponding convergence condition;
or obtaining a plurality of training samples, wherein the training samples comprise a character sequence formed by a plurality of characters; inputting the training samples into a first model or a second model, predicting the character at each character position in the character sequence corresponding to the training samples, and determining a first prediction probability that the character at each character position in the character sequence corresponding to the training samples is a preset character; selecting a first number of training samples from the plurality of training samples based on the first prediction probability and the label information of the training samples; and training the first model or the second model based on the first number of training samples, and if the trained first model or the trained second model does not meet the corresponding convergence condition, re-executing the steps of obtaining a plurality of training samples and training the first model or the second model until the trained first model or the trained second model meets the corresponding convergence condition.
Alternatively, the device may also be a data identification device, and the data identification device may be a device in a blockchain system, and is configured to perform the following computer-executable instructions:
receiving data identification rule information sent by a second device, generating a third smart contract based on the data identification rule information, and deploying the third smart contract in the blockchain system;
when a data identification request sent by the second device is acquired, executing the following processing based on the third smart contract:
acquiring data to be recognized, wherein the data comprises a character sequence formed by a plurality of characters;
inputting the data to be recognized into a first model or a second model, and outputting a character sequence corresponding to the data, wherein the first model is a model pre-trained through a training sample, and the second model is a model pre-trained through the training sample; the training process of the first model and the second model comprises the following steps:
obtaining a plurality of training samples, wherein the training samples comprise a character sequence formed by a plurality of characters; inputting the training samples into a first model, predicting the character at each character position in the character sequence corresponding to the training samples, and determining a first prediction probability that the character at each character position in the character sequence corresponding to the training samples is a preset character; inputting the training samples into a second model, predicting the character at each character position in the character sequence corresponding to the training samples, and determining a second prediction probability that the character at each character position in the character sequence corresponding to the training samples is a preset character, wherein the first model and the second model are determined by the same reference model; selecting a first number of training samples from the plurality of training samples based on the first prediction probability and the label information of the training samples; selecting a second number of training samples from the plurality of training samples based on the second prediction probability and the label information of the training samples; training the second model based on the first number of training samples, and training the first model based on the second number of training samples; and if the trained first model and/or the trained second model do not meet the corresponding convergence condition, re-executing the steps of obtaining a plurality of training samples and training the first model and the second model until the trained first model and/or the trained second model meet the corresponding convergence condition;
or obtaining a plurality of training samples, wherein the training samples comprise a character sequence formed by a plurality of characters; inputting the training samples into a first model or a second model, predicting the character at each character position in the character sequence corresponding to the training samples, and determining a first prediction probability that the character at each character position in the character sequence corresponding to the training samples is a preset character; selecting a first number of training samples from the plurality of training samples based on the first prediction probability and the label information of the training samples; and training the first model or the second model based on the first number of training samples, and if the trained first model or the trained second model does not meet the corresponding convergence condition, re-executing the steps of obtaining a plurality of training samples and training the first model or the second model until the trained first model or the trained second model meets the corresponding convergence condition.
Further, corresponding to the model training method provided in the foregoing embodiments, an embodiment of the present specification further provides a computer-readable storage medium, where a computer program is stored on the computer-readable storage medium, and when the computer program is executed by the processor 1301, the steps of the foregoing model training method embodiments are implemented and the same technical effects can be achieved; to avoid repetition, details are not described here again. The computer-readable storage medium may be a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.
It should be noted that the model training device and the computer-readable storage medium provided in the embodiments of the present specification can implement each process in the above-described model training method embodiments, and are not described herein again to avoid repetition.
Further, corresponding to the data identification method provided in the foregoing embodiments, an embodiment of the present specification further provides a computer-readable storage medium, where a computer program is stored on the computer-readable storage medium, and when the computer program is executed by the processor 1301, the steps of the foregoing data identification method embodiments are implemented and the same technical effects can be achieved; to avoid repetition, details are not described here again. The computer-readable storage medium may be a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.
It should be noted that the data identification device and the computer-readable storage medium provided in the embodiments of the present specification can implement each process in the data identification method embodiments, and are not described herein again to avoid repetition.
The foregoing description has been directed to specific embodiments of this disclosure. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims may be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing may also be possible or may be advantageous.
As will be appreciated by one skilled in the art, embodiments of the present description may be provided as a method, system, or computer program product. Accordingly, the description may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the description may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The description has been presented with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the description. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include volatile memory in a computer-readable medium, such as random access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media, including both permanent and non-permanent, removable and non-removable media, may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media such as modulated data signals and carrier waves.
It is to be understood that the embodiments described in this specification can be implemented in hardware, software, firmware, middleware, microcode, or any combination thereof. For a hardware implementation, the Processing units may be implemented within one or more Application Specific Integrated Circuits (ASICs), Digital Signal Processors (DSPs), Digital Signal Processing Devices (DSPDs), Programmable Logic Devices (PLDs), Field Programmable Gate Arrays (FPGAs), general purpose processors, controllers, micro-controllers, microprocessors, other electronic units designed to perform the functions of the present description, or a combination thereof.
For software implementation, the techniques described above in this specification can be implemented by modules (e.g., procedures, functions, and so on) that perform the functions described above in this specification. The software codes may be stored in a memory and executed by a processor. The memory may be implemented within the processor or external to the processor.
It should also be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of additional identical elements in the process, method, article, or apparatus that comprises the element.
Through the above description of the embodiments, those skilled in the art will clearly understand that the method of the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but in many cases, the former is a better implementation manner. Based on such understanding, the technical solutions of the present specification may be embodied in the form of a software product, which is stored in a storage medium (such as ROM/RAM, magnetic disk, optical disk) and includes instructions for enabling a terminal (such as a mobile phone, a computer, a server, an air conditioner, or a network device) to execute the above methods of the embodiments of the present specification.
While the embodiments of the present disclosure have been described with reference to the accompanying drawings, the present disclosure is not limited to the above-described embodiments, which are illustrative rather than limiting; those skilled in the art may make various modifications and changes without departing from the spirit of the disclosure and the scope of the appended claims. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present specification shall fall within the scope of the claims of the present specification.

Claims (20)

1. A method of model training, the method comprising:
obtaining a plurality of training samples, wherein the training samples comprise a character sequence formed by a plurality of characters;
inputting the training samples into a first model, predicting the character at each character position in the character sequence corresponding to the training samples, and determining a first prediction probability that the character at each character position in the character sequence corresponding to the training samples is a preset character; inputting the training samples into a second model, predicting the character at each character position in the character sequence corresponding to the training samples, and determining a second prediction probability that the character at each character position in the character sequence corresponding to the training samples is a preset character, wherein the first model and the second model are derived from the same reference model;
selecting a first number of training samples from the plurality of training samples based on the first prediction probability and label information of the training samples; selecting a second number of training samples from the plurality of training samples based on the second prediction probability and the label information of the training samples;
training the second model based on the first number of training samples, and training the first model based on the second number of training samples; and if the trained first model and/or the trained second model does not meet the corresponding convergence condition, re-executing the steps of obtaining a plurality of training samples and training the first model and the second model until the trained first model and/or the trained second model meets the corresponding convergence condition.
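For illustration only and not as a limitation of the claims: the cross-training of claim 1 follows the co-teaching pattern cited in the non-patent references below. The following is a minimal, framework-neutral Python sketch; every interface in it (loader.sample, predict_proba, train_step, converged, select_confident) is a hypothetical placeholder, not an API defined by this specification.

```python
# Hedged sketch of the cross-training of claim 1 (co-teaching style).
# All interfaces are hypothetical placeholders; select_confident is
# sketched after claim 4 below.
import copy

def cross_train(reference_model, loader, num_select, max_rounds=1000):
    # Claim 1: the first and second models are derived from the same
    # reference model, e.g. as two independently updated copies.
    model_a = copy.deepcopy(reference_model)
    model_b = copy.deepcopy(reference_model)
    for _ in range(max_rounds):
        batch = loader.sample()                    # a plurality of training samples
        probs_a = model_a.predict_proba(batch.x)   # first prediction probability
        probs_b = model_b.predict_proba(batch.x)   # second prediction probability
        # Each model nominates the samples whose labels it finds most credible.
        picked_a = select_confident(probs_a, batch.labels, num_select)
        picked_b = select_confident(probs_b, batch.labels, num_select)
        # Cross-update: samples selected via the first model train the second
        # model and vice versa, so noisy labels trusted by one model can
        # still be filtered out by the other.
        model_b.train_step(batch.subset(picked_a))
        model_a.train_step(batch.subset(picked_b))
        if converged(model_a) and converged(model_b):
            break
    return model_a, model_b
```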
2. The method of claim 1, wherein selecting a first number of training samples from the plurality of training samples based on the first prediction probability and the label information of the training samples comprises:
determining, based on the label information of the training sample, that the character at each character position in the character sequence corresponding to the training sample is a first character;
determining a confidence of the character sequence corresponding to the training sample based on the character at each character position being the first character and on the first prediction probability, wherein the confidence represents the accuracy of the label information of the training sample;
and selecting a first number of training samples from the plurality of training samples based on the confidence of the character sequence corresponding to each training sample.
3. The method of claim 2, wherein selecting a first number of training samples from the plurality of training samples based on the confidence of the character sequence corresponding to each training sample comprises:
determining, from the plurality of training samples, the training samples whose confidence is greater than a preset confidence threshold, and selecting a first number of training samples from the training samples whose confidence is greater than the preset confidence threshold.
4. The method according to any one of claims 2 to 3, wherein determining the confidence of the character sequence corresponding to the training sample based on the character at each character position being the first character and on the first prediction probability comprises:
for a target character position in the character sequence corresponding to the training sample, obtaining, from the first prediction probability, a character prediction probability that the character at the target character position is the first character, wherein the target character position is any character position in the character sequence corresponding to the training sample;
and taking the product of the character prediction probabilities over the plurality of character positions in the character sequence corresponding to the training sample as the confidence of the character sequence corresponding to the training sample.
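For illustration only: claims 2 to 4 score each sample by the product, over character positions, of the predicted probability of the labelled character, and keep only samples whose score exceeds a threshold. A minimal NumPy sketch follows; the array shapes, the 0.5 default threshold, and the top-k ranking are assumptions, since the claims leave them open.

```python
import numpy as np

def select_confident(probs, labels, num_select, threshold=0.5):
    # probs:  (batch, seq_len, vocab) per-position character distributions,
    #         i.e. the prediction probabilities of claim 1
    # labels: (batch, seq_len) integer ids of the labelled "first" characters
    per_pos = np.take_along_axis(probs, labels[..., None], axis=2).squeeze(-1)
    confidence = per_pos.prod(axis=1)                  # claim 4: product over positions
    eligible = np.flatnonzero(confidence > threshold)  # claim 3: threshold filter
    ranked = eligible[np.argsort(-confidence[eligible])]
    return ranked[:num_select]                         # the "first number" of samples
```

A product of per-position probabilities shrinks quickly as sequences grow longer, so in practice the threshold would be tuned to the sequence length; the claims do not fix its value.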
5. The method of claim 1, wherein the characters included in the character sequence comprise one or more of Chinese characters, letters, numbers, symbols, and graphics.
6. The method of claim 1, wherein the reference model is a model constructed based on one or more different preset neural network algorithms.
7. The method of claim 1, wherein the training samples further comprise noise data unrelated to the character sequence, the noise data comprising one or more of Chinese characters, letters, numbers, symbols, graphics, and lines.
8. A method of model training, the method comprising:
obtaining a plurality of training samples, wherein the training samples comprise a character sequence formed by a plurality of characters;
inputting the training samples into a first model, predicting the character at each character position in the character sequence corresponding to the training samples, and determining a first prediction probability that the character at each character position in the character sequence corresponding to the training samples is a preset character;
selecting a first number of training samples from the plurality of training samples based on the first prediction probability and label information of the training samples;
and training the first model based on the first number of training samples; if the trained first model does not meet the corresponding convergence condition, re-executing the steps of obtaining a plurality of training samples and training the first model until the trained first model meets the corresponding convergence condition.
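For illustration only: claim 8 is the single-model counterpart of claim 1, with self-selection instead of cross-selection. A short sketch reusing the hypothetical interfaces from the sketches above:

```python
def self_train(reference_model, loader, num_select, max_rounds=1000):
    # Claim 8: one model iteratively retrains on the subset of samples
    # whose labels it itself rates as most credible.
    model = copy.deepcopy(reference_model)
    for _ in range(max_rounds):
        batch = loader.sample()
        probs = model.predict_proba(batch.x)
        picked = select_confident(probs, batch.labels, num_select)
        model.train_step(batch.subset(picked))
        if converged(model):
            break
    return model
```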
9. A method of data identification, the method comprising:
acquiring data to be identified, wherein the data comprises a character sequence formed by a plurality of characters;
inputting the data to be identified into a first model or a second model, and outputting the character sequence corresponding to the data, wherein the first model and the second model are models pre-trained with training samples; the training process of the first model and the second model comprises:
obtaining a plurality of training samples, wherein the training samples comprise a character sequence formed by a plurality of characters; inputting the training samples into the first model, predicting the character at each character position in the character sequence corresponding to the training samples, and determining a first prediction probability that the character at each character position in the character sequence corresponding to the training samples is a preset character; inputting the training samples into the second model, predicting the character at each character position in the character sequence corresponding to the training samples, and determining a second prediction probability that the character at each character position in the character sequence corresponding to the training samples is a preset character, wherein the first model and the second model are derived from the same reference model; selecting a first number of training samples from the plurality of training samples based on the first prediction probability and the label information of the training samples; selecting a second number of training samples from the plurality of training samples based on the second prediction probability and the label information of the training samples; training the second model based on the first number of training samples, and training the first model based on the second number of training samples; and if the trained first model and/or the trained second model does not meet the corresponding convergence condition, re-executing the steps of obtaining a plurality of training samples and training the first model and the second model until the trained first model and/or the trained second model meets the corresponding convergence condition;
or: obtaining a plurality of training samples, wherein the training samples comprise a character sequence formed by a plurality of characters; inputting the training samples into the first model or the second model, predicting the character at each character position in the character sequence corresponding to the training samples, and determining a first prediction probability that the character at each character position in the character sequence corresponding to the training samples is a preset character; selecting a first number of training samples from the plurality of training samples based on the first prediction probability and the label information of the training samples; and training the first model or the second model based on the first number of training samples; if the trained first model or the second model does not meet the corresponding convergence condition, re-executing the steps of obtaining a plurality of training samples and training the first model or the second model until the trained first model or the second model meets the corresponding convergence condition.
10. The method of claim 9, wherein the data to be identified further comprises noise data unrelated to the character sequence, the noise data comprising one or more of Chinese characters, letters, numbers, symbols, graphics, and lines.
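For illustration only: at recognition time (claims 9 and 10), either trained model maps the data to be identified to its character sequence. A per-position argmax decode is one plausible reading; VOCAB, a table from class index to character, is a hypothetical helper.

```python
def recognize(model, sample):
    # Claims 9-10: output the character sequence for the data to be
    # identified; a well-trained model is expected to ignore noise data.
    probs = model.predict_proba(sample[None])[0]   # (seq_len, vocab)
    return "".join(VOCAB[i] for i in probs.argmax(axis=-1))
```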
11. A model training method, applied to a blockchain system, the method comprising:
receiving model training rule information sent by a first device, generating a first smart contract based on the model training rule information, and deploying the first smart contract in the blockchain system;
when a model training request sent by the first device is acquired, executing the following processing based on the first smart contract:
obtaining a plurality of training samples, wherein the training samples comprise a character sequence formed by a plurality of characters;
inputting the training samples into a first model pre-trained based on the first smart contract, predicting the character at each character position in the character sequence corresponding to the training samples, and determining a first prediction probability that the character at each character position in the character sequence corresponding to the training samples is a preset character; inputting the training samples into a second model pre-trained based on the first smart contract, predicting the character at each character position in the character sequence corresponding to the training samples, and determining a second prediction probability that the character at each character position in the character sequence corresponding to the training samples is a preset character, wherein the first model and the second model are derived from the same reference model;
selecting a first number of training samples from the plurality of training samples based on the first prediction probability and label information of the training samples; selecting a second number of training samples from the plurality of training samples based on the second prediction probability and the label information of the training samples;
training the second model based on the first number of training samples, and training the first model based on the second number of training samples; and if the trained first model and/or the trained second model does not meet the corresponding convergence condition, re-executing the steps of obtaining a plurality of training samples and training the first model and the second model until the trained first model and/or the trained second model meets the corresponding convergence condition.
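For illustration only: claim 11 wraps the training of claim 1 in a smart contract deployed on a blockchain system. Concrete contract interfaces are platform-specific, so the sketch below is purely schematic; TrainingContract, rule_info, and the request handler are hypothetical names.

```python
class TrainingContract:
    # Schematic stand-in for the "first smart contract" of claim 11.
    def __init__(self, rule_info):
        # Model training rule information received from the first device.
        self.rule_info = rule_info

    def handle_training_request(self, reference_model, loader):
        # On a model training request from the first device, run the
        # cross-training of claim 1 under the deployed rules.
        return cross_train(reference_model, loader,
                           num_select=self.rule_info["num_select"])
```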
12. A model training apparatus, the apparatus comprising:
the device comprises a first acquisition module, a second acquisition module and a third acquisition module, wherein the first acquisition module acquires a plurality of training samples, and the training samples comprise character sequences formed by a plurality of characters;
the first processing module is used for inputting the training samples into a first model, predicting the character of each character bit in the character sequence corresponding to the training samples and determining a first prediction probability that the character of each character bit in the character sequence corresponding to the training samples is a preset character; inputting the training samples into a second model, predicting the character of each character position in the character sequence corresponding to the training samples, and determining a second prediction probability that the character of each character position in the character sequence corresponding to the training samples is a preset character, wherein the first model and the second model are determined by the same reference model;
a first selection module, configured to select a first number of training samples from the plurality of training samples based on the first prediction probability and the label information of the training samples; selecting a second number of training samples from the plurality of training samples based on the second prediction probability and the label information of the training samples;
a first training module to train the second model based on the first number of training samples and to train the first model based on the second number of training samples; and if the trained first model and/or the trained second model do not meet the corresponding convergence condition, re-executing the steps of obtaining a plurality of training samples and training the first model and the second model until the trained first model and/or the trained second model meet the corresponding convergence condition.
13. A model training apparatus, the apparatus comprising:
a second acquisition module, configured to acquire a plurality of training samples, wherein the training samples comprise a character sequence formed by a plurality of characters;
a second processing module, configured to input the training samples into a first model, predict the character at each character position in the character sequence corresponding to the training samples, and determine a first prediction probability that the character at each character position in the character sequence corresponding to the training samples is a preset character;
a second selection module, configured to select a first number of training samples from the plurality of training samples based on the first prediction probability and the label information of the training samples;
and a second training module, configured to train the first model based on the first number of training samples, and, if the trained first model does not meet the corresponding convergence condition, to re-execute the steps of obtaining a plurality of training samples and training the first model until the trained first model meets the corresponding convergence condition.
14. A data recognition apparatus, the apparatus comprising:
a third acquisition module, configured to acquire data to be identified, wherein the data comprises a character sequence formed by a plurality of characters;
an output module, configured to input the data to be identified into a first model or a second model and to output the character sequence corresponding to the data, wherein the first model and the second model are models pre-trained with training samples; the training process of the first model and the second model comprises:
obtaining a plurality of training samples, wherein the training samples comprise a character sequence formed by a plurality of characters; inputting the training samples into the first model, predicting the character at each character position in the character sequence corresponding to the training samples, and determining a first prediction probability that the character at each character position in the character sequence corresponding to the training samples is a preset character; inputting the training samples into the second model, predicting the character at each character position in the character sequence corresponding to the training samples, and determining a second prediction probability that the character at each character position in the character sequence corresponding to the training samples is a preset character, wherein the first model and the second model are derived from the same reference model; selecting a first number of training samples from the plurality of training samples based on the first prediction probability and the label information of the training samples; selecting a second number of training samples from the plurality of training samples based on the second prediction probability and the label information of the training samples; training the second model based on the first number of training samples, and training the first model based on the second number of training samples; and if the trained first model and/or the trained second model does not meet the corresponding convergence condition, re-executing the steps of obtaining a plurality of training samples and training the first model and the second model until the trained first model and/or the trained second model meets the corresponding convergence condition;
or: obtaining a plurality of training samples, wherein the training samples comprise a character sequence formed by a plurality of characters; inputting the training samples into the first model or the second model, predicting the character at each character position in the character sequence corresponding to the training samples, and determining a first prediction probability that the character at each character position in the character sequence corresponding to the training samples is a preset character; selecting a first number of training samples from the plurality of training samples based on the first prediction probability and the label information of the training samples; and training the first model or the second model based on the first number of training samples; if the trained first model or the second model does not meet the corresponding convergence condition, re-executing the steps of obtaining a plurality of training samples and training the first model or the second model until the trained first model or the second model meets the corresponding convergence condition.
15. A model training apparatus, the apparatus comprising:
a processor; and
a memory arranged to store computer executable instructions that, when executed, cause the processor to:
obtaining a plurality of training samples, wherein the training samples comprise a character sequence formed by a plurality of characters;
inputting the training samples into a first model, predicting the character at each character position in the character sequence corresponding to the training samples, and determining a first prediction probability that the character at each character position in the character sequence corresponding to the training samples is a preset character; inputting the training samples into a second model, predicting the character at each character position in the character sequence corresponding to the training samples, and determining a second prediction probability that the character at each character position in the character sequence corresponding to the training samples is a preset character, wherein the first model and the second model are derived from the same reference model;
selecting a first number of training samples from the plurality of training samples based on the first prediction probability and label information of the training samples; selecting a second number of training samples from the plurality of training samples based on the second prediction probability and the label information of the training samples;
training the second model based on the first number of training samples, and training the first model based on the second number of training samples; and if the trained first model and/or the trained second model does not meet the corresponding convergence condition, re-executing the steps of obtaining a plurality of training samples and training the first model and the second model until the trained first model and/or the trained second model meets the corresponding convergence condition.
16. A model training apparatus, the apparatus comprising:
a processor; and
a memory arranged to store computer executable instructions that, when executed, cause the processor to:
obtaining a plurality of training samples, wherein the training samples comprise a character sequence formed by a plurality of characters;
inputting the training samples into a first model, predicting the character at each character position in the character sequence corresponding to the training samples, and determining a first prediction probability that the character at each character position in the character sequence corresponding to the training samples is a preset character;
selecting a first number of training samples from the plurality of training samples based on the first prediction probability and label information of the training samples;
and training the first model based on the first number of training samples; if the trained first model does not meet the corresponding convergence condition, re-executing the steps of obtaining a plurality of training samples and training the first model until the trained first model meets the corresponding convergence condition.
17. A data recognition device, the device comprising:
a processor; and
a memory arranged to store computer executable instructions that, when executed, cause the processor to:
acquiring data to be identified, wherein the data comprises a character sequence formed by a plurality of characters;
inputting the data to be identified into a first model or a second model, and outputting the character sequence corresponding to the data, wherein the first model and the second model are models pre-trained with training samples; the training process of the first model and the second model comprises:
obtaining a plurality of training samples, wherein the training samples comprise a character sequence formed by a plurality of characters; inputting the training samples into the first model, predicting the character at each character position in the character sequence corresponding to the training samples, and determining a first prediction probability that the character at each character position in the character sequence corresponding to the training samples is a preset character; inputting the training samples into the second model, predicting the character at each character position in the character sequence corresponding to the training samples, and determining a second prediction probability that the character at each character position in the character sequence corresponding to the training samples is a preset character, wherein the first model and the second model are derived from the same reference model; selecting a first number of training samples from the plurality of training samples based on the first prediction probability and the label information of the training samples; selecting a second number of training samples from the plurality of training samples based on the second prediction probability and the label information of the training samples; training the second model based on the first number of training samples, and training the first model based on the second number of training samples; and if the trained first model and/or the trained second model does not meet the corresponding convergence condition, re-executing the steps of obtaining a plurality of training samples and training the first model and the second model until the trained first model and/or the trained second model meets the corresponding convergence condition;
or: obtaining a plurality of training samples, wherein the training samples comprise a character sequence formed by a plurality of characters; inputting the training samples into the first model or the second model, predicting the character at each character position in the character sequence corresponding to the training samples, and determining a first prediction probability that the character at each character position in the character sequence corresponding to the training samples is a preset character; selecting a first number of training samples from the plurality of training samples based on the first prediction probability and the label information of the training samples; and training the first model or the second model based on the first number of training samples; if the trained first model or the second model does not meet the corresponding convergence condition, re-executing the steps of obtaining a plurality of training samples and training the first model or the second model until the trained first model or the second model meets the corresponding convergence condition.
18. A storage medium for storing computer-executable instructions, which when executed implement the following:
obtaining a plurality of training samples, wherein the training samples comprise a character sequence formed by a plurality of characters;
inputting the training samples into a first model, predicting the character at each character position in the character sequence corresponding to the training samples, and determining a first prediction probability that the character at each character position in the character sequence corresponding to the training samples is a preset character; inputting the training samples into a second model, predicting the character at each character position in the character sequence corresponding to the training samples, and determining a second prediction probability that the character at each character position in the character sequence corresponding to the training samples is a preset character, wherein the first model and the second model are derived from the same reference model;
selecting a first number of training samples from the plurality of training samples based on the first prediction probability and label information of the training samples; selecting a second number of training samples from the plurality of training samples based on the second prediction probability and the label information of the training samples;
training the second model based on the first number of training samples, and training the first model based on the second number of training samples; and if the trained first model and/or the trained second model does not meet the corresponding convergence condition, re-executing the steps of obtaining a plurality of training samples and training the first model and the second model until the trained first model and/or the trained second model meets the corresponding convergence condition.
19. A storage medium for storing computer-executable instructions, which when executed implement the following:
obtaining a plurality of training samples, wherein the training samples comprise a character sequence formed by a plurality of characters;
inputting the training samples into a first model, predicting the character at each character position in the character sequence corresponding to the training samples, and determining a first prediction probability that the character at each character position in the character sequence corresponding to the training samples is a preset character;
selecting a first number of training samples from the plurality of training samples based on the first prediction probability and label information of the training samples;
and training the first model based on the first number of training samples; if the trained first model does not meet the corresponding convergence condition, re-executing the steps of obtaining a plurality of training samples and training the first model until the trained first model meets the corresponding convergence condition.
20. A storage medium for storing computer-executable instructions, which when executed implement the following:
acquiring data to be identified, wherein the data comprises a character sequence formed by a plurality of characters;
inputting the data to be identified into a first model or a second model, and outputting the character sequence corresponding to the data, wherein the first model and the second model are models pre-trained with training samples; the training process of the first model and the second model comprises:
obtaining a plurality of training samples, wherein the training samples comprise a character sequence formed by a plurality of characters; inputting the training samples into the first model, predicting the character at each character position in the character sequence corresponding to the training samples, and determining a first prediction probability that the character at each character position in the character sequence corresponding to the training samples is a preset character; inputting the training samples into the second model, predicting the character at each character position in the character sequence corresponding to the training samples, and determining a second prediction probability that the character at each character position in the character sequence corresponding to the training samples is a preset character, wherein the first model and the second model are derived from the same reference model; selecting a first number of training samples from the plurality of training samples based on the first prediction probability and the label information of the training samples; selecting a second number of training samples from the plurality of training samples based on the second prediction probability and the label information of the training samples; training the second model based on the first number of training samples, and training the first model based on the second number of training samples; and if the trained first model and/or the trained second model does not meet the corresponding convergence condition, re-executing the steps of obtaining a plurality of training samples and training the first model and the second model until the trained first model and/or the trained second model meets the corresponding convergence condition;
or: obtaining a plurality of training samples, wherein the training samples comprise a character sequence formed by a plurality of characters; inputting the training samples into the first model or the second model, predicting the character at each character position in the character sequence corresponding to the training samples, and determining a first prediction probability that the character at each character position in the character sequence corresponding to the training samples is a preset character; selecting a first number of training samples from the plurality of training samples based on the first prediction probability and the label information of the training samples; and training the first model or the second model based on the first number of training samples; if the trained first model or the second model does not meet the corresponding convergence condition, re-executing the steps of obtaining a plurality of training samples and training the first model or the second model until the trained first model or the second model meets the corresponding convergence condition.

Priority Applications (1)

Application Number: CN202210028772.0A | Priority Date: 2022-01-11 | Filing Date: 2022-01-11 | Title: Model training method, data identification method, device and equipment

Publications (1)

Publication Number: CN114417987A (en) | Publication Date: 2022-04-29

Family

ID=81274191

Family Applications (1)

Application Number: CN202210028772.0A (Pending) | Title: Model training method, data identification method, device and equipment | Priority Date: 2022-01-11 | Filing Date: 2022-01-11

Country Status (1)

Country: CN | CN114417987A (en)


Patent Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106228980A (en) * 2016-07-21 2016-12-14 百度在线网络技术(北京)有限公司 Data processing method and device
CN109189767A (en) * 2018-08-01 2019-01-11 北京三快在线科技有限公司 Data processing method, device, electronic equipment and storage medium
US20210241097A1 (en) * 2019-11-07 2021-08-05 Canon Kabushiki Kaisha Method and Apparatus for training an object recognition model
CN110991520A (en) * 2019-11-29 2020-04-10 汉海信息技术(上海)有限公司 Method and device for generating training sample
US20210256420A1 (en) * 2020-02-19 2021-08-19 Microsoft Technology Licensing, Llc System and method for improving machine learning models by detecting and removing inaccurate training data
CN111209377A (en) * 2020-04-23 2020-05-29 腾讯科技(深圳)有限公司 Text processing method, device, equipment and medium based on deep learning
CN112149825A (en) * 2020-09-24 2020-12-29 创新奇智(上海)科技有限公司 Neural network model training method and device, electronic equipment and storage medium
CN112329470A (en) * 2020-11-09 2021-02-05 北京中科闻歌科技股份有限公司 Intelligent address identification method and device based on end-to-end model training
CN113761925A (en) * 2021-07-23 2021-12-07 中国科学院自动化研究所 Named entity identification method, device and equipment based on noise perception mechanism
CN113470031A (en) * 2021-09-03 2021-10-01 北京字节跳动网络技术有限公司 Polyp classification method, model training method and related device

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Bo Han et al.: "Co-teaching: Robust Training of Deep Neural Networks with Extremely Noisy Labels", arXiv:1804.06872v2, 21 May 2018, pages 1-12 *
Gong Chen: "A Survey of Research on Label-Noise Robust Learning Algorithms", Aero Weaponry (《航空兵器》), 30 June 2020 *


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination