CN117351501A - Information input method, device, equipment and storage medium - Google Patents


Publication number
CN117351501A
CN117351501A
Authority
CN
China
Prior art keywords
content
picture
preset
target
information
Prior art date
Legal status
Pending
Application number
CN202311319172.0A
Other languages
Chinese (zh)
Inventor
欧阳高询
Current Assignee
Ping An Life Insurance Company of China Ltd
Original Assignee
Ping An Life Insurance Company of China Ltd
Priority date
Filing date
Publication date
Application filed by Ping An Life Insurance Company of China Ltd filed Critical Ping An Life Insurance Company of China Ltd
Priority to CN202311319172.0A priority Critical patent/CN117351501A/en
Publication of CN117351501A publication Critical patent/CN117351501A/en
Pending legal-status Critical Current

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 - Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/40 - Document-oriented image-based pattern recognition
    • G06V30/41 - Analysis of document content
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 - Handling natural language data
    • G06F40/30 - Semantic analysis
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 - Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 - Character recognition
    • G06V30/19 - Recognition using electronic means
    • G06V30/19007 - Matching; Proximity measures
    • G06V30/19013 - Comparing pixel values or logical combinations thereof, or feature values having positional relevance, e.g. template matching
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 - Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 - Character recognition
    • G06V30/19 - Recognition using electronic means
    • G06V30/191 - Design or setup of recognition systems or techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
    • G06V30/19147 - Obtaining sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • Y - GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 - TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D - CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 - Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Information Retrieval; Database Structures and File System Structures Therefor (AREA)

Abstract

The invention relates to the technical field of information processing, and in particular to an information input method. The method classifies acquired pictures to be processed through a preset classification model to obtain classified pictures corresponding to each classification result; performs moiré removal on the classified pictures through a moiré removal model to obtain target pictures; performs content recognition on the target pictures to obtain picture content; performs structured processing on the picture content to obtain structured content; performs text error correction on the structured content to obtain corrected content; and extracts information from the corrected content to obtain target content, determining that information entry has succeeded when the target content meets a preset requirement. The invention is applied to information entry in financial or insurance business. By structuring the picture content and correcting the recognized characters, the invention automatically orders the picture content correctly and corrects recognition errors, thereby improving subsequent auditing efficiency. Removing moiré from the pictures also improves picture quality.

Description

Information input method, device, equipment and storage medium
Technical Field
The present invention relates to the field of information processing technologies, and in particular to an information input method, device, equipment, and storage medium.
Background
With the continuous development of society, enterprise business is growing rapidly, and ever more document information needs to be recorded in computers. For example, in financial institutions such as banks, securities firms, and insurers, business volume keeps expanding and a large amount of business information must be entered into computer systems. Existing OCR techniques recognize samples well under ideal conditions, but in real-world information entry their accuracy degrades for text in complex scenes such as noisy backgrounds, moiré, or seal and watermark interference; considerable manual intervention on the original picture text is then needed to achieve acceptable results. Moreover, when pictures are not uploaded in order, or some pictures fail to meet requirements and are returned for re-upload, the recognized content becomes disordered, and the content of consecutive pictures must be manually sorted and corrected before it can be submitted for auditing. This lengthens the information entry period, increases labor cost, and degrades the customer experience.
Disclosure of Invention
The embodiments of the present invention provide an information input method, device, equipment, and storage medium, to solve the prior-art problem that disordered picture recognition content must be manually reordered and corrected, which lengthens the information entry period.
An information entry method, comprising:
acquiring at least one picture to be processed, and classifying each picture to be processed through a preset classification model to obtain a classification picture corresponding to each classification result;
obtaining a moiré removal model, and performing moiré removal on each classified picture through the moiré removal model to obtain a target picture;
performing content identification on each target picture to obtain picture content corresponding to each target picture;
performing structured processing on each picture content to obtain structured content corresponding to each picture content;
performing text error correction on each structured content to obtain correction content corresponding to each structured content;
and extracting information from each correction content to obtain target content, and determining that the information is successfully input when the target content meets the preset requirement.
An information entry device, comprising:
the image screening module is used for acquiring at least one image to be processed, classifying each image to be processed through a preset classification model, and obtaining classified images corresponding to each classification result;
the moiré removal module is used for obtaining a moiré removal model, and performing moiré removal on each classified picture through the moiré removal model to obtain a target picture;
the content identification module is used for performing content recognition on each target picture to obtain picture content corresponding to each target picture;
the structured processing module is used for performing structured processing on each picture content to obtain structured content corresponding to each picture content;
the text error correction module is used for performing text error correction on each structured content to obtain correction content corresponding to each structured content;
and the information extraction module is used for extracting information from each correction content to obtain target content, and determining that the information input is successful when the target content meets the preset requirement.
A computer device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, the processor implementing the above-described information entry method when executing the computer program.
A computer readable storage medium storing a computer program which when executed by a processor implements the above-described information entry method.
The invention provides an information input method, device, equipment, and storage medium. The moiré removal model removes moiré from the pictures, which improves picture quality in financial or insurance business and thereby improves the accuracy of subsequent picture recognition. Structured processing and text error correction of the recognized picture content achieve correct ordering and error correction of picture content in financial or insurance business, which improves auditing efficiency, shortens the information entry period, and reduces labor cost. Finally, information extraction and verification of the corrected picture content ensure that the extracted target content meets the requirements, guaranteeing successful information entry in financial or insurance business.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings that are needed in the description of the embodiments of the present invention will be briefly described below, it being obvious that the drawings in the following description are only some embodiments of the present invention, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic view of an application environment of an information entry method according to an embodiment of the present invention;
FIG. 2 is a flow chart of a method of information entry in an embodiment of the invention;
FIG. 3 is a flowchart of the information entry method step S40 according to an embodiment of the present invention;
FIG. 4 is a flowchart of the information entry method step S50 according to an embodiment of the present invention;
FIG. 5 is a schematic block diagram of an information entry device in an embodiment of the invention;
FIG. 6 is a schematic diagram of a computer device in accordance with an embodiment of the invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are some, but not all embodiments of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
The information input method provided by the embodiments of the present invention can be applied in the application environment shown in FIG. 1. Specifically, the method is applied to an information input system comprising a client and a server as shown in FIG. 1; the client and the server communicate over a network, and together solve the prior-art problem that disordered picture recognition content must be reordered and corrected, which lengthens the information entry period. The server may be an independent server, or a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communications, middleware services, domain name services, security services, content delivery networks (Content Delivery Network, CDN), big data, and artificial intelligence platforms. The client, also called the user end, is the program that works with the server to provide classification services for users, and may be installed on, but is not limited to, computers, notebook computers, smartphones, tablet computers, and portable wearable devices.
In one embodiment, as shown in fig. 2, an information input method is provided, and the method is applied to the server in fig. 1 for illustration, and includes the following steps:
S10: at least one picture to be processed is obtained, and each picture to be processed is classified through a preset classification model, so that a classification picture corresponding to each classification result is obtained.
Understandably, a picture to be processed is an uploaded picture whose information needs to be entered. The preset classification model distinguishes picture types, such as inpatient medical records, examination reports, expense bills, or invalid pictures. The classified pictures are the pictures, obtained after classification, that need moiré removal.
Specifically, at least one uploaded picture to be processed is obtained, and a preset classification model is invoked. All pictures to be processed are input into the preset classification model, which classifies each picture: the title in each picture is recognized, and picture clustering is performed with the title-bearing picture content as the cluster center, so that all pictures related to each cluster center are obtained and the classified pictures among all pictures to be processed can be determined. For example, in an insurance claim scene, the identity card pictures, insurance data pictures, accident responsibility confirmations, medical expense lists, and medical record data uploaded by the user are classified: the identity card pictures, insurance data pictures, and accident responsibility confirmations become proof-class pictures; the medical expense lists become bill-class pictures; and the medical record data become medical-class pictures. If a mistakenly uploaded self-shot or landscape picture is identified, it is deleted.
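The patent does not disclose the classification model's internals, so the grouping step above can only be sketched under assumptions. The sketch below groups pictures by keywords found in their recognized titles; the category names and keyword lists are illustrative, not from the patent.

```python
# Hypothetical sketch: group pictures by recognized title keywords, the way the
# preset classification model clusters pictures around title-bearing pages.
# Category names and keywords are illustrative assumptions.
CATEGORY_KEYWORDS = {
    "proof": ["identity card", "insurance", "accident responsibility"],
    "bill": ["expense list", "fee"],
    "medical": ["medical record", "examination report"],
}

def classify_by_title(title_text: str) -> str:
    """Return the category whose keyword appears in the title, else 'invalid'."""
    lowered = title_text.lower()
    for category, keywords in CATEGORY_KEYWORDS.items():
        if any(kw in lowered for kw in keywords):
            return category
    return "invalid"  # e.g. self-shot or landscape pictures

def group_pictures(titles: list[str]) -> dict[str, list[str]]:
    groups: dict[str, list[str]] = {}
    for title in titles:
        category = classify_by_title(title)
        if category != "invalid":  # invalid pictures are deleted
            groups.setdefault(category, []).append(title)
    return groups
```

In practice the patent's model is trained (see steps S101-S104 below) rather than rule-based; this keyword version only illustrates the grouping behavior.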
S20: and obtaining a mole pattern removing model, and carrying out mole pattern removing on each classified picture through the mole pattern removing model to obtain a target picture.
Understandably, moiré is high-frequency interference produced by the photosensitive element of a digital camera, scanner, or similar device; it appears as irregular, often colored high-frequency stripes in a picture. The target picture is a picture that contains no moiré. The moiré removal model is trained on a large amount of sample data together with an adversarial loss function.
Specifically, the moiré removal model is invoked, all classified pictures are input into it, and moiré removal is performed on each classified picture to obtain the corresponding target pictures. Multi-scale feature extraction may be performed on each classified picture, that is, each classified picture is downsampled with a U-Net network to obtain multi-scale features. A wavelet transform or Fourier transform is then applied to the multi-scale features to remove the moiré, yielding target features, and an inverse transform produces the target picture corresponding to each classified picture. For example, during a hospital stay, information such as an examination report may be displayed on a computer screen and photographed for upload; a photograph of a screen generally contains moiré, so the uploaded picture must be de-moiréd, and passing it through the trained moiré removal model yields a picture free of moiré.
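The transform-filter-invert idea above can be illustrated in one dimension (an assumption for clarity; the patent's pipeline is a 2-D U-Net with wavelet or Fourier transforms): moiré appears as a narrow high-frequency spike, so transforming to the frequency domain, zeroing that band, and inverse-transforming suppresses the stripes.

```python
import cmath

def dft(signal):
    """Discrete Fourier transform of a real-valued list."""
    n = len(signal)
    return [sum(signal[t] * cmath.exp(-2j * cmath.pi * k * t / n)
                for t in range(n)) for k in range(n)]

def idft(spectrum):
    """Inverse DFT, returning the real part of each sample."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n)
                for k in range(n)).real / n for t in range(n)]

def remove_moire_1d(signal, cutoff):
    """Zero all frequency bins at or above `cutoff` (and their mirrors)."""
    spectrum = dft(signal)
    n = len(spectrum)
    for k in range(n):
        freq = min(k, n - k)  # two-sided spectrum
        if freq >= cutoff:
            spectrum[k] = 0
    return idft(spectrum)
```

Real moiré removal filters selectively rather than with a hard cutoff, since document strokes also contain high frequencies; that is what the learned model handles.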
S30: performing content recognition on each target picture to obtain picture content corresponding to each target picture.
The picture content is understandably text content in the target picture.
Specifically, after the target pictures are obtained, content recognition is performed on each target picture. The recognition may be performed directly with OCR technology or through a preset recognition model: the target picture is divided into text lines to obtain a text box for each text line, the position of the text line is marked in each text box, the text line in each marked text box is recognized, and the results are sorted by marked position to obtain the picture content corresponding to each target picture. Before recognition, each target picture is preprocessed, that is, content recognition is performed after preprocessing such as binarization, noise removal, and tilt correction. For example, when the target picture is an examination report, text box 1 corresponds to text information 1, text box 2 to text information 2, and text box 3 to text information 3; recognizing the content in each text box yields the text content of the text line in that box, and sorting the recognized text contents in text-box order yields the picture content corresponding to each target picture.
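The sort-by-marked-position step above can be sketched as follows; the box representation (an x, y coordinate plus the recognized text) and the row tolerance are illustrative assumptions.

```python
# Minimal sketch of the box-ordering step: each recognized text line carries the
# marked position of its box, and lines are ordered top-to-bottom, then
# left-to-right within a row.

def order_recognized_lines(boxes, row_tolerance=10):
    """boxes: list of (x, y, text). Returns the texts in reading order."""
    # Sort by vertical position first, snapping y to coarse rows so that boxes
    # on the same visual line are then ordered by x.
    ordered = sorted(boxes, key=lambda b: (b[1] // row_tolerance, b[0]))
    return [text for _, _, text in ordered]
```

The row snapping matters because OCR box coordinates on the same line rarely share an exact y value.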
S40: and carrying out structuring treatment on each picture content to obtain structured content corresponding to each picture content.
Understandably, the structured content is obtained by ordering all picture contents that belong to the same document.
Specifically, each picture content is structured as follows: for each target picture, the last text content recognized from the last line of text boxes and the first text content recognized from the first line of text boxes are extracted. Semantic analysis is then performed on the last and first text contents of different target pictures to obtain a semantic analysis result, and the picture contents are ordered according to that result to obtain the structured content. For example, in a claim scene, the survey results uploaded during an investigation are not necessarily uploaded in the correct order, so the recognized picture contents must be sorted. For pictures containing page numbers, the digits at the bottom of each picture are recognized and the pictures are sorted by numeric value to obtain the structured content; pictures without page numbers are ordered by semantic analysis.
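The page-number ordering rule above can be sketched as follows; the regex and the assumption that a page number, when present, ends the recognized text are illustrative.

```python
import re

# Sketch of the ordering rule: if each page's recognized text ends with a page
# number, sort numerically; otherwise fall back to semantic analysis.

def order_pages(pages):
    """pages: list of recognized page texts, possibly ending with a page number."""
    def page_no(text):
        match = re.search(r"(\d+)\s*$", text.strip())
        return int(match.group(1)) if match else None

    numbered = [(page_no(p), p) for p in pages]
    if all(n is not None for n, _ in numbered):
        return [p for _, p in sorted(numbered, key=lambda pair: pair[0])]
    # No usable page numbers: the patent falls back to semantic analysis (not
    # sketched here), e.g. matching the last text content of one page against
    # the first text content of another.
    return pages
```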
S50: and performing text error correction on each structured content to obtain correction content corresponding to each structured content.
Correction content is understood to mean content that is corrected for erroneous text identified in the structured content.
Specifically, feature extraction is performed on the characters in the structured content, that is, the features of each character are extracted to obtain the corresponding character features. Error prediction is performed on all character features to predict the erroneous characters in the structured content, yielding at least one erroneous character. Pronunciation-and-shape similarity is then computed for each erroneous character against preset dictionary information to obtain at least one candidate character per erroneous character. All candidate characters are screened to obtain the target character for each erroneous character, and each erroneous character is replaced with its target character, yielding the correction content corresponding to each structured content.
S60: and extracting information from each correction content to obtain target content, and determining that the information is successfully input when the target content meets the preset requirement.
The target content is understood to mean the key content in the picture, such as the content of hospitalization medical history, hospitalization time, medication and the like.
Specifically, the title content in the correction content is determined from the recognized correction content, and template matching is performed on the title content to obtain the information extraction template corresponding to each title content; extracting information from the corresponding correction content through that template then yields the target content. Alternatively, the correction content is segmented into words to obtain target keywords, and the weight of each target keyword is computed as the product of its term frequency (TF) and inverse document frequency (IDF). All target keywords are screened by weight, and those with larger weights are selected as the target content. Further, a weight threshold and a preset number may be set: if the number of target keywords whose weight exceeds the threshold is greater than the preset number, the preset number of keywords with the largest weights are selected as the target content; if it is less than or equal to the preset number, all keywords whose weight exceeds the threshold are taken as the target content. Finally, the extracted target content is verified: it may be sent to the client for confirmation and the fed-back verification result received, or verified against a preset information template. When the target content meets the preset requirement, it is determined that the information has been entered successfully.
When the target content does not meet the preset requirement, it is determined that information entry has failed, and a prompt is sent. For example, in an insurance claim scene, the extracted information content is compared with the template information: when their similarity is greater than or equal to a threshold, the information meets the preset requirement and entry is deemed successful; when the similarity is below the threshold, the information does not meet the requirement and entry is deemed to have failed.
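The TF-IDF screening rule described for step S60 can be sketched as follows; the smoothed IDF formula and parameter values are illustrative assumptions, since the patent only states that TF and IDF are multiplied.

```python
import math
from collections import Counter

# Sketch of keyword screening: weight = TF * IDF, keep keywords whose weight
# exceeds a threshold, capped at a preset number.

def keyword_weights(doc_tokens, corpus):
    """corpus: list of token lists (reference documents for the IDF term)."""
    tf = Counter(doc_tokens)
    total = len(doc_tokens)
    n_docs = len(corpus)
    weights = {}
    for word, count in tf.items():
        df = sum(1 for doc in corpus if word in doc)
        idf = math.log((n_docs + 1) / (df + 1)) + 1  # smoothed IDF (assumption)
        weights[word] = (count / total) * idf
    return weights

def select_target_content(weights, threshold, preset_number):
    """Keywords above the weight threshold, largest first, at most preset_number."""
    above = [w for w in sorted(weights, key=weights.get, reverse=True)
             if weights[w] > threshold]
    return above[:preset_number]
```

Words common across the corpus (low IDF) are thus down-weighted relative to words frequent only in the document at hand, which is why the selected keywords tend to be the document-specific "target content."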
In the above information input method, classifying the pictures through the preset classification model ensures that the correct pictures are uploaded. The moiré removal model removes moiré from the pictures, improving picture quality in financial or insurance business and thereby the accuracy of subsequent recognition. Structured processing and text error correction of the recognized picture content achieve correct ordering and error correction of the picture content, which improves auditing efficiency, shortens the information entry period, and reduces labor cost. Finally, information extraction and verification of the corrected picture content ensure that the extracted target content meets the requirements, guaranteeing successful information entry in financial or insurance business.
In an embodiment, before step S10, that is, before classifying each of the pictures to be processed by the preset classification model, the method includes:
S101: acquiring a sample data set, wherein the sample data set comprises at least one sample data and a sample label corresponding to each sample data.
Understandably, the sample data are material pictures uploaded in historical data. The sample data set includes at least one sample data and a sample label corresponding to each sample data. For example, in an insurance claim scene, the identity card pictures, insurance data pictures, accident responsibility confirmations, medical expense lists, and medical record data uploaded by users are classified: identity card pictures, insurance data pictures, and accident responsibility confirmations are proof-class pictures; medical expense lists are bill-class pictures; and medical record data are medical-class pictures. The sample labels characterize the class of the sample data, e.g., proof-class, bill-class, or medical-class. A sample data set is then constructed from all sample data and their corresponding sample labels. Some negative samples may also be added when training the preset training model, thereby improving the accuracy of the preset classification model.
S102: acquiring a preset training model, and classifying all the sample data through the preset training model to obtain prediction labels.
Understandably, the prediction label is the category obtained by classifying the sample data with the preset training model.
Specifically, a preset training model is obtained, and all sample data and sample labels are input into it. The preset training model classifies all sample data, that is, it is iteratively trained on all sample data so that it learns to classify the categories of the sample data, yielding the prediction label corresponding to each sample data. For example, in an insurance claim scene, the data pictures are classified into landscape pictures, identity card pictures, accident responsibility confirmations, medical expense lists, and outpatient or emergency medical records, obtaining the category corresponding to each picture, where landscape and portrait pictures are invalid picture categories.
S103: determining a predicted loss value of the preset training model according to the sample label and the prediction label corresponding to the same sample data.
Understandably, the predicted loss value measures the discrepancy between the prediction labels and the sample labels that arises while classifying the sample training data.
Specifically, after the prediction labels are obtained, the prediction label of each sample data is arranged in the order of the sample data in the sample data set, and the sample label associated with each sample data is compared with the prediction label at the same position. That is, following the sample data order, the sample label of the first sample data is compared with its prediction label and the loss between them is computed through a loss function; the sample label of the second sample data is compared with its prediction label and the loss computed likewise; and so on until the losses of all sample labels against all prediction labels have been computed, yielding the predicted loss value.
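The position-by-position comparison above can be sketched as follows; the patent does not name the loss function, so per-sample cross-entropy averaged over the data set is an illustrative assumption, as is the representation of predictions as probability vectors.

```python
import math

# Sketch of the pairwise comparison: walk sample labels and predicted
# distributions in the same order, compute a per-pair cross-entropy loss, and
# average into the overall predicted loss value.

def prediction_loss(sample_labels, predicted_probs):
    """sample_labels: list of true class indices; predicted_probs: list of
    probability vectors, aligned by position in the sample data set."""
    assert len(sample_labels) == len(predicted_probs)
    losses = []
    for true_cls, probs in zip(sample_labels, predicted_probs):
        # Cross-entropy of the true class under the predicted distribution,
        # clamped to avoid log(0).
        losses.append(-math.log(max(probs[true_cls], 1e-12)))
    return sum(losses) / len(losses)
```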
S104: when the predicted loss value reaches a preset convergence condition, determining the converged preset training model as the preset classification model.
Understandably, the convergence condition may be that the predicted loss value is smaller than a set threshold, or that after, e.g., 500 iterations the loss is small and no longer decreases, at which point training stops.
Specifically, after the predicted loss value is obtained, if it has not reached the preset convergence condition, the initial parameters of the preset training model are adjusted according to the predicted loss value, all sample data and sample labels are input again into the adjusted model, and the adjusted model is iteratively trained to obtain a new predicted loss value. Whenever the loss still fails to reach the convergence condition, the parameters are readjusted according to the loss. In this way, the predicted classification results are continually pulled toward the correct results and the accuracy of the preset training model keeps increasing, until the predicted loss value reaches the preset convergence condition, at which point the converged preset training model is determined to be the preset classification model.
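Both convergence conditions from S104 can be sketched in one training loop; the toy "training step" that halves the loss each round is an illustrative assumption standing in for a real parameter update, as are the threshold and patience parameters.

```python
# Sketch of the convergence rule: training stops when the predicted loss drops
# below a set threshold, or when it stops decreasing for `patience` rounds.

def train_until_converged(initial_loss, threshold, patience, max_iters=500):
    loss, best, stalled, iters = initial_loss, float("inf"), 0, 0
    while iters < max_iters:
        iters += 1
        loss = loss * 0.5          # stand-in for one parameter-update step
        if loss < threshold:
            return loss, iters     # condition 1: loss below threshold
        if loss < best - 1e-9:
            best, stalled = loss, 0
        else:
            stalled += 1
            if stalled >= patience:
                return loss, iters  # condition 2: loss no longer decreasing
    return loss, iters
```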
In this embodiment of the invention, the preset training model is trained iteratively on a large amount of sample data, and its overall loss is computed through a loss function, which realizes the determination of the predicted loss value. The initial parameters of the model are then adjusted according to the predicted loss value until the model converges, which realizes the determination of the preset classification model and ensures its accuracy.
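The loop described above — predict, compare labels by position, compute a loss value, adjust parameters, and stop on a convergence condition — can be sketched as follows. The one-parameter `ToyClassifier`, the loss threshold of 0.05, and the learning rate are illustrative assumptions, not the patent's actual model:

```python
import math

def cross_entropy(probs, label, eps=1e-12):
    # loss between a predicted class distribution and the true class index
    return -math.log(max(probs[label], eps))

class ToyClassifier:
    """Stand-in for the preset training model, with one 'initial parameter'."""
    def __init__(self, lr=0.5):
        self.weight = 0.0
        self.lr = lr

    def predict(self, x):
        p = 1.0 / (1.0 + math.exp(-self.weight * x))  # sigmoid
        return [1.0 - p, p]                           # probabilities for classes 0 and 1

    def adjust(self, samples, labels):
        # gradient step on the logistic loss: the "parameter readjustment"
        g = sum((self.predict(x)[1] - y) * x for x, y in zip(samples, labels)) / len(samples)
        self.weight -= self.lr * g

def train_until_converged(model, samples, labels, threshold=0.05, max_iters=500):
    loss = float("inf")
    for _ in range(max_iters):
        # compare each prediction label with the same-position sample label
        losses = [cross_entropy(model.predict(x), y) for x, y in zip(samples, labels)]
        loss = sum(losses) / len(losses)              # overall predicted loss value
        if loss < threshold:                          # preset convergence condition
            break
        model.adjust(samples, labels)
    return model, loss
```

On a linearly separable toy set the loss drops below the threshold well within the iteration budget, after which the converged model would be kept as the classification model.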
In one embodiment, before step S20, that is, before the moire removal model is obtained, the method includes:
s201, acquiring a picture data set, wherein the picture data set comprises at least one pair of moire pictures and non-moire pictures.
A moire picture is understood to mean a picture containing moire patterns, such as a historical picture used for information entry in financial or insurance business. A non-moire picture is the same picture without moire, obtained by removing the moire from the moire picture by other means. At least one pair of a moire picture and a non-moire picture is retrieved from the database, and the picture data set is then constructed from all the retrieved moire and non-moire pictures.
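Assembling the paired data set described above can be sketched as follows; the record format (path tuples retrieved from a database) and all field names are illustrative assumptions:

```python
from dataclasses import dataclass

@dataclass
class PicturePair:
    moire_path: str   # picture containing moire
    clean_path: str   # its de-moired (non-moire) counterpart

def build_picture_dataset(records):
    # records: (moire_path, clean_path) tuples retrieved from the database;
    # keep only complete pairs, since training needs both halves
    return [PicturePair(m, c) for m, c in records if m and c]
```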
S202, acquiring a generative adversarial network, and inputting each moire picture into the generative adversarial network to obtain a generated picture corresponding to each moire picture.
Specifically, a generative adversarial network is acquired, and each moire picture — that is, only the original image containing the moire — is input into it. The generator of the network performs picture processing, i.e. moire removal, on the original image, thereby producing a generated picture corresponding to each moire picture. For example, in a claims scene, photographed inpatient records and examination reports are input into the generative adversarial network and de-moired, so that generated pictures corresponding to those photographs are obtained. A generated picture is a picture similar to the original.
S203, inputting all the moire pictures, the non-moire pictures and the generated pictures into a preset removal model, and performing iterative training on the preset removal model through a gradient back-propagation algorithm to obtain the moire removal model.
Specifically, a preset removal model is obtained, and all the moire pictures, non-moire pictures and generated pictures are input into it. The model is then trained iteratively through a gradient back-propagation algorithm: the preset removal model performs moire removal on each moire picture and each generated picture, producing a first picture for each generated picture and a second picture for each moire picture, and a first loss value and a second loss value are calculated between the first picture and the non-moire picture and between the second picture and the non-moire picture, respectively. The preset loss function of the model is then formed from the first and second loss values. Next, the parameters are adjusted by gradient back-propagation: the chain rule is used to compute the partial derivative of the preset loss function with respect to each parameter, calculating the local gradient of each node layer by layer from the output layer. Finally, the parameters are updated from these partial derivatives with a gradient descent algorithm until the preset removal model reaches the convergence condition, and the converged model is determined as the moire removal model.
In this embodiment of the invention, the loss function is built from the loss values between the first picture, the second picture and the non-moire picture, and the preset removal model is tuned through gradient back-propagation. This realizes the iterative training of the preset removal model, ensures the accuracy of the moire removal model, and improves the training speed.
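One training step with the two losses above can be sketched on toy one-dimensional "pictures"; the single-gain removal model, the use of L1 losses summed into the preset loss, and the learning rate are all illustrative assumptions rather than the patent's actual network:

```python
def l1_loss(a, b):
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)

def sign(x):
    return (x > 0) - (x < 0)

class ToyRemovalModel:
    """Stand-in for the preset removal model: a single gain parameter."""
    def __init__(self):
        self.gain = 1.0
    def forward(self, img):
        return [self.gain * v for v in img]

def train_step(model, moire_img, generated_img, clean_img, lr=0.1):
    first = model.forward(generated_img)            # first picture (from the GAN output)
    second = model.forward(moire_img)               # second picture (from the raw moire image)
    first_loss = l1_loss(first, clean_img)          # vs. the non-moire picture
    second_loss = l1_loss(second, clean_img)
    loss = first_loss + second_loss                 # preset loss (sum, as one assumption)
    # chain-rule partial derivative of the L1 losses w.r.t. the gain parameter
    g1 = sum(sign(f - c) * g for f, c, g in zip(first, clean_img, generated_img)) / len(first)
    g2 = sum(sign(s - c) * m for s, c, m in zip(second, clean_img, moire_img)) / len(second)
    model.gain -= lr * (g1 + g2)                    # gradient-descent update
    return loss
```

Repeated calls drive the loss down, mirroring the iterate-until-convergence training described above.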
In an embodiment, as shown in fig. 3, in step S40, each piece of picture content is structured to obtain structured content corresponding to each piece of picture content, which includes:
s401, carrying out semantic analysis on the text contents corresponding to the coordinate information to obtain a semantic analysis result.
And S402, ordering the picture content corresponding to each text content based on the semantic analysis result to obtain the structured content.
The picture content comprises at least one text content and coordinate information corresponding to each text content.
Text content is understood to mean the content of each line in the target picture; the coordinate information refers to the position of the text box corresponding to each line of text.
Specifically, after the picture contents are obtained, the at least one text content of each picture content and the coordinate information of each text content are read, and semantic analysis is performed between the text contents corresponding to the coordinate information. That is, the coordinate information is used to locate and extract the first text content (the first row) and the last text content (the last row) of each picture content. An association degree is then calculated between the last text content of one picture and the first text content of another: features are extracted from both text contents (for example by one-hot encoding them), the Euclidean distance between the features is calculated, and the semantic similarity between the encoding vectors is computed, yielding an association value for each pair of target pictures. All association values are compared and the largest is screened out, which gives the semantic analysis result. The picture contents are then ordered on this basis: for the pair with the largest association value, the picture whose last text content matched is placed before the picture whose first text content matched, and repeating this for all pictures orders all the picture contents and yields the structured content. The picture content containing the picture title is set as the first page, and the subsequent picture contents are ordered after it.
For example, in a claims scene, when inpatient medical records are recognized, the picture contents are numerous and unordered. By calculating the association degree between the last text content of one picture and the first text content of another and placing the two picture contents with the largest association value next to each other, the inpatient records can be sorted into the correct order without manual sorting by staff.
In this embodiment of the invention, semantic analysis of the text contents realizes the calculation of the association degrees between them, and thus the determination of the semantic analysis result. Ordering the picture contents on this basis sorts them automatically and realizes the acquisition of the structured content.
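The page-ordering idea above can be sketched as a greedy chain: start from the page containing the title, then repeatedly attach the page whose first line best matches the current page's last line. The word-overlap `similarity` here is a crude stand-in for the embedding/Euclidean-distance similarity described in the text, and the title page index is an assumption:

```python
def similarity(a, b):
    # crude word-overlap (Jaccard) similarity, standing in for the
    # feature-based semantic similarity described above
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / max(len(sa | sb), 1)

def order_pages(pages, title_page_index=0):
    """pages: list of (first_line, last_line) per picture content.
    Greedily chain each page to the one whose first line has the largest
    association value with its last line."""
    remaining = set(range(len(pages)))
    order = [title_page_index]          # page with the picture title goes first
    remaining.discard(title_page_index)
    while remaining:
        last_line = pages[order[-1]][1]
        best = max(remaining, key=lambda i: similarity(last_line, pages[i][0]))
        order.append(best)              # largest association value wins
        remaining.discard(best)
    return order
```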
In one embodiment, as shown in fig. 4, in step S50, text error correction is performed on each of the structured contents to obtain corrected contents corresponding to each of the structured contents, including:
S501, extracting features of the characters in the structured content to obtain character features corresponding to the characters.
Specifically, after the structured content is obtained, features are extracted from its characters: the characters are convolved through a convolution network to obtain convolution features, multi-scale features are then extracted from the convolution features, and the extracted multi-scale features are fused, yielding the character feature corresponding to each character. Alternatively, all the structured content is input into a feature extraction model, which extracts the features of the characters directly and thereby obtains the character features.
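The convolve-then-fuse step can be sketched on a one-dimensional signal of character codes; the kernels and the mean/max "fusion" are illustrative assumptions, not the patent's network:

```python
def conv1d(seq, kernel):
    # valid 1-D convolution over a numeric sequence (toy stand-in for the
    # convolution network used for character feature extraction)
    k = len(kernel)
    return [sum(seq[i + j] * kernel[j] for j in range(k)) for i in range(len(seq) - k + 1)]

def char_features(text, kernels=((1.0, -1.0), (0.5, 0.5))):
    codes = [float(ord(c)) for c in text]        # raw character codes as the input signal
    maps = [conv1d(codes, k) for k in kernels]   # one feature map per kernel
    feats = []
    for m in maps:                               # "fusion": mean and max of each map
        feats += [sum(m) / len(m), max(m)]
    return feats
```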
S502, performing error prediction on all the character features to obtain at least one error character.
S503, performing character shape and pronunciation similarity calculation on the error characters through preset dictionary information to obtain at least one candidate character corresponding to each error character.
Specifically, a preset error correction model is obtained and all the character features are input into it. The embedding layer of the model embeds the character features — that is, one-hot encodes them — to generate content encodings. All content encodings corresponding to the character features are then quantized to obtain a to-be-corrected vector for the character content. MASK-based error correction, i.e. MASK prediction, is performed on this vector, yielding at least one error character and the position of each error character in the structured content. Further, preset dictionary information is obtained, and shape and pronunciation similarity is calculated for the error characters through the near-shape confusion set and the pinyin confusion set in the dictionary information: the similarity between each character in the near-shape confusion set and the error character is calculated, as is the similarity between the pinyin of each character in the pinyin confusion set and the pinyin of the error character. Every preset character whose similarity exceeds a preset similarity threshold is determined as a candidate character, so that at least one candidate character corresponding to each error character is screened out of the preset dictionary information.
In another embodiment, the error correction model may follow an Encoder-Decoder design: the Encoder uses a BiGRU to encode the Chinese text at the character level, and the Decoder uses an LSTM. When decoding, the Decoder receives not only the feature vector of the previous character but also a context-bearing feature vector aggregated from the Encoder's character features through an attention mechanism, thereby achieving text error correction based on semantic information. The feature vector produced by the Decoder at each time step t is projected into the vocabulary space through a linear transformation to generate candidate characters, and is also used by a gating mechanism built on a Softmax layer to decide whether the Decoder's output at the current step should directly copy the original text. Furthermore, when the gating mechanism does not choose direct copying and the character to be predicted appears in a preset confusion set, the Decoder selects the candidate character from the confusion set rather than from the entire dictionary vector space.
S504, screening all the candidate characters to obtain a target character corresponding to each error character, and replacing the error characters with the target characters to obtain the corrected content.
Specifically, all candidate characters are screened by weighting the shape similarity and pronunciation similarity of each candidate: a first weight is multiplied by the shape similarity, a second weight by the pronunciation similarity, and the two products are added, giving the character similarity value of each candidate. The target character for each error character is then selected by comparing the similarity values of all candidates corresponding to that error character. Finally, the error characters are replaced at their recorded positions — every error character in the structured content is replaced with its target character — and the corrected content is obtained.
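A minimal sketch of this weighted screening and replacement follows; the weight values (0.6/0.4), the pre-computed similarity scores, and the `corrections` mapping of error positions to candidate lists are all illustrative assumptions:

```python
def score_candidate(shape_sim, pinyin_sim, w_shape=0.6, w_pinyin=0.4):
    # first weight x shape similarity + second weight x pronunciation similarity
    return w_shape * shape_sim + w_pinyin * pinyin_sim

def pick_target(candidates):
    """candidates: (char, shape_sim, pinyin_sim) triples for one error character.
    Returns the candidate with the largest character similarity value."""
    return max(candidates, key=lambda c: score_candidate(c[1], c[2]))[0]

def correct(text, corrections):
    # replace each error character at its recorded position with its target
    chars = list(text)
    for pos, candidates in corrections.items():
        chars[pos] = pick_target(candidates)
    return "".join(chars)
```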
In this embodiment of the invention, extracting features from the characters in the structured content realizes the determination of the character features; error prediction on those features realizes the determination of the error characters; calculating shape and pronunciation similarity through the preset dictionary information obtains the candidate characters; and screening all candidate characters determines the target characters, with which the error characters are replaced to obtain the corrected content.
In one embodiment, after step S60, that is, after extracting information from each correction content to obtain the target content, the method includes:
S601, acquiring a preset information template, and performing content verification on the target content through the preset information template to obtain a text similarity value.
S602, when the text similarity value is greater than or equal to a preset threshold value, confirming that the target content meets preset requirements.
And S603, when the text similarity value is smaller than a preset threshold value, confirming that the target content does not meet the preset requirement.
Understandably, the text similarity value refers to a similarity between the preset information template and the target content.
Specifically, a preset information template is obtained from the database, the template having been matched by the picture title. Content verification is then performed on the target content through the template: similarity is calculated between the target content and the template content for the same field — the corresponding content features are extracted and the similarity between them is computed — giving the text similarity value. When the text similarity value is greater than or equal to the preset threshold, the target content is confirmed to meet the preset requirement; when it is smaller than the preset threshold, the target content is confirmed not to meet it. For example, in an insurance claims scene, the target content is matched against the template content and the text similarity value is obtained by calculating the Euclidean distance, after which the same threshold comparison decides whether the requirement is met.
In this embodiment of the invention, content verification of the target content against the preset information template realizes the calculation of the text similarity value, and comparing that value with the preset threshold judges whether the uploaded picture meets the requirement, thereby ensuring the accuracy of uploaded pictures.
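The verification step can be sketched as a similarity-plus-threshold check. Cosine similarity over word counts is a simple stand-in for the feature extraction and distance calculation described above, and the 0.8 threshold is an illustrative assumption:

```python
import math
from collections import Counter

def text_similarity(target, template):
    # cosine similarity over word counts (stand-in for feature-based similarity)
    a, b = Counter(target.split()), Counter(template.split())
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def meets_requirements(target, template, threshold=0.8):
    # >= threshold: target content meets the preset requirement
    return text_similarity(target, template) >= threshold
```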
It should be understood that the sequence numbers of the steps in the foregoing embodiments do not imply an order of execution; the execution order of the processes should be determined by their functions and internal logic, and should not limit the implementation of the embodiments of the present invention in any way.
In an embodiment, an information input device is provided, and the information input device corresponds one-to-one with the information input method in the embodiments above. As shown in fig. 5, the information input device includes a picture screening module 11, a moire removing module 12, a content identifying module 13, a structuring processing module 14, a text error correcting module 15 and an information extracting module 16. The functional modules are described in detail as follows:
the picture screening module 11 is configured to obtain at least one picture to be processed, and classify each picture to be processed by using a preset classification model to obtain a classification picture corresponding to each classification result;
The moire removing module 12 is configured to obtain a moire removing model, and perform moire removing on each of the classified pictures through the moire removing model to obtain a target picture;
the content identification module 13 is configured to identify content of each target picture, so as to obtain picture content corresponding to each target picture;
a structuring processing module 14, configured to perform structuring processing on each of the picture contents, so as to obtain structured contents corresponding to each of the picture contents;
the text error correction module 15 is configured to perform text error correction on each of the structured contents, so as to obtain corrected contents corresponding to each of the structured contents;
the information extraction module 16 is configured to extract information from each correction content to obtain a target content, and determine that information input is successful when the target content meets a preset requirement.
In an embodiment, the picture content includes at least one text content and coordinate information corresponding to each of the text content; the structured processing module 14 comprises:
the semantic analysis unit is used for carrying out semantic analysis on the text contents corresponding to the coordinate information to obtain a semantic analysis result;
And the content ordering unit is used for ordering the picture content corresponding to each text content based on the semantic analysis result to obtain the structured content.
In one embodiment, the text error correction module 15 includes:
the feature extraction unit is used for extracting features of characters in the structured content to obtain character features corresponding to the characters;
the error prediction unit is used for carrying out error prediction on all the character features to obtain at least one error character;
the similarity calculation unit is used for calculating the word-tone similarity of the error words through preset dictionary information to obtain at least one candidate word corresponding to each error word;
and the character replacement unit is used for screening all the candidate characters to obtain target characters corresponding to each error character, and replacing the error characters with the target characters to obtain correction contents.
In one embodiment, the information extraction module 16 includes:
the content verification unit is used for acquiring a preset information template, and performing content verification on the target content through the preset information template to obtain a text similarity value;
The requirement-met unit is used for confirming that the target content meets the preset requirement when the text similarity value is greater than or equal to a preset threshold;
and the requirement-not-met unit is used for confirming that the target content does not meet the preset requirement when the text similarity value is smaller than the preset threshold.
In one embodiment, the moire removal module 12 comprises:
a picture acquisition unit for acquiring a picture data set including at least one pair of moire pictures and non-moire pictures;
the picture generation unit is used for acquiring a generative adversarial network, and inputting each moire picture into the generative adversarial network to obtain a generated picture corresponding to each moire picture;
the iterative training unit is used for inputting all the moire pictures, the non-moire pictures and the generated pictures into a preset removal model, and carrying out iterative training on the preset removal model through a gradient back propagation algorithm to obtain a moire removal model.
In an embodiment, the picture screening module 11 includes:
a sample acquisition unit configured to acquire a sample data set including at least one sample data and a sample tag corresponding to each of the sample data;
The label prediction unit is used for acquiring a preset training model, classifying all the sample data through the preset training model, and obtaining a prediction label;
the loss prediction unit is used for determining a predicted loss value of the preset training model according to the sample label and the predicted label corresponding to the same sample data;
and the model convergence unit is used for determining a preset classification model from the converged preset training model when the predicted loss value reaches a preset convergence condition.
For specific limitations of the information input device, reference may be made to the limitations of the information input method above, which are not repeated here. The respective modules in the above-described information entry apparatus may be implemented in whole or in part by software, hardware, or combinations thereof. The above modules may be embedded in hardware, may be independent of the processor in the computer device, or may be stored as software in the memory of the computer device, so that the processor can call and execute the operations corresponding to the above modules.
In one embodiment, a computer device is provided, which may be a server, the internal structure of which may be as shown in fig. 6. The computer device includes a processor, a memory, a network interface, and a database connected by a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, computer programs, and a database. The internal memory provides an environment for the operation of the operating system and computer programs in the non-volatile storage media. The database of the computer device is used for storing data used in the information input method in the above embodiment. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program is executed by a processor to implement an information entry method.
In one embodiment, a computer device is provided that includes a memory, a processor, and a computer program stored in the memory and executable on the processor, the processor implementing the above-described information entry method when executing the computer program.
In one embodiment, a computer readable storage medium is provided, which stores a computer program which, when executed by a processor, implements the above-described information entry method.
Those skilled in the art will appreciate that implementing all or part of the above described methods may be accomplished by way of a computer program stored on a non-transitory computer readable storage medium, which when executed, may comprise the steps of the embodiments of the methods described above. Any reference to memory, storage, database, or other medium used in the various embodiments provided herein may include non-volatile and/or volatile memory. The nonvolatile memory can include Read Only Memory (ROM), programmable ROM (PROM), electrically Programmable ROM (EPROM), electrically Erasable Programmable ROM (EEPROM), or flash memory. Volatile memory can include Random Access Memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as Static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double Data Rate SDRAM (DDRSDRAM), enhanced SDRAM (ESDRAM), synchronous Link DRAM (SLDRAM), memory bus direct RAM (RDRAM), direct memory bus dynamic RAM (DRDRAM), and memory bus dynamic RAM (RDRAM), among others.
It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above-described division of the functional units and modules is illustrated, and in practical application, the above-described functional distribution may be performed by different functional units and modules according to needs, i.e. the internal structure of the apparatus is divided into different functional units or modules to perform all or part of the above-described functions.
The above embodiments are only for illustrating the technical solution of the present invention, and not for limiting the same; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present invention, and are intended to be included in the scope of the present invention.

Claims (10)

1. An information entry method, comprising:
acquiring at least one picture to be processed, and classifying each picture to be processed through a preset classification model to obtain a classification picture corresponding to each classification result;
obtaining a moire removal model, and performing moire removal on each classified picture through the moire removal model to obtain a target picture;
performing content identification on each target picture to obtain picture content corresponding to each target picture;
carrying out structuring treatment on each picture content to obtain structured content corresponding to each picture content;
performing text error correction on each structured content to obtain correction content corresponding to each structured content;
and extracting information from each correction content to obtain target content, and determining that the information is successfully input when the target content meets the preset requirement.
2. The information entry method according to claim 1, wherein the picture content includes at least one text content and coordinate information corresponding to each of the text content;
the structuring processing is performed on each picture content to obtain structured content corresponding to each picture content, including:
carrying out semantic analysis on the text contents corresponding to the coordinate information to obtain a semantic analysis result;
and ordering the picture content corresponding to each text content based on the semantic analysis result to obtain the structured content.
3. The information entry method of claim 1, wherein performing text error correction on each of the structured contents to obtain corrected contents corresponding to each of the structured contents, comprises:
extracting characteristics of characters in the structured content to obtain character characteristics corresponding to the characters;
performing error prediction on all the character features to obtain at least one error character;
performing character shape and pronunciation similarity calculation on the error characters through preset dictionary information to obtain at least one candidate character corresponding to each error character;
and screening all the candidate characters to obtain target characters corresponding to each error character, and replacing the error characters with the target characters to obtain correction contents.
4. The information entry method as defined in claim 1, wherein the information extraction of each correction content to obtain the target content includes:
acquiring a preset information template, and performing content verification on the target content through the preset information template to obtain a text similarity value;
when the text similarity value is larger than or equal to a preset threshold value, confirming that the target content meets preset requirements;
And when the text similarity value is smaller than a preset threshold value, confirming that the target content does not meet the preset requirement.
5. The information entry method according to claim 1, wherein before the obtaining the moire removal model, comprising:
obtaining a picture data set, wherein the picture data set comprises at least one pair of moire pictures and non-moire pictures;
acquiring a generative adversarial network, and inputting each moire picture into the generative adversarial network to obtain a generated picture corresponding to each moire picture;
inputting all the moire pictures, the non-moire pictures and the generated pictures into a preset removal model, and performing iterative training on the preset removal model through a gradient back propagation algorithm to obtain a moire removal model.
6. The information entry method according to claim 1, wherein before classifying each of the pictures to be processed by a preset classification model, comprising:
obtaining a sample data set, wherein the sample data set comprises at least one sample data and a sample label corresponding to each sample data;
acquiring a preset training model, and classifying all the sample data through the preset training model to obtain a prediction label;
Determining a predicted loss value of the preset training model according to a sample label and a predicted label corresponding to the same sample data;
and when the predicted loss value reaches a preset convergence condition, determining a preset training model after convergence as a preset classification model.
7. An information entry device, comprising:
the image screening module is used for acquiring at least one image to be processed, classifying each image to be processed through a preset classification model, and obtaining classified images corresponding to each classification result;
the moire removal module is used for obtaining a moire removal model, and performing moire removal on each classified picture through the moire removal model to obtain a target picture;
the content identification module is used for carrying out content identification on each target picture to obtain picture content corresponding to each target picture;
the structuring processing module is used for carrying out structuring processing on each picture content to obtain structured content corresponding to each picture content;
the text error correction module is used for performing text error correction on each structured content to obtain correction content corresponding to each structured content;
And the information extraction module is used for extracting information from each correction content to obtain target content, and determining that the information input is successful when the target content meets the preset requirement.
8. The information entry device of claim 7, wherein the picture screening module comprises:
a sample acquisition unit configured to acquire a sample data set, wherein the sample data set comprises at least one sample data and a sample label corresponding to each sample data;
a label prediction unit configured to acquire a preset training model, and to classify all the sample data through the preset training model to obtain a predicted label;
a loss prediction unit configured to determine a predicted loss value of the preset training model according to the sample label and the predicted label corresponding to the same sample data;
and a model convergence unit configured to determine the converged preset training model as the preset classification model when the predicted loss value reaches a preset convergence condition.
9. A computer device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, characterized in that the processor implements the information entry method according to any one of claims 1 to 7 when executing the computer program.
10. A computer readable storage medium storing a computer program, characterized in that the computer program when executed by a processor implements the information entry method according to any one of claims 1 to 7.
CN202311319172.0A 2023-10-10 2023-10-10 Information input method, device, equipment and storage medium Pending CN117351501A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311319172.0A CN117351501A (en) 2023-10-10 2023-10-10 Information input method, device, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311319172.0A CN117351501A (en) 2023-10-10 2023-10-10 Information input method, device, equipment and storage medium

Publications (1)

Publication Number Publication Date
CN117351501A (en) 2024-01-05

Family

ID=89358931

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311319172.0A Pending CN117351501A (en) 2023-10-10 2023-10-10 Information input method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN117351501A (en)

Similar Documents

Publication Publication Date Title
CN111859916B (en) Method, device, equipment and medium for extracting key words of ancient poems and generating poems
CN113657354B (en) Answer sheet identification method and system based on deep learning
CN114596566B (en) Text recognition method and related device
CN110968689A (en) Training method of criminal name and law bar prediction model and criminal name and law bar prediction method
CN111666932B (en) Document auditing method, device, computer equipment and storage medium
CN114357174B (en) Code classification system and method based on OCR and machine learning
WO2022035942A1 (en) Systems and methods for machine learning-based document classification
CN115862040A (en) Text error correction method and device, computer equipment and readable storage medium
CN110889341A (en) Form image recognition method and device based on AI (Artificial Intelligence), computer equipment and storage medium
CN113806613A (en) Training image set generation method and device, computer equipment and storage medium
CN112307749A (en) Text error detection method and device, computer equipment and storage medium
CN115759758A (en) Risk assessment method, device, equipment and storage medium
US20200226162A1 (en) Automated Reporting System
CN115984886A (en) Table information extraction method, device, equipment and storage medium
US20230134218A1 (en) Continuous learning for document processing and analysis
CN115880702A (en) Data processing method, device, equipment, program product and storage medium
CN117351501A (en) Information input method, device, equipment and storage medium
CN115294593A (en) Image information extraction method and device, computer equipment and storage medium
CN115688166A (en) Information desensitization processing method and device, computer equipment and readable storage medium
CN114283429A (en) Material work order data processing method, device, equipment and storage medium
CN117133000A (en) Signature verification method, device, equipment and storage medium
CN113837169B (en) Text data processing method, device, computer equipment and storage medium
CN117058679A (en) Text error correction processing method, device, equipment and storage medium
RU2764705C1 (en) Extraction of multiple documents from a single image
CN112380860B (en) Sentence vector processing method, sentence matching device, sentence vector processing equipment and sentence matching medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination