CN116682116A - Text tampering identification method, apparatus, computer device and readable storage medium - Google Patents

Text tampering identification method, apparatus, computer device and readable storage medium

Info

Publication number
CN116682116A
Authority
CN
China
Prior art keywords
image
text
falsification
identified
training
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310675608.3A
Other languages
Chinese (zh)
Inventor
余琦
刘邦贵
丁拥科
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhongan Online P&c Insurance Co ltd
Original Assignee
Zhongan Online P&c Insurance Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhongan Online P&c Insurance Co ltd
Priority to CN202310675608.3A
Publication of CN116682116A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition
    • G06V30/18 Extraction of features or characteristics of the image
    • G06V30/1801 Detecting partial patterns, e.g. edges or contours, or configurations, e.g. loops, corners, strokes or intersections
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/64 Protecting data integrity, e.g. using checksums, certificates or signatures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition
    • G06V30/19 Recognition using electronic means
    • G06V30/191 Design or setup of recognition systems or techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
    • G06V30/19147 Obtaining sets of training patterns; Bootstrap methods, e.g. bagging or boosting

Abstract

The application relates to a text tampering identification method, a text tampering identification apparatus, a computer device and a readable storage medium. The method comprises: acquiring an image to be identified and at least one prior condition corresponding to the image to be identified, wherein the prior condition is determined based on a text recognition result of the image to be identified; and inputting the image to be identified and the at least one prior condition into a pre-trained text falsification recognition model to perform text falsification recognition, thereby obtaining a text falsification recognition result of the image to be identified. With this method, one or more prior conditions guide the text falsification recognition model when it examines the image to be identified, which improves both the efficiency and the precision of text falsification recognition.

Description

Text tampering identification method, apparatus, computer device and readable storage medium
Technical Field
The application relates to the technical field of artificial intelligence, in particular to computer vision and deep learning, and more specifically to a text tampering identification method, a text tampering identification apparatus, a computer device and a readable storage medium.
Background
Driven by new technologies such as big data, cloud computing and artificial intelligence, internet financial business has expanded sharply, and innovative financial products continue to emerge. For example, OCR (optical character recognition) is used to perform text recognition on images submitted by users (invoices, certificates, etc.), converting the image information into input that a computer can use.
Existing text falsification recognition models are generally single-task models. When falsification recognition is performed on a plurality of characters in an abnormal (for example, handwriting-tampered) text region, the model has to perform text falsification recognition on each character separately in order to recognize multi-character falsification with reasonable accuracy. Performing multi-character text falsification recognition in this way results in low recognition efficiency and high recognition cost.
Disclosure of Invention
In view of the foregoing, it is desirable to provide a text falsification recognition method, apparatus, computer device, and readable storage medium that can improve text falsification recognition efficiency and recognition accuracy.
In order to solve the technical problem, in a first aspect, a text tampering identification method is provided, including:
acquiring an image to be identified and at least one prior condition corresponding to the image to be identified, wherein the prior condition is determined based on a text recognition result of the image to be identified;
inputting the image to be identified and the at least one prior condition corresponding to the image to be identified into a pre-trained text falsification recognition model to perform text falsification recognition, and obtaining a text falsification recognition result of the image to be identified.
In one embodiment, obtaining the at least one prior condition corresponding to the image to be identified includes:
performing text recognition on the image to be recognized to obtain text information and text positions of the image to be recognized;
extracting the pixel position of each character in the text positions of the image to be recognized and the text information of each character in the text information of the image to be recognized;
and matching the pixel position of each character with the text information of each character to obtain the at least one prior condition corresponding to the image to be recognized.
In one embodiment, inputting the image to be identified and the at least one prior condition corresponding to the image to be identified into the pre-trained text falsification recognition model to perform text falsification recognition, and obtaining the text falsification recognition result of the image to be identified, includes:
inputting the image to be identified and the at least one prior condition corresponding to the image to be identified into the pre-trained text falsification recognition model;
extracting image features of the image to be identified and prior features of the prior condition, respectively, by using the pre-trained text falsification recognition model;
stacking the image features and the prior features to obtain stacked features;
and processing the stacked features to output a classification vector, and determining the text falsification recognition result according to the classification vector.
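The stacking and classification steps above can be sketched as follows. This is a minimal illustration, not the patented model: it assumes that "feature stacking" means concatenating the two feature vectors, and that the classification vector is produced by a single linear layer followed by Softmax. The function names, feature values, weights and class order (index 0 untampered, index 1 tampered) are all hypothetical.

```python
import math
from typing import List, Sequence


def stack_features(image_feat: Sequence[float], prior_feat: Sequence[float]) -> List[float]:
    """Stack the two feature vectors by concatenation (an assumed reading of 'stacking')."""
    return list(image_feat) + list(prior_feat)


def softmax(logits: Sequence[float]) -> List[float]:
    """Turn raw scores into a classification vector that sums to 1."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(stacked: Sequence[float],
             weights: Sequence[Sequence[float]],
             bias: Sequence[float]) -> List[float]:
    """One linear layer followed by Softmax; a stand-in for the real network head."""
    logits = [sum(w * x for w, x in zip(row, stacked)) + b
              for row, b in zip(weights, bias)]
    return softmax(logits)


# Hypothetical feature values and weights, for illustration only.
image_feat = [0.2, 0.7]
prior_feat = [0.0, 1.0]
weights = [[1.0, 0.0, 0.0, 0.0],
           [0.0, 1.0, 0.0, 1.0]]
bias = [0.0, 0.0]

stacked = stack_features(image_feat, prior_feat)
class_vector = classify(stacked, weights, bias)
# Assumed class order: index 0 = untampered, index 1 = tampered.
predicted = class_vector.index(max(class_vector))
```

In a real model the image features and prior features would come from learned network layers; the sketch only shows how a stacked vector can lead to a two-element classification vector.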
In one embodiment, the training process of the pre-trained text falsification recognition model includes:
acquiring part of training images in a training set and labels of each training image in the part of training images, wherein the labels of the training images represent the types of the training images, and the types of the training images comprise tampered training images and untampered training images;
taking a training image as input, and extracting image features of the training image through an extraction network layer of a text falsification recognition model to be trained; inputting the image features into a detection network layer of the text falsification recognition model to obtain a falsification recognition result of the training image, and outputting the falsification recognition result by using an output layer of the text falsification recognition model;
and calculating a loss function from the falsification recognition result and the label of the training image, optimizing parameters of the text falsification recognition model according to the calculated loss function result, and continuing training until a preset condition is met, at which point training ends.
In one embodiment, acquiring the training image set includes:
acquiring an original image, wherein the original image is an untampered image;
performing a data augmentation operation on the original image to obtain an augmented original image, and taking the augmented original image as an untampered training image;
and performing a tampering operation on the untampered training image to obtain a tampered training image, wherein the tampering operation comprises one or a combination of operations such as elastic transformation, image distortion, pixel offset and pixel pasting.
In one embodiment, the prior conditions are input into the text falsification recognition model in One-Hot encoded form together with the image to be recognized.
In one embodiment, determining the prior condition based on the text recognition result of the image to be recognized includes: performing text recognition on the image to be recognized by using OCR technology, and determining the prior condition corresponding to the image to be recognized.
In order to solve the above technical problem, in a second aspect, there is provided a text falsification recognition apparatus, the apparatus including:
the acquisition module is used for acquiring the image to be identified and at least one prior condition corresponding to the image to be identified, wherein the prior condition is determined based on the text recognition result of the image to be identified;
the processing module is used for inputting the image to be identified and the prior condition corresponding to the image to be identified into a pre-trained text falsification recognition model to perform falsification recognition;
and the output module is used for outputting a falsification recognition result of the image to be identified.
In order to solve the above technical problem, in a third aspect, there is provided a computer device including a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor implements the steps of the method of the first aspect when executing the computer program.
In order to solve the above technical problem, in a fourth aspect, the present application provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the method described in the first aspect.
The beneficial effects of the application are as follows. Compared with the prior art, the text falsification recognition method and apparatus determine a prior condition based on the text recognition result of the image to be recognized, and input at least one prior condition together with the image to be recognized into a pre-trained text falsification recognition model. The prior condition guides the model when it performs text falsification recognition on the image to be recognized, which improves the precision of text falsification recognition. In addition, because one or more prior conditions can be input together with the image to be recognized, multi-task classification can be realized, which improves the efficiency of text falsification recognition.
Drawings
FIG. 1 is an application environment diagram of a text falsification recognition method in one embodiment;
FIG. 2 is a flow diagram of a text falsification recognition method in one embodiment;
FIG. 3 is a flow chart of acquiring the prior conditions in step S201 in one embodiment;
FIG. 4 is a flow chart of a training process of a text falsification recognition model in another embodiment;
FIG. 5 is a flow diagram of a text falsification recognition method in one embodiment;
FIG. 6 is a schematic diagram of a text falsification recognition model in one embodiment;
FIG. 7 is a block diagram of a text falsification recognition apparatus in one embodiment;
FIG. 8 is an internal structural diagram of a computer device in one embodiment.
Detailed Description
The present application will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present application more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the application.
OCR technology can recognize images containing text and convert the text therein into retrievable information. That is, the text regions in an image are located and recognized as text information that a computer can represent. OCR technology is commonly applied in fields such as public opinion monitoring, document retrieval, subtitle recognition, screenshot recognition, network picture recognition, document data retrieval, paperless office work, and manuscript editing and proofreading. Integrating OCR technology into a company's office process not only improves office efficiency but also facilitates electronic storage and management, including later consulting, searching, editing and administration.
At present, with the development of new technologies such as big data, cloud computing and artificial intelligence, abnormal (for example, handwriting-tampered) text regions can be detected by using a target detection or semantic segmentation model. In carrying out the present invention, the inventors found that at least the following problems exist in the prior art. First, the related art uses a plurality of single-task models to perform falsification recognition on a plurality of characters; the overall flow of this approach is bloated, generalized features of character falsification are difficult to find, and the models overfit easily, which affects recognition accuracy. Second, handwriting falsification methods vary widely and often leave the text background intact, so the recall of a coarse-grained detection model is insufficient. For example, a lawbreaker can finely alter handwritten text with ink before the image is captured in order to profit illegally. If a detection model is to recognize such finely tampered text, the data set used in early model training must meet higher requirements: a large number of tampered samples have to be designed and collected manually, and a good recognition effect is obtained only after the corresponding labeling is completed, which consumes considerable manpower and time.
Therefore, the application provides a text falsification recognition method, which can improve the text falsification recognition efficiency and recognition accuracy by inputting the priori condition determined based on the text recognition result of the image to be recognized and the image to be recognized into a pre-trained text falsification recognition model to perform text falsification recognition.
The text tampering identification method provided by the application can be applied to the application environment shown in FIG. 1, in which the terminal 10 communicates with the server 11 via a network. The method can be applied to the server 11, which may be an independent server or a server cluster. Specifically, the server 11 acquires the image to be recognized and at least one prior condition corresponding to the image to be recognized, wherein the prior condition is determined based on the text recognition result of the image to be recognized. The server 11 inputs the image to be identified and the at least one prior condition into a pre-trained text falsification recognition model to perform text falsification recognition, obtains a text falsification recognition result of the image to be identified, and displays the result on the terminal 10. With this method, the prior condition guides the text falsification recognition model when it examines the image to be identified, which improves both the efficiency and the precision of text falsification recognition.
The terminal 10 may be, but is not limited to, various electronic devices such as personal computers, notebook computers, smart phones, tablet computers, and the like.
Referring to FIG. 2, FIG. 2 is a flow chart of an embodiment of a text falsification recognition method according to the present invention. The method includes:
Step S201, acquiring an image to be identified and at least one prior condition corresponding to the image to be identified.
Specifically, an image to be identified and one or more prior conditions corresponding to the image to be identified are acquired. The image to be identified is an image containing text. For example, it may be captured by an image capture device, such as a document image containing text captured in an online authentication scenario, or it may be downloaded from the internet, uploaded by a user, or acquired in another way. The invention does not limit the source of the image to be identified.
Further, text herein may include the written characters of any country's language, punctuation marks, numbers, and the like. The invention does not limit the text type of the image to be identified.
In this embodiment, the prior condition is determined based on the text recognition result of the image to be recognized. Referring specifically to FIG. 3, FIG. 3 is a flow chart of acquiring the prior conditions in step S201 in one embodiment.
Step S301, performing text recognition on the image to be recognized to respectively obtain text information and text positions of the image to be recognized.
Optionally, text recognition can be performed on the image to be recognized by using OCR technology. Specifically, the image to be recognized is cut into lines, and the line text information and line text positions of the image to be recognized are extracted.
Existing OCR technology is well established. In the related art, when text recognition is performed with OCR, a plurality of small boxes of a certain width are formed along each detected text line, template matching is performed on the content of each small box, the index, content and confidence of each small box are output during recognition, and the text is then recognized and output according to this information.
Step S302, extracting pixel positions of each character in the text position of the image to be recognized and text information of each character in the text information of the image to be recognized.
Because OCR segments text at line granularity, the invention, in order to segment each character in the text precisely, further segments and extracts the characters of each line with a CRACT model after the line text information and line text positions of the image to be recognized are obtained. This yields the pixel position of each character in the text positions of the image to be recognized and the text information of each character in the text information of the image to be recognized.
Step S303, matching the pixel position of each character with the text information of each character to obtain at least one prior condition corresponding to the image to be recognized.
Specifically, the pixel position of each character and the text information of each character obtained above are matched to obtain the information of each character, which at this point comprises both the text information and the position information of that character. The matching may use the KMP algorithm, the Boyer-Moore algorithm, the Rabin-Karp algorithm, or the like. The information of each character is then converted into a vector matrix, which gives the prior condition corresponding to the image to be identified. The prior conditions can be set according to actual needs, and there may be one or more of them; the invention does not limit this.
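As a rough illustration of the matching step, the sketch below pairs the characters of one recognized line with their pixel boxes in left-to-right order. It is a simplified stand-in: it uses plain positional pairing rather than the KMP, Boyer-Moore or Rabin-Karp matching mentioned above, and the `CharInfo` type, the `(x, y, width, height)` box layout and the sample values are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Hypothetical box layout: (x, y, width, height) in pixels.
Box = Tuple[int, int, int, int]


@dataclass
class CharInfo:
    char: str  # text information of the character
    box: Box   # pixel position of the character


def match_chars(line_text: str, char_boxes: List[Box]) -> List[CharInfo]:
    """Pair each character of a recognized line with its pixel box, left to right."""
    if len(line_text) != len(char_boxes):
        raise ValueError("character count and box count differ")
    ordered = sorted(char_boxes, key=lambda b: b[0])  # order boxes by x coordinate
    return [CharInfo(c, b) for c, b in zip(line_text, ordered)]


# Boxes arrive in arbitrary order, as a character segmenter might return them.
infos = match_chars(
    "2023",
    [(30, 5, 10, 16), (0, 5, 10, 16), (10, 5, 10, 16), (20, 5, 10, 16)],
)
```

Each resulting record carries both the text and the position of one character, which is the information the prior condition is built from.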
Optionally, the prior conditions may be input into the text falsification recognition model in One-Hot encoded form together with the image to be recognized. One-Hot encoding is also known as one-bit effective encoding. In essence, N states are encoded with an N-bit state register: each state has its own register bit, and only one bit is valid at any time, that is, only one state is active. Here, each prior condition item can be converted into several One-Hot columns in which 0 and 1 directly express no and yes, so the final vector consists of the discrete values 0 and 1. This expands the features to a certain extent, addresses the difficulty classifiers have with discrete data, and improves the accuracy of text falsification recognition.
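The encoding described above can be illustrated with a short sketch. The helper name `one_hot` and the choice of the ten digit characters as the state set are assumptions made for the example.

```python
from typing import List


def one_hot(index: int, num_states: int) -> List[int]:
    """Encode one of num_states states as an N-bit vector with a single valid bit."""
    if not 0 <= index < num_states:
        raise ValueError("index out of range")
    return [1 if i == index else 0 for i in range(num_states)]


# Hypothetical state set: the ten digit characters.
states = list("0123456789")
encoded_five = one_hot(states.index("5"), len(states))
```

Exactly one element of the result is 1, so each state is represented without implying any ordering between states.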
Of course, the image to be recognized can also be cut into lines by means such as binarization, connected-domain analysis and projection analysis to obtain line character images containing the characters of each line. The pixel position and the text information of each character are then extracted from each line character image.
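A minimal sketch of line cutting by projection analysis is given below. It assumes the image has already been binarized into rows of 0 (background) and 1 (ink); the function name `cut_lines` and the sample image are illustrative only.

```python
from typing import List, Tuple


def cut_lines(binary: List[List[int]]) -> List[Tuple[int, int]]:
    """Return (start_row, end_row) ranges of text lines, with end_row exclusive.

    The horizontal projection counts ink pixels per row; a text line is a
    maximal run of consecutive rows whose count is above zero.
    """
    profile = [sum(row) for row in binary]
    lines = []
    start = None
    for y, count in enumerate(profile):
        if count > 0 and start is None:
            start = y                  # a text line begins
        elif count == 0 and start is not None:
            lines.append((start, y))   # the text line ends at a blank row
            start = None
    if start is not None:              # a line that touches the bottom edge
        lines.append((start, len(profile)))
    return lines


binary_image = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 1, 1],
]
line_ranges = cut_lines(binary_image)
```

Applying the same idea to column sums within one line range would split that line into individual characters.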
In an embodiment, size normalization may be applied to the obtained pixel position and text information of each character, which makes it easier to control the dimensions of the feature information when a convolutional neural network extracts features from the normalized character information.
Step S202, inputting the image to be identified and the at least one prior condition corresponding to the image to be identified into a pre-trained text falsification recognition model to perform text falsification recognition, and obtaining a text falsification recognition result of the image to be identified.
The method differs from the prior art in that a prior condition guides the model during text falsification recognition, the prior condition being a vector matrix containing the text information and position information of each character of the image to be identified. For example, in practical application, suppose the text information of the image to be identified consists of the 10 digital characters "0", "1", "2", "3", "4", "5", "6", "7", "8" and "9". These 10 characters are converted into a 1×10 vector matrix. If it is desired to recognize whether the digital character "5" has been tampered with, the element of the 1×10 vector matrix corresponding to the text position of "5" is set to 1, and the elements corresponding to all other text positions are set to 0. When the text falsification recognition model subsequently performs recognition, it only needs to examine the text position marked with 1 to judge whether the text information at that position has been tampered with. In the prior art, by contrast, the text position of the text information to be recognized must be found first, and only then can the text information at that position be checked. The method can therefore improve the efficiency of text falsification recognition.
In an embodiment, when there are multiple prior conditions, the text falsification recognition operation may be performed for each prior condition in order until the last prior condition has been processed. For example, when the text information of the image to be recognized consists of the 10 digital characters "0", "1", "2", "3", "4", "5", "6", "7", "8" and "9", these 10 characters are converted into a 1×10 vector matrix. If it is desired to recognize whether the digital characters "5" and "8" have been tampered with, the elements of the 1×10 vector matrix corresponding to the text positions of "5" and "8" are set to 1, and the elements corresponding to all other text positions are set to 0. During subsequent text falsification recognition, the text position of "5" (marked with 1) is examined first to judge whether the text information at that position has been tampered with, and then the text position of "8" (marked with 1) is examined in the same way.
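The 1×10 example can be reproduced in a few lines. The helper names `prior_mask` and `positions_to_check` are hypothetical; the sketch only shows how the marked positions are derived and visited in order, not how the model examines them.

```python
from typing import Iterable, List


def prior_mask(chars: List[str], targets: Iterable[str]) -> List[int]:
    """Build a 1 x N vector: 1 at positions of characters to check, 0 elsewhere."""
    wanted = set(targets)
    return [1 if c in wanted else 0 for c in chars]


def positions_to_check(mask: List[int]) -> List[int]:
    """List the marked positions in order, so each one can be examined in turn."""
    return [i for i, bit in enumerate(mask) if bit == 1]


chars = list("0123456789")
single = prior_mask(chars, ["5"])        # one prior condition
multi = prior_mask(chars, ["5", "8"])    # two prior conditions
check_order = positions_to_check(multi)  # position of "5" first, then "8"
```

Only the positions in `check_order` need to be judged, which is the source of the efficiency gain described above.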
In practical application, the text falsification recognition model performs text falsification recognition on the text in images uploaded by users and outputs a text falsification recognition result. When the result shows that the text in an uploaded image has been tampered with, the image is sent for manual review or rejected outright, which strengthens the risk control capability of online business scenarios.
Optionally, before step S202 is performed, the image to be identified may be processed so that its size matches the input size of the pre-trained text falsification recognition model, for example by reducing the size of the image to be identified.
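One simple way to bring an image to a fixed input size is nearest-neighbour resampling, sketched below on a grayscale image stored as a list of rows. The interpolation method and the target size are assumptions; the source does not specify how the size processing is done.

```python
from typing import List


def resize_nearest(image: List[List[int]], out_h: int, out_w: int) -> List[List[int]]:
    """Resize a grayscale image to out_h x out_w by nearest-neighbour sampling."""
    in_h, in_w = len(image), len(image[0])
    return [
        [image[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]


source = [
    [0, 1, 2, 3],
    [4, 5, 6, 7],
    [8, 9, 10, 11],
    [12, 13, 14, 15],
]
# Hypothetical model input size of 2 x 2, chosen only to keep the example small.
reduced = resize_nearest(source, 2, 2)
```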
Referring to fig. 4, in an embodiment, the training process of the pre-trained text falsification recognition model includes:
step S401, acquiring a part of training images in a training set and labels of each training image in the part of training images, wherein the labels of the training images represent the types of the training images, and the types of the training images comprise tampered training images and untampered training images.
In particular, the training image set contains a plurality of training images for training generation of the text falsification recognition model, the training images may be text-containing images in various image formats, for example, the training images may be text-containing images acquired by an image acquisition device, such as text-containing document images captured in an online authentication scene, or text-containing images downloaded from the internet, uploaded by a user, or otherwise acquired. The invention does not limit the source of the training image.
Optionally, the training image may be labeled by manual labeling or automatic labeling by a machine, where the label may represent the type of the training image. For example, the label may be in the form of (0, 1) or (1, 0), indicating that the training image is an untampered training image when the first position is "1" and the second position is "0"; when the first position is "0" and the second position is "1", this indicates that the training image is a tampered training image. Of course, the form of the label is not limited to this, as long as it can distinguish between a tampered training image and an untampered training image.
Step S402, taking a training image as input, and extracting image features of the training image through an extraction network layer of a text falsification recognition model to be trained; inputting the image characteristics into a detection network layer of a text falsification recognition model, and obtaining falsification recognition results of training images; and outputting the falsification recognition result by using an output layer of the text falsification recognition model.
Specifically, after the part of training images and the label of each training image in the part of training images are acquired, the acquired training images are input into a text falsification recognition model to be trained, and the text falsification recognition model to be trained can be a text falsification recognition model based on an OCR technology. Image features of the training image are extracted through a plurality of convolution kernels and pooling layers in an extraction network layer in the text falsification recognition model, wherein the image features can be image texture features, image steganography features and the like. The image texture features are used to characterize texture information of the image, and the steganographic features of the image are used to characterize digital image tampering information.
The detection network layer of the text falsification recognition model identifies the image features of the training image to obtain a falsification recognition result of the training image, and the output layer of the text falsification recognition model outputs this result, which indicates whether the training image is a tampered image. The detection network layer may be constructed based on a deep neural network, such as a CNN, backbone, YOLO, fast-RCNN, FRCNN or MaskRCNN network; preferably, the detection network layer in this embodiment comprises a backbone network combined with Softmax.
Preferably, the form of the falsification recognition result output by the text falsification recognition model can be consistent with the form of the label of the training image, so that the text falsification recognition model can be optimized based on the falsification recognition result.
Step S403, calculating a loss function from the falsification recognition result and the label of the training image, optimizing parameters of the text falsification recognition model according to the calculated loss function result, and continuing training until a preset condition is met, at which point training ends.
Specifically, a loss function value is calculated from the falsification recognition result of a training image and the label of that training image, and the parameters of the text falsification recognition model are adjusted according to the loss function value to optimize the model. Steps S401 to S403 are repeated, iterating the optimization until the loss function converges (the preset condition is met), at which point training ends. As the loss function converges, the prediction of the model (the falsification recognition result) gradually agrees with the real result (the label of the training image), yielding the trained text falsification recognition model. The loss function value may be calculated as a mean square error or a cross entropy.
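The two loss options named above can be written out for the two-element label form, (1, 0) for untampered and (0, 1) for tampered. The sketch below is illustrative only: the convergence test `has_converged` and its tolerance are assumed stand-ins for the unspecified preset condition, and no parameter update is shown.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Convert raw model scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs: Sequence[float], label: Sequence[int]) -> float:
    """Cross entropy between predicted probabilities and a one-hot label."""
    eps = 1e-12  # avoids log(0)
    return -sum(t * math.log(p + eps) for p, t in zip(probs, label))


def mean_squared_error(probs: Sequence[float], label: Sequence[int]) -> float:
    """Mean square error between predicted probabilities and a one-hot label."""
    return sum((p - t) ** 2 for p, t in zip(probs, label)) / len(probs)


def has_converged(losses: Sequence[float], tol: float = 1e-4) -> bool:
    """Assumed preset condition: the loss changed by less than tol in the last step."""
    return len(losses) >= 2 and abs(losses[-2] - losses[-1]) < tol


probs = softmax([0.0, 0.0])   # an undecided prediction
tampered_label = (0, 1)       # label form: (0, 1) marks a tampered image
ce = cross_entropy(probs, tampered_label)
mse = mean_squared_error(probs, tampered_label)
```

Both losses shrink toward zero as the predicted vector approaches the label, which is what "gradually agrees with the real result" means in practice.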
In a preferred embodiment, the training image set may be generated online through a text falsification recognition model, which specifically includes: acquiring an original image, wherein the original image is an untampered image; performing data augmentation operation on the original image to obtain an augmented original image, and taking the augmented original image as an untampered training image; and performing tampering operation on the untampered training image to obtain a tampered training image, wherein the tampering operation comprises one or a combination of a plurality of operations such as elastic transformation, image distortion, pixel offset, pixel pasting and the like.
Specifically, an original image is firstly obtained, the original image can be an untampered image containing texts in various image formats, text recognition is carried out on the original image by utilizing a text falsification recognition model, text information and text positions of the original image are obtained, and each character in the original image is extracted to form a positive sample data set.
To improve the generalization ability and robustness of the subsequent model, the positive sample data set is expanded. Specifically, operations such as size normalization, random noise, random rotation and offset, Gaussian blur, motion blur, and color and contrast jitter can be applied to the characters of each original image in the positive sample data set, yielding a large number of positive samples for subsequent training of the text falsification recognition model.
A random tampering operation, such as image distortion, elastic transformation, or pixel pasting, is then applied to each character in the expanded positive sample data set to generate a corresponding negative sample, and these form the negative sample data set. Each character in the positive sample data set therefore corresponds to a character in the negative sample data set. In this embodiment, the text falsification recognition model obtains the positive sample data set from original images and generates the negative sample data set online by tampering with the positive samples. On the one hand, this avoids manual data collection and labeling and greatly reduces labor cost; on the other hand, supervising the classification model with sample pairs achieves an effect similar to contrastive learning, which improves training accuracy and suppresses overfitting.
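The online generation of sample pairs described above can be sketched as follows. This is a minimal illustration under stated assumptions: character images are small grayscale grids stored as nested lists, augmentation is reduced to additive noise, and only two simplified tampering operations (pixel offset and pixel pasting) are shown. All function names and the 0/1 label values are hypothetical.

```python
import random

def augment(char_img, rng, noise=8):
    """Label-preserving augmentation (here: additive random noise only)."""
    return [[min(255, max(0, px + rng.randint(-noise, noise))) for px in row]
            for row in char_img]

def pixel_offset(char_img, rng):
    """Tamper by shifting one randomly chosen row one pixel to the right."""
    out = [row[:] for row in char_img]
    r = rng.randrange(len(out))
    out[r] = out[r][-1:] + out[r][:-1]
    return out

def pixel_paste(char_img, rng, size=2):
    """Tamper by copying a size x size patch onto another location."""
    out = [row[:] for row in char_img]
    h, w = len(out), len(out[0])
    sy, sx = rng.randrange(h - size + 1), rng.randrange(w - size + 1)
    dy, dx = rng.randrange(h - size + 1), rng.randrange(w - size + 1)
    for i in range(size):
        for j in range(size):
            out[dy + i][dx + j] = char_img[sy + i][sx + j]
    return out

def make_sample_pair(char_img, char_label, rng):
    """Return (positive, negative) samples for one character.

    Each sample is (image, character, tamper_label), where tamper_label
    0 means untampered and 1 means tampered (assumed convention).
    """
    positive = augment(char_img, rng)
    tamper = rng.choice([pixel_offset, pixel_paste])
    negative = tamper(positive, rng)
    return (positive, char_label, 0), (negative, char_label, 1)
```

Because the negative sample is derived from its positive counterpart, every pair shares the same character, which is what allows the pair to supervise the classifier on the tampering cue alone.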
Referring to fig. 5, in an embodiment, an image to be identified and at least one prior condition corresponding to the image to be identified are input into a pre-trained text falsification identification model to perform text falsification identification, and a text falsification identification result of the image to be identified is obtained specifically as follows:
Step S501: input the image to be identified and at least one prior condition corresponding to the image to be identified into the pre-trained text falsification recognition model.
Step S502: extract the image features of the image to be identified and the prior features of the prior condition, respectively, using the pre-trained text falsification recognition model.
After the text falsification recognition model is trained in the mode, the text falsification recognition operation can be carried out on the image to be recognized by using the text falsification recognition model.
First, the image to be identified and at least one prior condition corresponding to it are input into the text falsification recognition model, and the extraction network layer of the model extracts the image features of the image and the prior features of the prior condition, respectively. The extraction network layer may comprise a CNN convolution layer, which computes the inner product of the pixels of an image block with a set of fixed weights (the convolution kernel); each such output is one extracted feature. The CNN layer thus extracts a feature sequence from the input image. The extraction network layer may also comprise several convolution layers whose different convolution kernels extract different features, for example a first kernel for steganographic features, a second for texture features, and a third for boundary features, corner features, and so on.
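The inner product between an image block and a convolution kernel can be illustrated with a single-channel example. The sketch below is a generic "valid" convolution (computed as cross-correlation, as is usual in CNN layers) and is not taken from the described model; the function name and the example kernel are assumptions.

```python
def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` and take the inner product of the
    kernel weights with each image block (no padding, stride 1)."""
    kh, kw = len(kernel), len(kernel[0])
    h, w = len(image), len(image[0])
    out = []
    for y in range(h - kh + 1):
        row = []
        for x in range(w - kw + 1):
            # Inner product of one kh x kw image block with the kernel.
            row.append(sum(image[y + i][x + j] * kernel[i][j]
                           for i in range(kh) for j in range(kw)))
        out.append(row)
    return out

# A 3x3 image and a 2x2 diagonal-difference kernel (illustrative values).
image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
kernel = [[1, 0],
          [0, -1]]
features = conv2d_valid(image, kernel)  # [[-4, -4], [-4, -4]]
```

Using several kernels in parallel produces several such feature maps, one per kernel, which corresponds to the different feature types mentioned above.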
Take the 10 digit characters as an example, that is, the text information of the image to be identified consists of "0", "1", "2", "3", "4", "5", "6", "7", "8", and "9". A prior condition is first generated from the text information of the image to be identified; it is a vector matrix of shape 1×1×10. The prior condition and the image to be identified are then input into the text falsification recognition model. Inputting the image produces a vector matrix of shape h×w×3, and inputting the prior condition broadcasts the original 1×1×10 vector matrix to h×w×10, where h is the height and w the width of the vector matrix, which enlarges the feature dimension. The CNN layer then performs convolution on the two vector matrices to obtain the corresponding image features and prior features.
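The construction of the prior condition for the ten-digit example can be sketched as follows. The One-Hot form is taken from the embodiments of this description; the function names and the nested-list representation of the h×w×10 matrix are assumptions for illustration.

```python
CHARSET = "0123456789"  # the ten digit characters of the example

def one_hot(char, charset=CHARSET):
    """Encode one character as a length-10 vector (the 1x1x10 prior)."""
    if char not in charset:
        raise ValueError(f"character {char!r} is not in the charset")
    return [1.0 if c == char else 0.0 for c in charset]

def broadcast_prior(prior, h, w):
    """Repeat the prior vector at every pixel position, turning a
    1 x 1 x C vector into an h x w x C nested list."""
    return [[list(prior) for _ in range(w)] for _ in range(h)]

prior = one_hot("3")                      # 1 at index 3, 0 elsewhere
plane = broadcast_prior(prior, h=2, w=3)  # shape 2 x 3 x 10
```

After broadcasting, the prior has the same height and width as the h×w×3 image matrix, so both can be passed through convolution layers in the same way.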
Step S503: stack the image features and the prior features to obtain the stacked features.
Step S504: process the stacked features to output a classification vector, and determine the text falsification recognition result from the classification vector.
Specifically, the image features and prior features obtained in the preceding steps are stacked. Stacking here means concatenating the features along the channel dimension; it does not fuse the image features with the prior features. The stacked features can then be processed by a Backbone followed by a Softmax function, where the Backbone comprises several convolution layers and a pooling layer. This processing yields a 1×2 binary classification vector (the falsification recognition result), which is output through the output layer of the text falsification recognition model.
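Steps S503 and S504 can be sketched in simplified form. The convolution and pooling layers that sit between stacking and Softmax are omitted here, so `classify` takes two raw scores directly; the index convention (index 1 for tampered) and all names are assumptions for this sketch.

```python
import math

def stack_channels(image_feat, prior_feat):
    """Concatenate two h x w x C feature maps along the channel axis.
    The features are placed side by side, not summed or otherwise fused."""
    if (len(image_feat) != len(prior_feat)
            or len(image_feat[0]) != len(prior_feat[0])):
        raise ValueError("feature maps must share height and width")
    return [[px_img + px_prior for px_img, px_prior in zip(row_i, row_p)]
            for row_i, row_p in zip(image_feat, prior_feat)]

def softmax(logits):
    """Numerically stable Softmax over a list of scores."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def classify(logits):
    """Turn two scores into the 1x2 classification vector and a verdict."""
    probs = softmax(logits)
    verdict = "tampered" if probs[1] > probs[0] else "untampered"
    return probs, verdict

# A 1 x 1 feature map with 3 image channels and 2 prior channels.
stacked = stack_channels([[[1, 2, 3]]], [[[0, 1]]])  # [[[1, 2, 3, 0, 1]]]
```

Channel concatenation keeps the image evidence and the character identity as separate inputs to the later layers, which can then learn how to combine them.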
Referring to fig. 6, fig. 6 is a schematic structural diagram of an embodiment of the text falsification recognition model. The left side of fig. 6 shows an existing text falsification recognition model (a single-task model) and the right side shows the text falsification recognition model of the invention (a multi-task model). As fig. 6 shows, the improved model adds a conditional branch (the prior condition), so that the input of the model carries a prior condition identifying which character is currently being checked for tampering, which prevents different characters from interfering with one another.
Referring to table 1, table 1 shows average recognition results of each character of the conventional text tamper recognition model and the text tamper recognition model provided by the present invention for recognizing a plurality of characters. Wherein the samples are 5000 images of characters "0" to "9" respectively.
As can be seen from table 1, when a conventional text falsification recognition model (single-task model) is used for multi-character falsification recognition, several models are needed, each performing falsification recognition on a different character, before multi-character tampering can be recognized with reasonable accuracy. The text falsification recognition model provided by the invention can accurately recognize multi-character tampering with a single model.
It should be understood that, although the steps in the flowcharts of figs. 2-5 are shown in the order indicated by the arrows, they are not necessarily performed in that order. Unless explicitly stated herein, the execution order of the steps is not strictly limited, and the steps may be executed in other orders. Moreover, at least some of the steps in figs. 2-5 may include multiple sub-steps or stages, which are not necessarily performed at the same moment but may be performed at different times, and which are not necessarily performed sequentially but may be performed in turn or alternately with other steps or with at least part of the sub-steps or stages of other steps.
In one embodiment, as shown in fig. 7, there is provided a text falsification recognition apparatus including: the device comprises an acquisition module, a processing module and an output module, wherein:
the acquisition module is used for acquiring the image to be identified and at least one priori condition corresponding to the image to be identified, and the priori condition is determined based on the text identification result of the image to be identified.
The processing module is used for inputting the image to be identified and the prior condition corresponding to the image to be identified into a pre-trained text tampering identification model to carry out tampering identification.
And the output module is used for outputting a falsification identification result of the image to be identified.
For specific limitations of the text falsification recognition apparatus, reference may be made to the limitations of the text falsification recognition method above, which are not repeated here. Each module in the above text falsification recognition apparatus may be implemented in whole or in part by software, hardware, or a combination thereof. The modules may be embedded in, or independent of, a processor in the computer device in hardware form, or may be stored as software in a memory of the computer device, so that the processor can call them and execute the operations corresponding to each module.
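A software realization of the three modules might be organized as below. This is only a structural sketch: `recognize_text` and `model` are hypothetical callables standing in for the text recognition step and the pre-trained text falsification recognition model, and the score convention [p_untampered, p_tampered] is an assumption.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

Image = List[List[float]]
Prior = List[float]

@dataclass
class TamperRecognitionDevice:
    recognize_text: Callable[[Image], str]            # hypothetical OCR step
    model: Callable[[Image, Prior], Sequence[float]]  # hypothetical model
    charset: str = "0123456789"

    def acquire(self, image: Image) -> Tuple[Image, List[Prior]]:
        """Acquisition module: build one prior per recognized character."""
        text = self.recognize_text(image)
        priors = [[1.0 if c == ch else 0.0 for c in self.charset]
                  for ch in text if ch in self.charset]
        return image, priors

    def process(self, image: Image, priors: List[Prior]) -> List[List[float]]:
        """Processing module: run the model once per prior condition."""
        return [list(self.model(image, p)) for p in priors]

    def output(self, scores: List[List[float]]) -> List[str]:
        """Output module: report a verdict for each prior condition."""
        return ["tampered" if s[1] > s[0] else "untampered" for s in scores]
```

A caller would chain the three methods: `image, priors = device.acquire(img)`, then `device.output(device.process(image, priors))`.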
In one embodiment, another implementation manner of the text tampering identification method that can be implemented by the device includes the following specific steps:
the obtaining of the at least one prior condition corresponding to the image to be identified comprises:
step S301, performing text recognition on an image to be recognized to respectively obtain text information and text positions of the image to be recognized;
step S302, extracting the pixel position of each character in the text position of the image to be recognized and the text information of each character in the text information of the image to be recognized;
and step S303, matching the pixel position of each character with the text information of each character to obtain at least one priori condition corresponding to the image to be recognized.
In one embodiment, the above-described prior condition is input into the text falsification recognition model in One-Hot encoded form together with the image to be recognized.
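Steps S301 to S303, followed by One-Hot encoding, can be sketched as follows. The text recognition output is assumed, purely for illustration, to be a string plus one bounding box per character in (x0, y0, x1, y1) pixel coordinates; this format and the function name are hypothetical and do not come from any particular OCR library.

```python
def build_prior_conditions(text, char_boxes, charset="0123456789"):
    """Match each character's pixel position with its text information
    and attach a One-Hot encoding of the character.

    text:       recognized string, one entry per character
    char_boxes: list of (x0, y0, x1, y1) boxes, aligned with `text`
    """
    if len(text) != len(char_boxes):
        raise ValueError("need exactly one box per recognized character")
    conditions = []
    for ch, box in zip(text, char_boxes):
        if ch not in charset:
            continue  # characters outside the charset get no prior here
        conditions.append({
            "char": ch,
            "box": box,
            "one_hot": [1 if c == ch else 0 for c in charset],
        })
    return conditions

conds = build_prior_conditions("42", [(0, 0, 10, 20), (10, 0, 20, 20)])
```

Each entry then provides both the region of the image to examine and the prior condition to feed to the model alongside it.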
In one embodiment, another implementation manner of the text tampering identification method that can be implemented by the device includes the following specific steps:
the training process of the pre-trained text falsification recognition model comprises the following steps:
step S401, acquiring a part of training images in a training image set and labels of each training image in the part of training images, wherein the labels of the training images represent the types of the training images, and the types of the training images comprise tampered training images and untampered training images;
step S402, taking a training image as input, and extracting image features of the training image through an extraction network layer of a text falsification recognition model to be trained; inputting the image characteristics into a detection network layer of a text falsification recognition model, and obtaining falsification recognition results of training images; outputting a falsification recognition result by using an output layer of the text falsification recognition model;
and step S403, calculating a loss function of the falsification recognition result and the label of the training image, optimizing parameters of the text falsification recognition model according to the calculated loss function result, and continuing training until the preset condition is met, at which point training ends.
In one embodiment, another implementation manner of the text tampering identification method that can be implemented by the device includes the following specific steps:
acquiring the training image set includes:
acquiring an original image, wherein the original image is an untampered image;
performing data augmentation operation on the original image to obtain an augmented original image, and taking the augmented original image as an untampered training image;
and performing tampering operation on the untampered training image to obtain a tampered training image, wherein the tampering operation comprises one or a combination of a plurality of operations such as elastic transformation, image distortion, pixel offset, pixel pasting and the like.
In one embodiment, another implementation manner of the text tampering identification method that can be implemented by the device includes the following specific steps:
inputting the image to be identified and at least one priori condition corresponding to the image to be identified into a pre-trained text falsification identification model for text falsification identification, and obtaining a text falsification identification result of the image to be identified comprises the following steps:
step S501, inputting at least one priori condition corresponding to an image to be identified into a pre-trained text falsification identification model;
step S502, respectively extracting image features of an image to be identified and priori features of priori conditions by using a pre-trained text falsification identification model;
Step S503, stacking the image features and the prior features to obtain stacked features;
and step S504, processing the stacked features, outputting to obtain a classification vector, and confirming a text falsification identification result according to the classification vector.
In one embodiment, another implementation manner of the text tampering identification method that can be implemented by the device includes the following specific steps:
the prior condition is determined based on a text recognition result of the image to be recognized and comprises the following steps:
and carrying out text recognition on the image to be recognized by utilizing an OCR technology, and determining the prior condition corresponding to the image to be recognized.
In one embodiment, a computer device is provided, which may be a terminal, and the internal structure thereof may be as shown in fig. 8. The computer device includes a processor, a memory, a network interface, a display screen, and an input device connected by a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program. The internal memory provides an environment for the operation of the operating system and computer programs in the non-volatile storage media. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program, when executed by a processor, implements a text falsification recognition method. The display screen of the computer equipment can be a liquid crystal display screen or an electronic ink display screen, and the input device of the computer equipment can be a touch layer covered on the display screen, can also be keys, a track ball or a touch pad arranged on the shell of the computer equipment, and can also be an external keyboard, a touch pad or a mouse and the like.
It will be appreciated by those skilled in the art that the structure shown in FIG. 8 is merely a block diagram of some of the structures associated with the present inventive arrangements and is not limiting of the computer device to which the present inventive arrangements may be applied, and that a particular computer device may include more or fewer components than shown, or may combine some of the components, or have a different arrangement of components.
In one embodiment, a computer device is provided comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, the processor implementing the following steps when executing the computer program:
step S201, acquiring an image to be identified and at least one prior condition corresponding to the image to be identified.
Step S202, inputting the image to be identified and at least one priori condition corresponding to the image to be identified into a pre-trained text falsification identification model to carry out text falsification identification, and obtaining a text falsification identification result of the image to be identified.
In one embodiment, another implementation manner of the text tampering identification method that can be implemented by the device includes the following specific steps:
the obtaining of the at least one prior condition corresponding to the image to be identified comprises:
step S301, performing text recognition on an image to be recognized to respectively obtain text information and text positions of the image to be recognized;
step S302, extracting the pixel position of each character in the text position of the image to be recognized and the text information of each character in the text information of the image to be recognized;
and step S303, matching the pixel position of each character with the text information of each character to obtain at least one priori condition corresponding to the image to be recognized.
In one embodiment, the above-described prior condition is input into the text falsification recognition model in One-Hot encoded form together with the image to be recognized.
In one embodiment, another implementation manner of the text tampering identification method that can be implemented by the device includes the following specific steps:
the training process of the pre-trained text falsification recognition model comprises the following steps:
step S401, acquiring a part of training images in a training image set and labels of each training image in the part of training images, wherein the labels of the training images represent the types of the training images, and the types of the training images comprise tampered training images and untampered training images;
step S402, taking a training image as input, and extracting image features of the training image through an extraction network layer of a text falsification recognition model to be trained; inputting the image characteristics into a detection network layer of a text falsification recognition model, and obtaining falsification recognition results of training images; outputting a falsification recognition result by using an output layer of the text falsification recognition model;
and step S403, calculating a loss function of the falsification recognition result and the label of the training image, optimizing parameters of the text falsification recognition model according to the calculated loss function result, and continuing training until the preset condition is met, at which point training ends.
In one embodiment, another implementation manner of the text tampering identification method that can be implemented by the device includes the following specific steps:
acquiring the training image set includes:
acquiring an original image, wherein the original image is an untampered image;
performing data augmentation operation on the original image to obtain an augmented original image, and taking the augmented original image as an untampered training image;
and performing tampering operation on the untampered training image to obtain a tampered training image, wherein the tampering operation comprises one or a combination of a plurality of operations such as elastic transformation, image distortion, pixel offset, pixel pasting and the like.
In one embodiment, another implementation manner of the text tampering identification method that can be implemented by the device includes the following specific steps:
inputting the image to be identified and at least one priori condition corresponding to the image to be identified into a pre-trained text falsification identification model for text falsification identification, and obtaining a text falsification identification result of the image to be identified comprises the following steps:
Step S501, inputting at least one priori condition corresponding to an image to be identified into a pre-trained text falsification identification model;
step S502, respectively extracting image features of an image to be identified and priori features of priori conditions by using a pre-trained text falsification identification model;
step S503, stacking the image features and the prior features to obtain stacked features;
and step S504, processing the stacked features, outputting to obtain a classification vector, and confirming a text falsification identification result according to the classification vector.
In one embodiment, another implementation manner of the text tampering identification method that can be implemented by the device includes the following specific steps:
the prior condition is determined based on a text recognition result of the image to be recognized and comprises the following steps:
and carrying out text recognition on the image to be recognized by utilizing an OCR technology, and determining the prior condition corresponding to the image to be recognized.
Those skilled in the art will appreciate that all or part of the methods described above may be implemented by a computer program stored on a non-transitory computer readable storage medium; when executed, the program may include the steps of the method embodiments described above. Any reference to memory, storage, database, or other medium used in the embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchronous link DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM), among others.
The technical features of the above embodiments may be combined arbitrarily. For brevity, not all possible combinations of these technical features are described; however, as long as a combination of technical features involves no contradiction, it should be considered within the scope of this description.
The above examples illustrate only a few embodiments of the application, which are described in detail and are not to be construed as limiting the scope of the application. It should be noted that it will be apparent to those skilled in the art that several variations and modifications can be made without departing from the spirit of the application, which are all within the scope of the application. Accordingly, the scope of protection of the present application is to be determined by the appended claims.

Claims (10)

1. A text falsification recognition method, comprising:
acquiring an image to be identified and at least one priori condition corresponding to the image to be identified, wherein the priori condition is determined based on a text identification result of the image to be identified;
inputting the image to be identified and at least one priori condition corresponding to the image to be identified into a pre-trained text falsification identification model to carry out text falsification identification, and obtaining a text falsification identification result of the image to be identified.
2. The method of claim 1, wherein obtaining at least one a priori condition corresponding to the image to be identified comprises:
performing text recognition on the image to be recognized to respectively obtain text information and text positions of the image to be recognized;
extracting pixel positions of each character in the text position of the image to be recognized and text information of each character in the text information of the image to be recognized;
and matching the pixel position of each character with the text information of each character to obtain at least one priori condition corresponding to the image to be recognized.
3. The method of claim 1, wherein the inputting the image to be identified and the at least one prior condition corresponding to the image to be identified into a pre-trained text falsification identification model to perform text falsification identification, and obtaining a text falsification identification result of the image to be identified includes:
inputting the image to be identified and at least one priori condition corresponding to the image to be identified into a pre-trained text falsification identification model;
respectively extracting image features of the image to be identified and priori features of the priori conditions by using a pre-trained text falsification identification model;
Carrying out feature stacking on the image features and the prior features to obtain stacked features;
and processing the stacked features, outputting to obtain a classification vector, and confirming a text falsification recognition result according to the classification vector.
4. The method of claim 1, wherein the training process of the pre-trained text tamper recognition model comprises:
acquiring a part of training images in a training image set and labels of each training image in the part of training images, wherein the labels of the training images represent the types of the training images, and the training image types comprise tampered training images and untampered training images;
taking a training image as input, and extracting image features of the training image through an extraction network layer of a text falsification recognition model to be trained; inputting the image features into a detection network layer of the text falsification recognition model to obtain falsification recognition results of training images; outputting a falsification identification result by using an output layer of the text falsification identification model;
and carrying out loss function calculation on the falsification identification result and the label of the training image, optimizing parameters of the text falsification identification model according to the calculated loss function result, and continuing training until the training is finished when the preset condition is met.
5. The method of claim 4, wherein the acquiring a training image set comprises:
acquiring an original image, wherein the original image is an untampered image;
performing data augmentation operation on the original image to obtain an augmented original image, and taking the augmented original image as an untampered training image;
and performing tampering operation on the untampered training image to obtain a tampered training image, wherein the tampering operation comprises one or a combination of a plurality of operations such as elastic transformation, image distortion, pixel offset, pixel pasting and the like.
6. The method of claim 1, wherein the a priori conditions are entered into the text tamper recognition model in One-Hot encoded form with the image to be recognized.
7. The method of claim 1, wherein the prior condition is determined based on a text recognition result of the image to be recognized comprising: and carrying out text recognition on the image to be recognized by utilizing an OCR technology, and determining the prior condition corresponding to the image to be recognized.
8. A text tamper identification device, the device comprising:
the acquisition module is used for acquiring an image to be identified and at least one priori condition corresponding to the image to be identified, wherein the priori condition is determined based on a text identification result of the image to be identified;
The processing module is used for inputting the image to be identified and the prior condition corresponding to the image to be identified into a pre-trained text falsification identification model to falsify and identify;
and the output module is used for outputting the falsification identification result of the image to be identified.
9. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the steps of the method according to any one of claims 1 to 7 when the computer program is executed by the processor.
10. A computer readable storage medium, on which a computer program is stored, characterized in that the computer program, when being executed by a processor, implements the steps of the method of any of claims 1 to 7.
CN202310675608.3A 2023-06-08 2023-06-08 Text tampering identification method, apparatus, computer device and readable storage medium Pending CN116682116A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310675608.3A CN116682116A (en) 2023-06-08 2023-06-08 Text tampering identification method, apparatus, computer device and readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310675608.3A CN116682116A (en) 2023-06-08 2023-06-08 Text tampering identification method, apparatus, computer device and readable storage medium

Publications (1)

Publication Number Publication Date
CN116682116A (en) 2023-09-01

Family

ID=87778751

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310675608.3A Pending CN116682116A (en) 2023-06-08 2023-06-08 Text tampering identification method, apparatus, computer device and readable storage medium

Country Status (1)

Country Link
CN (1) CN116682116A (en)

Similar Documents

Publication Publication Date Title
CN109492643B (en) Certificate identification method and device based on OCR, computer equipment and storage medium
US20190180154A1 (en) Text recognition using artificial intelligence
CN110705233B (en) Note generation method and device based on character recognition technology and computer equipment
CN109635805B (en) Image text positioning method and device and image text identification method and device
CN114092938B (en) Image recognition processing method and device, electronic equipment and storage medium
CN112883980B (en) Data processing method and system
CN113344826A (en) Image processing method, image processing device, electronic equipment and storage medium
CN112686243A (en) Method and device for intelligently identifying picture characters, computer equipment and storage medium
CN113673528B (en) Text processing method, text processing device, electronic equipment and readable storage medium
RU2633182C1 (en) Determination of text line orientation
CN112749639B (en) Model training method and device, computer equipment and storage medium
CN111898544B (en) Text image matching method, device and equipment and computer storage medium
CN112396047B (en) Training sample generation method and device, computer equipment and storage medium
CN112966676A (en) Document key information extraction method based on zero sample learning
CN116484224A (en) Training method, device, medium and equipment for multi-mode pre-training model
CN116052195A (en) Document parsing method, device, terminal equipment and computer readable storage medium
CN110796145A (en) Multi-certificate segmentation association method based on intelligent decision and related equipment
CN115984588A (en) Image background similarity analysis method and device, electronic equipment and storage medium
CN115512340A (en) Intention detection method and device based on picture
CN112801960B (en) Image processing method and device, storage medium and electronic equipment
CN116682116A (en) Text tampering identification method, apparatus, computer device and readable storage medium
CN115223183A (en) Information extraction method and device and electronic equipment
WO2023173546A1 (en) Method and apparatus for training text recognition model, and computer device and storage medium
CN114625872A (en) Risk auditing method, system and equipment based on global pointer and storage medium
CN112884046A (en) Image classification method and device based on incomplete supervised learning and related equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination