WO2021212658A1 - Ocr image sample generation method and apparatus, print font verification method and apparatus, and device and medium - Google Patents


Info

Publication number
WO2021212658A1
WO2021212658A1 · PCT/CN2020/099064 · CN2020099064W
Authority
WO
WIPO (PCT)
Prior art keywords
image
label
sample
image sample
model
Prior art date
Application number
PCT/CN2020/099064
Other languages
French (fr)
Chinese (zh)
Inventor
陈伟杰 (Chen Weijie)
Original Assignee
平安国际智慧城市科技股份有限公司 (Ping An International Smart City Technology Co., Ltd.)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 平安国际智慧城市科技股份有限公司
Publication of WO2021212658A1

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 - Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/40 - Document-oriented image-based pattern recognition
    • G06V30/41 - Analysis of document content
    • G06V30/412 - Layout analysis of documents structured with printed lines or input boxes, e.g. business forms or tables
    • G06V30/10 - Character recognition
    • G06V10/00 - Arrangements for image or video recognition or understanding
    • G06V10/20 - Image preprocessing
    • G06V10/22 - Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/21 - Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 - Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06F18/24 - Classification techniques

Definitions

  • This application relates to the field of artificial intelligence data modeling, and in particular to an OCR image sample generation method, a print verification method, an apparatus, a computer device, and a storage medium.
  • OCR (Optical Character Recognition) reads text written or printed on paper and converts it into a format that a computer can accept and understand.
  • OCR is used in application scenarios involving finance, insurance, smart security, and the like.
  • Training the underlying neural network requires a very large number of document samples, while usually only a very small number of document samples can be obtained.
  • This application provides an OCR image sample generation method, a print verification method, an apparatus, a computer device, and a storage medium. It automatically generates OCR image samples with the same texture style as the original image samples and automatically annotates the OCR image sample labels, reducing labor cost and time while improving the accuracy and reliability of OCR recognition results and print verification.
  • An OCR image sample generation method including:
  • the image sample is associated with an image sample label, and the image sample label includes a first font label, a first typesetting label, and a first texture style label;
  • input the image sample into a preset font-typesetting generation model; the model recognizes the first annotation information of the image sample and reconstructs a simulation result according to the first font label, the first typesetting label, and the first annotation information;
  • the simulation result includes a simulated image, a second font label, a second typesetting label, and second annotation information;
  • the image sample and the simulated image are input into a preset style synthesis model; the style synthesis model extracts style features and content features and, based on them, performs style transfer and synthesis on the simulated image to generate a synthesis result; the synthesis result includes the synthesized image and the first texture style label;
  • a print verification method including:
  • Receive a certificate verification instruction, and obtain the to-be-verified printed document and the verification information.
  • An OCR image sample generating device including:
  • a receiving module configured to receive an image generation instruction and obtain an image sample; the image sample is associated with an image sample label, and the image sample label includes a first font label, a first typesetting label, and a first texture style label;
  • the input module is used to input the image sample into a preset font-typesetting generation model, obtain the first annotation information of the image sample recognized by the model through text detection and character recognition on the image sample, and obtain the simulation result reconstructed and generated by the model according to the first font label, the first typesetting label, and the first annotation information; the simulation result includes a simulated image, a second font label, a second typesetting label, and second annotation information;
  • the synthesis module is configured to input the image sample and the simulated image into a preset style synthesis model; the style synthesis model extracts style features and content features and, based on them, performs style transfer and synthesis on the simulated image to generate a synthesis result; the synthesis result includes the synthesized image and the first texture style label;
  • A print verification device including:
  • the acquisition module is used to receive the certificate verification instruction and obtain the to-be-verified printed document and the verification information;
  • the training module is used to input the to-be-verified printed document into a trained certificate recognition model; the certificate recognition model is trained with OCR image samples generated by the above-mentioned OCR image sample generation method;
  • the recognition module is configured to perform OCR recognition on the to-be-verified printed document through the certificate recognition model and obtain the OCR recognition result output by the model; the OCR recognition result includes the certificate-related text information of the to-be-verified printed document;
  • the comparison module is configured to compare the OCR recognition result with the verification information and determine whether the to-be-verified printed document matches the verification information;
  • the first determination module is configured to determine that the verification is passed if the to-be-verified printed document matches the verification information;
  • the second determination module is configured to confirm that the verification is not passed if the to-be-verified printed document does not match the verification information, and to display a prompt on the display interface.
  • a computer device includes a memory, a processor, and a computer program stored in the memory and runnable on the processor; when the processor executes the computer program, the following steps are implemented:
  • the image sample is associated with an image sample label, and the image sample label includes a first font label, a first typesetting label, and a first texture style label;
  • input the image sample into a preset font-typesetting generation model; the model recognizes the first annotation information of the image sample and reconstructs a simulation result according to the first font label, the first typesetting label, and the first annotation information;
  • the simulation result includes a simulated image, a second font label, a second typesetting label, and second annotation information;
  • the image sample and the simulated image are input into a preset style synthesis model; the style synthesis model extracts style features and content features and, based on them, performs style transfer and synthesis on the simulated image to generate a synthesis result; the synthesis result includes the synthesized image and the first texture style label;
  • a computer device includes a memory, a processor, and a computer program stored in the memory and runnable on the processor; when the processor executes the computer program, the following steps are implemented:
  • Receive a certificate verification instruction, and obtain the to-be-verified printed document and the verification information.
  • a computer-readable storage medium stores a computer program which, when executed by a processor, implements the following steps:
  • the image sample is associated with an image sample label, and the image sample label includes a first font label, a first typesetting label, and a first texture style label;
  • input the image sample into a preset font-typesetting generation model; the model recognizes the first annotation information of the image sample and reconstructs a simulation result according to the first font label, the first typesetting label, and the first annotation information;
  • the simulation result includes a simulated image, a second font label, a second typesetting label, and second annotation information;
  • the image sample and the simulated image are input into a preset style synthesis model; the style synthesis model extracts style features and content features and, based on them, performs style transfer and synthesis on the simulated image to generate a synthesis result; the synthesis result includes the synthesized image and the first texture style label;
  • a computer-readable storage medium stores a computer program which, when executed by a processor, implements the following steps:
  • Receive a certificate verification instruction, and obtain the to-be-verified printed document and the verification information.
  • the OCR image sample generation method, device, computer equipment, and storage medium provided in this application obtain an image sample carrying the first font label, the first typesetting label, and the first texture style label, and input it into the font-typesetting generation model; text detection and text recognition are performed on the image sample to obtain the recognized first annotation information, and a simulation result is reconstructed and generated according to the first font label, the first typesetting label, and the first annotation information;
  • the image sample and the simulated image are input into a style synthesis model, which performs style transfer and synthesis on the simulated image according to the extracted style features and content features to generate a synthesis result; the synthesis result includes the synthesized image and the first texture style label;
  • the second font label, the second typesetting label, the second annotation information, and the first texture style label are marked as the OCR image sample label, the synthesized image is recorded as the OCR image sample corresponding to the image sample, and the OCR image sample is associated with the OCR image sample label.
  • the present application thus automatically generates OCR image samples with the same texture style as the image sample and accurately marks the OCR image sample labels on them, reducing the labor cost and time of collecting image samples; OCR image samples for a desired scene can be obtained quickly and in a targeted way, which improves the accuracy and reliability of subsequent model training, reduces the labor cost of labeling OCR image sample labels, avoids manual labeling errors, and improves labeling accuracy.
  • the print verification method, device, computer equipment, and storage medium provided in this application obtain the to-be-verified printed document and the verification information by receiving the certificate verification instruction; the to-be-verified printed document is input into a certificate recognition model trained with OCR image samples generated by the above-mentioned OCR image sample generation method; OCR recognition is performed on the to-be-verified printed document through the certificate recognition model, and the OCR recognition result output by the model is obtained; the OCR recognition result is compared with the verification information to determine whether the to-be-verified printed document matches the verification information; if it matches, the verification is determined to have passed; if it does not, the verification is confirmed to have failed, and a prompt is displayed on the display interface.
  • this application uses the OCR image sample generated by the OCR image sample generation method to train a certificate recognition model for a specific scene, and performs automatic recognition and automatic verification through the certificate recognition model.
  • automatic verification of printed documents for specific scenarios is thus realized, which is highly targeted, improves recognition accuracy, efficiency, and reliability, improves user experience, and saves labor costs.
  • FIG. 1 is a schematic diagram of an application environment of an OCR image sample generation method or a printed body verification method in an embodiment of the present application;
  • Fig. 2 is a flowchart of the OCR image sample generation method in an embodiment of the present application;
  • FIG. 3 is a flowchart of step S20 of the OCR image sample generation method in an embodiment of the present application;
  • FIG. 4 is a flowchart of step S20 of the OCR image sample generation method in another embodiment of the present application;
  • FIG. 5 is a flowchart of step S30 of the OCR image sample generation method in an embodiment of the present application;
  • FIG. 6 is a flowchart of the print verification method in an embodiment of the present application;
  • FIG. 7 is a flowchart of step S200 of the print verification method in an embodiment of the present application;
  • Fig. 8 is a functional block diagram of an OCR image sample generating device in an embodiment of the present application.
  • FIG. 9 is a schematic block diagram of a print verification device in an embodiment of the present application;
  • Fig. 10 is a schematic diagram of a computer device in an embodiment of the present application.
  • the OCR image sample generation method provided by this application can be applied in the application environment as shown in Fig. 1, in which the client (computer equipment) communicates with the server through the network.
  • the client includes, but is not limited to, various personal computers, notebook computers, smart phones, tablet computers, cameras, and portable wearable devices.
  • the server can be implemented as an independent server or a server cluster composed of multiple servers.
  • in an embodiment, an OCR image sample generation method is provided; the technical solution mainly includes the following steps S10-S40:
  • S10 Receive an image generation instruction, and obtain an image sample; the image sample is associated with an image sample label, and the image sample label includes a first font label, a first typesetting label, and a first texture style label.
  • the image generation instruction is a request triggered after the image sample to be generated is selected and confirmed;
  • the trigger mode can be set as required; for example, the application platform interface provides a trigger button that can be triggered by clicking, sliding, and so on;
  • the acquisition method for the image sample can also be set as required: the image sample may be carried in the image generation instruction, or acquired according to a storage path of the image sample included in the image generation instruction, and so on.
  • the image sample is an image that contains printed matter and is related to a certificate, that is, an image file obtained after a certificate is copied.
  • the image sample can be a photo file or a scanned image file.
  • the image sample is associated with an image sample label, which is assigned to annotate the content in the image sample and includes a first font label, a first typesetting label, and a first texture style label;
  • the first font label is a label of information such as the font style and font size of the text content in the image sample, for example Song Ti No. 5;
  • the first typesetting label is a label of the typesetting of the text content in the image sample, such as single-column arrangement, double-column equal-width arrangement, or double-column unequal-width arrangement;
  • the first texture style label is a label of texture information, such as wrinkles and deformation, introduced into the image sample during scanning, copying, and similar operations.
  • the font-typesetting generation model performs text detection and text recognition on the input image sample to obtain the first annotation information related to the text in the image sample;
  • text detection scans the image sample to detect the text areas containing text and the area coordinates of each text area; the scanning detection method can be set as required, for example extracting text features to determine the text areas, or identifying the text boundaries of multiple text areas through edge detection;
  • a text area is a quadrilateral area containing text content; its area coordinates are the coordinates of the four corner points of the text area in the image sample, i.e. four coordinate values, each including an abscissa and an ordinate;
  • text recognition recognizes each text area in the image sample to obtain the text content in each text area;
  • each text area, the area coordinates associated with it, and its text content are recorded as one piece of first information of the image sample, and all the first information is marked as the first annotation information.
  • the font-typesetting generation model further includes a reconstruction model, which performs reconstruction according to the first font label, the first typesetting label, and the first annotation information to generate a simulation result containing a simulated image, a second font label, a second typesetting label, and second annotation information;
  • the reconstruction model may be a neural network model set as required; in this embodiment, it is a neural network model trained based on a GAN (Generative Adversarial Network) model.
  • the simulated image is an image file containing text content corresponding to the second font label, the second typesetting label, and the second annotation information
  • the second font label may be consistent with the first font label, or may be the font label with the highest approximation value to the first font label;
  • a font label consists of a font style and a font size among all available font styles and sizes, for example Song Ti No. 5 or Kai Ti No. 6; the highest approximation value is, among all font labels other than the first font label itself, the highest sum of the font-style approximation value and the font-size approximation value relative to the first font label;
  • the font-style approximation value is a measurement index of the difference between one font style and another among all font styles; the more similar the font styles, the larger the measurement index, and the range of the index can be set, for example from 0 to 100;
  • for example, if the font style of the first font label is "Song Ti", the font style "imitated Song" is the most similar to "Song Ti" (the GAN-based reconstruction model can encode the font style of the first font label and then decode the code, or the difference between printed fonts and calligraphy fonts can be defined by a preset rule), and its measurement index is 100;
  • the font-size approximation value is a size measurement index of the difference between one font size and another;
  • for example, if the font size of the first font label is "No. 5", the font size "11 points" is the most similar to "No. 5" (the font size of the first font label can be encoded through the GAN-based reconstruction model and the code then decoded, or the font size whose difference from "No. 5" is smallest can be taken, using the absolute value plus one when the difference is negative), and its measurement index is 100;
  • the second typesetting label may be consistent with the first typesetting label, or may be the typesetting label with the highest approximation value to the first typesetting label.
  • the typesetting approximation value is a measure of the difference between each typesetting label and one of the typesetting labels; for example, the typesetting value between equal-width column arrangements is 0 (the most similar).
  • the second annotation information is generated by encoding the first annotation information, introducing random noise into the encoded result, and decoding it to generate information close to the first annotation information; for example, the first annotation information is "I like juice" and its coordinates, and the second annotation information is "I like fruits" and the corresponding coordinates.
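The encode, perturb, decode idea described above can be illustrated numerically. The real reconstruction model is a learned neural encoder-decoder; the character-code "encoding" below is only an assumed stand-in. With tiny noise the decoded text stays essentially unchanged, while a learned encoder with larger noise would instead produce semantically close variants such as the "juice" to "fruits" example.

```python
import numpy as np

def encode(text):
    # Toy "encoding": one float per character; a real model would use
    # a learned neural encoder over the annotation information.
    return np.array([ord(c) for c in text], dtype=float)

def decode(code):
    # Round each perturbed value back to the nearest character code.
    return "".join(chr(int(round(v))) for v in code)

rng = np.random.default_rng(0)
original = "I like juice"
code = encode(original)
# Introduce small random noise into the encoded result before decoding,
# as in the perturbation step described above.
perturbed = code + rng.normal(0.0, 0.1, size=code.shape)
close_text = decode(perturbed)
```

The amount of noise controls how far the second annotation information drifts from the first.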
  • in an embodiment, inputting the image sample into a preset font-typesetting generation model and performing text detection and text recognition on the image sample to obtain the first annotation information of the image sample recognized by the model includes:
  • S201: Perform text detection on the image sample through the font-typesetting generation model while extracting the text features of the image sample, and obtain the region result recognized by the model according to the extracted text features; the region result includes a number of text regions containing text and the region coordinates associated with each text region.
  • text feature extraction is performed on the image sample through the font-typesetting generation model; the text features are features such as Chinese words and sentences or letter words; based on these features, the model identifies a number of text regions containing text content in the image sample and the region coordinates associated with each text region;
  • a text region is a quadrilateral region containing text content; the region coordinates are the coordinates of its four corner points in the image sample, i.e. four coordinate values, each including an abscissa and an ordinate; all the text regions and their associated region coordinates are determined as the region result.
  • character feature extraction is then performed on each text region through the font-typesetting generation model; the character features are features of Chinese characters, letters, and so on; according to the character features extracted from each text region, the model outputs the text content of each text region, and all the text content is obtained.
  • each text region corresponds one-to-one with its region coordinates and one-to-one with its text content; a text region, its corresponding region coordinates, and its corresponding text content are together marked as one piece of first information of the image sample;
  • the image sample may contain several pieces of first information, and all the first information is marked as the first annotation information corresponding to the image sample.
  • the first annotation information in the image sample can be accurately obtained.
  • in an embodiment, obtaining the simulation result reconstructed and generated by the font-typesetting generation model according to the first font label, the first typesetting label, and the first annotation information includes:
  • the GAN (Generative Adversarial Networks) model is a deep neural network architecture of generative adversarial networks, and the reconstruction model is a deep neural network model trained based on the GAN model;
  • the first font label, the first typesetting label, and the first annotation information are input into the reconstruction model.
  • S205 Perform combined reconstruction through a generator in the reconstruction model, and obtain the simulated image, the second font label, the second typesetting label, and the second annotation information output by the reconstruction model.
  • the first font label, the first typesetting label, and the first annotation information are combined into a transition image with the same size as the image sample; the transition image preferably has a blank background, and its content includes the first font label, the first typesetting label, and the first annotation information;
  • the reconstruction model includes the generator, whose main task is to learn the transition image: it encodes the transition image and decodes the code to generate an image file very close to the transition image;
  • the reconstruction model also includes a discriminator, whose main task is to distinguish the image file generated by the generator from the transition image and judge its authenticity; during the iterative training of the reconstruction model, the generator continuously strives to make the generated image file closer to the transition image, while the discriminator constantly tries to identify the authenticity of the image file;
  • through the game between the generator and the discriminator, over repeated iterations, the two finally reach a balance: the generator generates a simulated image close to the transition image, and the discriminator has difficulty identifying the difference between the simulated image and the image sample; in this way, the reconstruction model reconstructs the simulated image.
  • the simulated image is an image file containing text content corresponding to the second font label, the second typesetting label, and the second annotation information.
  • the size of the simulated image may be the same as the size of the image sample.
  • the second font label is close to the first font label, for example: the first font label is Times New Roman No. 5 and the second font label is imitation Song 11 points; the second typesetting label is close to the first typesetting label, for example: the first typesetting label is a double-column equal-width arrangement and the second typesetting label is a double-column unequal-width arrangement;
  • the second annotation information is generated by encoding the first annotation information, introducing random noise into the encoded result, and decoding it to generate information close to the first annotation information; for example, the first annotation information is "I like juice" and its coordinates, and the second annotation information is "I like fruits" and the corresponding coordinates.
  • the simulation result includes the simulated image, the second font label, the second typesetting label, and the second annotation information, and the simulated image, the second font label, the The second typesetting label and the second annotation information are associated with each other.
  • in this embodiment, the simulation result including the simulated image, the second font label, the second typesetting label, and the second annotation information is reconstructed and generated, thereby automatically generating a simulated image associated with the second font label, the second typesetting label, and the second annotation information; that is, the simulated image is automatically annotated, and a simulated image related to the first font label and the first typesetting label of the image sample is automatically generated.
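The generator and discriminator roles described above can be sketched with toy linear stand-ins. All shapes, weights, and activation choices below are illustrative assumptions; the application's reconstruction model is a trained deep GAN, not this two-function toy.

```python
import numpy as np

rng = np.random.default_rng(42)

def generator(z, w):
    # Toy "generator": maps an input code (e.g. encoded labels plus
    # noise) to a flat 8x8 image; a real generator is a deep network.
    return np.tanh(z @ w).reshape(8, 8)

def discriminator(img, v):
    # Toy "discriminator": a single logistic unit scoring how authentic
    # the image looks; training would push this score toward 0 or 1.
    score = img.reshape(-1) @ v
    return 1.0 / (1.0 + np.exp(-score))

z = rng.normal(size=(16,))            # input code vector
w = rng.normal(size=(16, 64)) * 0.1   # generator weights
v = rng.normal(size=(64,)) * 0.1      # discriminator weights

fake = generator(z, w)
realness = discriminator(fake, v)
```

At the balance the description mentions, the discriminator's score on generated images hovers near 0.5, meaning it can no longer tell them apart from the transition image.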
  • the image sample and the simulated image are input into the style synthesis model, which extracts style features and content features; the style features are features such as wrinkles, background gray levels, and stripes, and the content features are features related to the first font label, the first typesetting label, the second font label, and the second typesetting label.
  • style transfer is performed on the simulated image: the simulated image is used as the initial image, all the pixel values of the initial image are obtained, the total loss value is computed from the style features and the content features, and all the pixel values are iteratively updated through gradient descent until the total loss value reaches a preset condition, which can be set as required, for example that the total loss value no longer decreases;
  • the updated initial image is determined to be the synthesized image, which is the simulated image optimized by transferring into it the texture information of the image sample, the texture information being what is carried into an image file during operations such as scanning and copying; the texture information in the synthesized image is therefore consistent with the first texture style label, so the synthesized image is associated with the first texture style label, and the synthesized image together with the first texture style label is determined as the synthesis result.
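The pixel-level optimization loop described above, updating all pixel values by gradient descent until the total loss no longer decreases, can be sketched as follows. The quadratic stand-in loss is an assumption for illustration; the actual model combines a style loss and a content loss.

```python
import numpy as np

def transfer_pixels(init_image, loss_grad, lr=0.1, tol=1e-6, max_iter=500):
    """Iteratively update all pixel values by gradient descent until the
    total loss value no longer decreases (the stopping rule above)."""
    x = init_image.astype(float).copy()
    prev = np.inf
    for _ in range(max_iter):
        g, loss = loss_grad(x)
        if prev - loss < tol:   # total loss has stopped decreasing
            break
        prev = loss
        x -= lr * g             # gradient step on the pixel values
    return x

# Stand-in "total loss": squared distance to a target texture value;
# the real model computes a weighted style loss plus content loss.
target = np.full((4, 4), 0.5)
def quad_loss_grad(x):
    diff = x - target
    return 2 * diff, float((diff ** 2).sum())

simulated = np.zeros((4, 4))          # the simulated image as initial image
synthesized = transfer_pixels(simulated, quad_loss_grad)
```

Because only the pixels are optimized while the loss function is fixed, the simulated image gradually absorbs the target texture, which is exactly the transfer behavior the method relies on.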
  • in an embodiment, inputting the image sample and the simulated image into a preset style synthesis model, where the style synthesis model extracts style features and content features and performs style transfer and synthesis on the simulated image according to the style features and the content features to generate a synthesis result, includes:
• S301 Use the simulated image as the initial image; that is, when the image sample and the simulated image are input into the preset style synthesis model, the initial image is identical to the simulated image. The initial image contains a number of pixels, each pixel corresponds to one pixel value, and the pixel value is the value assigned to the pixel by measuring its color; the range of the pixel value can be set according to requirements.
  • S302 Extract the style feature of the image sample and the style feature of the initial image through the style synthesis model, and calculate a style loss value according to the style feature of the image sample and the style feature of the initial image.
  • the style synthesis model is a deep neural network model obtained by transfer learning
• the network structure of the style synthesis model is obtained by transfer learning, for example by transferring the network structure of the VGG19 model
• the style features are features such as wrinkles, background gray levels, spots, etc.; the style loss value is obtained by calculating the style feature of the image sample and the style feature of the initial image through a style loss function, yielding the style difference between the image sample and the initial image.
  • S303 Extract the content feature of the image sample and the content feature of the initial image through the style synthesis model, and calculate a content loss value based on the content feature of the image sample and the content feature of the initial image.
  • the content feature is a feature related to the first font label, the first typesetting label, the second font label, and the second typesetting label
• the content loss value is obtained by calculating the content feature of the image sample and the content feature of the initial image through a content loss function, yielding the content difference between the image sample and the initial image.
  • S304 Perform weighting processing on the style loss value and the content loss value to obtain a total loss value.
• the weighting process inputs the style loss value and the content loss value into a loss weighting function, and the total loss value is calculated by the loss weighting function: L = w1 × L1 + w2 × L2, where:
• L1 is the style loss value;
• L2 is the content loss value;
• w1 is the weight of the style loss value;
• w2 is the weight of the content loss value;
• L is the total loss value.
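The loss weighting function above can be sketched in a few lines. This is a minimal illustration; the default weight values below are arbitrary placeholders, not values specified in this disclosure:

```python
def total_loss(style_loss, content_loss, w1=1.0, w2=1.0):
    """Compute L = w1 * L1 + w2 * L2 (weighted sum of style and content losses)."""
    return w1 * style_loss + w2 * content_loss
```

Raising w1 relative to w2 biases the synthesis toward reproducing the texture of the image sample, while raising w2 preserves more of the simulated image's font and typesetting content.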
• S305 Perform gradient descent using the L-BFGS algorithm; when the total loss value does not reach the preset condition, iteratively update all the pixel values in the initial image until the total loss value reaches the preset condition, at which point the updated initial image is determined to be the composite image.
• the L-BFGS algorithm is a method for solving unconstrained nonlinear optimization problems, and the preset condition can be set according to requirements. For example, the preset condition can be set to the total loss value no longer decreasing: while the total loss value has not reached the preset condition (it is still decreasing), all the pixel values in the initial image are updated iteratively; once the total loss value reaches the preset condition (it no longer decreases), the updated initial image is determined to be the composite image.
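The iterative update of S301–S305 can be sketched as follows. This is a toy illustration only: plain gradient descent stands in for L-BFGS, per-pixel squared distances stand in for the real style and content losses, and the stopping rule is the "loss no longer decreases" preset condition described above; all names and default values are assumptions made for illustration.

```python
def transfer_style(pixels, style_target, content_target,
                   w1=0.5, w2=0.5, lr=0.1, tol=1e-6, max_iters=10000):
    """Iteratively update pixel values until the total loss no longer decreases."""
    pixels = list(pixels)
    prev_loss = float("inf")
    for _ in range(max_iters):
        # toy stand-ins: squared distance to the style / content targets per pixel
        loss = sum(w1 * (p - s) ** 2 + w2 * (p - c) ** 2
                   for p, s, c in zip(pixels, style_target, content_target))
        if prev_loss - loss < tol:  # preset condition: loss no longer decreases
            break
        prev_loss = loss
        # gradient of the total loss with respect to each pixel value
        grads = [2 * w1 * (p - s) + 2 * w2 * (p - c)
                 for p, s, c in zip(pixels, style_target, content_target)]
        pixels = [p - lr * g for p, g in zip(pixels, grads)]
    return pixels
```

With w1 = w2 = 0.5, each pixel settles halfway between its style and content targets, mirroring how the composite image balances the texture of the image sample against the content of the simulated image.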
  • S306 Associate the first texture style label with the synthesized image, and determine the synthesized image and the first texture style label as the synthesis result.
  • the synthesized image is associated with the first texture style tag, so that the synthesized image and the first texture style tag are determined as the synthesis result.
• the style of the image sample is transferred to the simulated image through the style synthesis model, automatically generating a synthesized image associated with the first texture style label of the image sample; the synthesized image is thus generated automatically from the image sample, which benefits subsequent model training.
• the OCR image sample is annotated with OCR image sample labels; that is, the second font label, the second typesetting label, the second annotation information, and the first texture style label are marked as the OCR image sample labels.
• This application obtains an image sample by receiving an image generation instruction; the image sample is associated with an image sample label, and the image sample label includes a first font label, a first typesetting label, and a first texture style label; the image sample is input into a preset font typesetting generation model, which performs text detection and text recognition on the image sample to obtain the first annotation information of the image sample recognized by the font typesetting generation model, and a simulation result is obtained that the font typesetting generation model generates by reconstruction according to the first font label, the first typesetting label, and the first annotation information; the simulation result includes a simulated image, a second font label, a second typesetting label, and second annotation information;
• the image sample and the simulated image are input into a preset style synthesis model; the style synthesis model extracts style features and content features, and performs style transfer and synthesis on the simulated image according to the style features and the content features to generate a synthesis result;
  • the synthesis result includes a synthesized image and a first texture style label;
• the second font label, the second typesetting label, the second annotation information, and the first texture style label are marked as the OCR image sample label; the composite image is recorded as the OCR image sample corresponding to the image sample, and the OCR image sample is associated with the OCR image sample label.
• this application thus obtains an image sample carrying the first font label, the first typesetting label, and the first texture style label; inputs it into the font typesetting generation model, which performs text detection and text recognition on the image sample to obtain the recognized first annotation information; reconstructs according to the first font label, the first typesetting label, and the first annotation information to generate a simulation result; and inputs the image sample and the simulated image into the style synthesis model, which performs style transfer and synthesis on the simulated image according to the extracted style features and content features to generate a synthesis result; the synthesis result includes the synthesized image and the first texture style label;
• the second font label, the second typesetting label, the second annotation information, and the first texture style label are marked as OCR image sample labels, the composite image is recorded as the OCR image sample corresponding to the image sample, and the OCR image sample is associated with the OCR image sample labels.
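The labeling step just described can be pictured as assembling one annotated record per composite image. The data layout below is hypothetical; the key names are illustrative, not part of the disclosure:

```python
def build_ocr_sample(composite_image, second_font_label, second_typesetting_label,
                     second_annotation_info, first_texture_style_label):
    """Assemble the synthesized image and its labels into one annotated OCR sample."""
    ocr_image_sample_label = {
        "font": second_font_label,
        "typesetting": second_typesetting_label,
        "annotation": second_annotation_info,
        "texture_style": first_texture_style_label,
    }
    # the OCR image sample is associated with its OCR image sample label
    return {"image": composite_image, "label": ocr_image_sample_label}
```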
• the present application thus realizes the automatic generation of OCR image samples with the same texture style as the image samples and the accurate annotation of the OCR image samples with OCR image sample labels, which reduces the labor cost and time of collecting image samples, allows OCR image samples in the required scene to be obtained quickly and in a targeted manner, improves the accuracy and reliability of subsequent model training, reduces the labor cost of labeling the OCR image sample labels, avoids errors caused by manual labeling, and improves labeling accuracy.
• the printed matter verification method provided in this application can be applied in the application environment shown in Fig. 1, in which the client (computer equipment) communicates with the server through the network.
  • the client includes, but is not limited to, various personal computers, notebook computers, smart phones, tablet computers, cameras, and portable wearable devices.
  • the server can be implemented as an independent server or a server cluster composed of multiple servers.
  • a printed matter verification method is provided, and the technical solution mainly includes the following steps S100-S600:
• S100 Receive a certificate verification instruction, and obtain the printed matter of the certificate to be verified and the verification information.
• the certificate verification instruction is a request triggered after the printed document to be verified and the verification information are selected and confirmed; the printed document to be verified is an image file of the document that needs verification, obtained after scanning or copying.
• the verification information is the verification carrier provided for the printed document to be verified; the verification information can be obtained from information related to the printed document entered by the user at the client, or obtained by querying a database for information related to the printed document, such as an entered bank card number or bank name.
• S200 Input the printed document to be verified into a trained document recognition model; the document recognition model is trained through the OCR image samples generated by the above-mentioned OCR image sample generation method.
• the printed document to be verified is input into the document recognition model.
• the document recognition model is a neural network model trained on the OCR image samples generated by the above-mentioned OCR image sample generation method together with the image samples.
• the OCR recognition reads the text of the printed document to be verified through OCR (Optical Character Recognition) technology, and the OCR recognition result includes the document-related text information in the printed document to be verified.
• S400 Compare the OCR recognition result with the verification information, and determine whether the printed document to be verified meets the verification information.
• the OCR recognition result is compared with the verification information to determine whether the printed document to be verified passes the verification.
• if the OCR recognition result corresponding to the printed document to be verified is consistent with the verification information, it is determined that the printed document to be verified passes the verification.
• if the OCR recognition result corresponding to the printed document to be verified is inconsistent with the verification information, it is determined that the printed document to be verified has failed the verification, and a prompt is displayed on the display interface; the display interface is the display interface of the terminal device corresponding to the customer.
• the content of the prompt can be set according to needs; for example, the content of the prompt is "The verification information is wrong, please re-enter the verification information!".
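Steps S400 and the pass/fail branches reduce to a simple comparison. A minimal sketch follows; the prompt text mirrors the example given above, and everything else (function name, return shape, test values) is an assumption:

```python
def check_verification(ocr_result, verification_info):
    """Compare the OCR recognition result with the verification information."""
    if ocr_result == verification_info:
        # consistent: the printed document to be verified passes the verification
        return True, "Verification passed."
    # inconsistent: verification fails and a prompt is shown on the display interface
    return False, ("The verification information is wrong, "
                   "please re-enter the verification information!")
```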
• before step S200, that is, before inputting the printed document to be verified into the trained document recognition model, the method includes:
• the certificate sample set includes a number of certificate samples, and each certificate sample is associated with a sample label; when the certificate sample is an image sample, the sample label is an image sample label; when the certificate sample is an OCR image sample, the sample label is an OCR image sample label; the OCR image sample is generated by the above-mentioned OCR image sample generation method; the number of image samples in the certificate sample set is less than the number of OCR image samples.
• the OCR image sample is an image generated from the image samples in the certificate sample set by the above-mentioned OCR image sample generation method, and has been annotated with an OCR image sample label by that method.
• one image sample can generate multiple OCR image samples through the OCR image sample generation method, so the number of image samples in the certificate sample set is less than the number of OCR image samples; this saves the time of collecting the certificate sample set and the manual time of labeling the certificate samples, while the OCR image sample labels can still be accurately annotated on the OCR image samples.
  • the initial parameters can be set according to requirements, for example, the initial parameters are randomly assigned parameter values, or the initial parameters are preset parameter values, and so on.
• S2003 Perform OCR recognition on the certificate sample through the deep learning OCR model, and obtain the training recognition result of the certificate sample output by the deep learning OCR model.
• the OCR recognition reads the printed text of the certificate sample through OCR (Optical Character Recognition) technology, and the training recognition result includes the document-related text information in the certificate sample.
• the training recognition result and the sample label are input into the loss function of the deep learning OCR model, and the loss value of the certificate sample is calculated by the loss function; the loss value indicates the difference between the training recognition result and the sample label, and a decreasing loss value indicates that the training recognition result is getting closer and closer to the sample label.
  • the loss value reaches the preset convergence condition, it indicates that the loss value has reached the optimal result, that is, the training recognition result is already very close to the sample label, and the deep learning The OCR model has converged, and the deep learning OCR model after convergence is recorded as a certificate recognition model completed by training.
  • the trained certificate recognition model obtained through continuous training can improve the accuracy and reliability of the OCR recognition result.
• after step S2004, that is, after matching the training recognition result with the sample label and obtaining the loss value of the certificate sample, the method further includes:
• the convergence condition may be that the loss value is small and no longer drops after 8000 calculations; that is, when the loss value is small and no longer drops after 8000 calculations, training is stopped, and the converged deep learning OCR model is recorded as the trained certificate recognition model;
• the convergence condition can also be that the loss value is less than a set threshold; that is, when the loss value is less than the set threshold, training is stopped, and the converged deep learning OCR model is recorded as the trained certificate recognition model.
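The two convergence conditions just described can be sketched as a single stopping check. This is a hedged illustration: the default patience of 8000 mirrors the "8000 calculations" example above, while the epsilon tolerance and function signature are assumptions:

```python
def has_converged(loss_history, threshold=None, patience=8000, eps=1e-8):
    """Return True when training should stop under either convergence condition."""
    if not loss_history:
        return False
    # condition 1: the loss value is less than a set threshold
    if threshold is not None and loss_history[-1] < threshold:
        return True
    # condition 2: the loss has not dropped over the last `patience` evaluations
    if len(loss_history) > patience:
        recent = loss_history[-patience:]
        if min(recent) >= loss_history[-patience - 1] - eps:
            return True
    return False
```

In a training loop, the loss value of each batch would be appended to `loss_history` and training would stop as soon as `has_converged` returns True.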
• through iteration, the initial parameters of the neural network model are continuously updated to move closer to the accurate recognition result, so that the accuracy of the recognition result becomes higher and higher.
• in this way, the present application inputs the printed document to be verified into the trained recognition model and outputs the recognition result, realizing rapid and accurate recognition and improving recognition efficiency and accuracy.
• an OCR image sample generating device is provided, and the OCR image sample generating device corresponds one-to-one to the OCR image sample generation method in the above-mentioned embodiment.
  • the OCR image sample generating device includes a receiving module 11, an input module 12, a synthesis module 13 and a generating module 14. The detailed description of each functional module is as follows:
  • the receiving module 11 is configured to receive an image generation instruction and obtain an image sample; the image sample is associated with an image sample label, and the image sample label includes a first font label, a first typesetting label, and a first texture style label;
  • the input module 12 is configured to input the image sample into a preset font typesetting generation model, and by performing text detection and character recognition on the image sample, the font typesetting generation model recognizes the first annotation of the image sample Information, and obtain a simulation result generated by reconstruction of the font typesetting generation model according to the first font label, the first typesetting label, and the first annotation information; the simulation result includes a simulated image and a second font Label, second typesetting label and second annotation information;
  • the synthesis module 13 is configured to input the image sample and the simulated image into a preset style synthesis model, the style synthesis model extracts style features and content characteristics, and the style synthesis model is based on the style features and the The content feature performs style transfer and synthesis on the simulated image to generate a synthesis result; the synthesis result includes the synthesized image and the first texture style label;
  • the generating module 14 is configured to mark the second font label, the second typesetting label, the second annotation information, and the first texture style label as OCR image sample labels, and at the same time record the composite image as An OCR image sample corresponding to the image sample, and associate the OCR image sample with the OCR image sample tag.
  • the input module 12 includes:
• the first extraction unit is configured to perform text detection on the image sample through the font typesetting generation model while extracting the text features of the image sample, and to obtain the area result of the image sample that the font typesetting generation model recognizes from the extracted text features; the area result includes a number of text areas containing text and the area coordinates associated with each of the text areas;
  • the second extraction unit is configured to extract the character features of each text area through the font typesetting generation model, and obtain the font typesetting generation model to recognize the extracted text features of each text area The text content of each of the text areas;
• the marking unit is configured to record the text area, the area coordinates associated with the text area, and the text content of the text area as the first information of the image sample, and to mark all the first information as the first annotation information.
  • the input module 12 further includes:
  • a reconstruction unit configured to perform combined reconstruction through a generator in the reconstruction model to obtain the simulated image, the second font label, the second typesetting label, and the second annotation information output by the reconstruction model;
  • the output unit is configured to record the simulated image, the second font label, the second typesetting label, and the second annotation information as a simulation result output by the font typesetting generation model.
  • the synthesis module 13 includes:
  • An acquiring unit configured to use the simulated image as an initial image and acquire all pixel values of the initial image
  • the first calculation unit is configured to extract the style feature of the image sample and the style feature of the initial image through the style synthesis model, and calculate it according to the style feature of the image sample and the style feature of the initial image Out the style loss value;
  • the second calculation unit is configured to extract the content feature of the image sample and the content feature of the initial image through the style synthesis model, and calculate it according to the content feature of the image sample and the content feature of the initial image The content loss value;
  • the training unit is configured to perform gradient descent using the L-BFGS algorithm, and when the total loss value does not reach a preset condition, iteratively update all the pixel values in the initial image until the total loss value reaches the When the conditions are preset, the updated initial image is determined to be a composite image;
  • the associating unit is configured to associate the first texture style label with the synthesized image, and determine the synthesized image and the first texture style label as the synthesis result.
  • Each module in the above-mentioned OCR image sample generating device can be implemented in whole or in part by software, hardware, and a combination thereof.
  • the above-mentioned modules may be embedded in the form of hardware or independent of the processor in the computer equipment, or may be stored in the memory of the computer equipment in the form of software, so that the processor can call and execute the operations corresponding to the above-mentioned modules.
  • a printed matter verification device is provided, and the printed matter verification device corresponds to the printed matter verification method in the above-mentioned embodiment one-to-one.
• the printed matter verification device includes an acquisition module 101, a training module 102, an identification module 103, a comparison module 104, a first determination module 105 and a second determination module 106.
  • the detailed description of each functional module is as follows:
  • the obtaining module 101 is configured to receive a certificate verification instruction, and obtain the printed form and verification information of the to-be-certified certificate;
• the training module 102 is configured to input the printed document to be verified into a document recognition model that has been trained; the document recognition model is trained by OCR image samples generated by the OCR image sample generation method according to any one of claims 1 to 4;
• the recognition module 103 is configured to perform OCR recognition on the printed document to be verified through the document recognition model, and to obtain the OCR recognition result output by the document recognition model; the OCR recognition result includes the document-related text information in the printed document to be verified;
• the comparison module 104 is configured to compare the OCR recognition result with the verification information, and to determine whether the printed document to be verified meets the verification information;
• the first determining module 105 is configured to determine that the verification is passed if the printed document to be verified meets the verification information;
• the second determining module 106 is configured to determine that the verification is not passed if the printed document to be verified does not meet the verification information, and to prompt on the display interface.
  • Each module in the above-mentioned printed matter verification device can be implemented in whole or in part by software, hardware, and a combination thereof.
  • the above-mentioned modules may be embedded in the form of hardware or independent of the processor in the computer equipment, or may be stored in the memory of the computer equipment in the form of software, so that the processor can call and execute the operations corresponding to the above-mentioned modules.
  • a computer device is provided.
  • the computer device may be a server, and its internal structure diagram may be as shown in FIG. 10.
  • the computer equipment includes a processor, a memory, a network interface, and a database connected through a system bus.
  • the processor of the computer device is used to provide calculation and control capabilities.
  • the memory of the computer device includes a non-volatile storage medium and an internal memory.
  • the non-volatile storage medium stores an operating system, a computer program, and a database.
  • the internal memory provides an environment for the operation of the operating system and computer programs in the non-volatile storage medium.
  • the network interface of the computer device is used to communicate with an external terminal through a network connection.
• the computer program is executed by the processor to realize an OCR image sample generation method or a printed matter verification method.
  • a computer device including a memory, a processor, and a computer program stored in the memory and capable of running on the processor.
  • the processor executes the computer program to implement the OCR image sample generation method in the above embodiment.
• alternatively, the processor implements the printed matter verification method in the foregoing embodiment when executing the computer program.
  • a computer-readable storage medium may be non-volatile or volatile, and a computer program is stored thereon.
• when the computer program is executed by a processor, the OCR image sample generation method in the foregoing embodiment or the printed matter verification method in the foregoing embodiment is implemented.
• the computer-readable storage medium may mainly include a storage program area and a storage data area, where the storage program area may store an operating system, an application program required by at least one function, etc., and the storage data area may store data created according to the use of the blockchain node, etc.
  • Non-volatile memory may include read only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory.
  • Volatile memory may include random access memory (RAM) or external cache memory.
• RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchronous link (Synchlink) DRAM (SLDRAM), direct Rambus RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM), etc.
  • the blockchain referred to in this application is a new application mode of computer technology such as distributed data storage, point-to-point transmission, consensus mechanism, and encryption algorithm.
• Blockchain is essentially a decentralized database: a chain of data blocks linked by cryptographic methods, in which each data block contains a batch of network transaction information used to verify the validity of the information (anti-counterfeiting) and to generate the next block.
  • the blockchain can include the underlying platform of the blockchain, the platform product service layer, and the application service layer.

Abstract

Disclosed are an OCR image sample generation method and apparatus, a print font verification method and apparatus, a device and a medium, which relate to artificial intelligence. The method comprises: receiving an image generation instruction, and acquiring an image sample; inputting the image sample into a preset font typesetting generation model; acquiring first annotation information by means of performing text detection and character recognition on the image sample, and obtaining a simulation result generated by means of the reconstruction of the font typesetting generation model; inputting the image sample and a simulated image into a preset style compositing model, such that the style compositing model extracts style features and content features, and generates a composite result; and acquiring an OCR image sample label and also recording a composite image as an OCR image sample corresponding to the image sample, and associating the OCR image sample with the OCR image sample label. By means of the method, an OCR image sample with the same texture style as an image sample is automatically generated, and automatic annotation with a sample label is realized.

Description

OCR图像样本生成、印刷体验证方法、装置、设备及介质OCR image sample generation, print verification methods, devices, equipment and media
本申请要求于2020年4月24日提交中国专利局、申请号为CN202010333257.4、名称为“OCR图像样本生成、印刷体验证方法、装置、设备及介质”的中国专利申请的优先权,其全部内容通过引用结合在本申请中。This application requires the priority of a Chinese patent application filed with the Chinese Patent Office with the application number CN202010333257.4 and titled "OCR image sample generation, printed matter verification method, device, equipment and medium" on April 24, 2020, which The entire content is incorporated into this application by reference.
技术领域Technical field
本申请涉及人工智能的数据建模领域,尤其涉及一种OCR图像样本生成、印刷体验证方法、装置、计算机设备及存储介质。This application relates to the field of artificial intelligence data modeling, and in particular to an OCR image sample generation and printed verification method, device, computer equipment and storage medium.
背景技术Background technique
目前,随着社会的科技发展,以及数字时代不断庞大,基于OCR技术进行文本识别的应用得到了广泛应用;OCR(Optical Character Recognition),中文叫做光学字符识别,为利用光学技术和计算机技术把印在或写在纸上的文字读取出来,并转换成一种计算机能够接受和可以理解的格式。在现有技术中,越来越多的应用场景(比如:涉及金融、保险、智慧安防的应用场景)都需要识别证件印刷体中的文本信息进行验证,发明人意识到由于针对特定的场景就需要非常庞大的证件样本(还需对其进行人工标注样本标签)进行训练神经网络,而通常只能获得数量极少的证件样本,很难获取到如此庞大的证件样本,而且由于其独特的干扰(如在印刷过程中字体很可能变得断裂或者墨水粘连)使得进行OCR识别依然困难,导致训练后的证件识别模型的准确率和精度不高,从而导致验证出错率高,需要人工重新验证,大大浪费成本,效率低。At present, with the development of science and technology in society and the growing digital age, the application of text recognition based on OCR technology has been widely used; OCR (Optical Character Recognition), Chinese called optical character recognition, is used to make use of optical technology and computer technology. The text written on or on paper is read and converted into a format that the computer can accept and understand. In the prior art, more and more application scenarios (such as: application scenarios involving finance, insurance, and smart security) need to identify text information in printed documents for verification. A very large number of document samples (and manual labeling of sample labels) are needed to train the neural network, and usually only a very small number of document samples can be obtained. It is difficult to obtain such a large document sample, and due to its unique interference (For example, in the printing process, the font is likely to become broken or ink sticking), making OCR recognition still difficult, resulting in low accuracy and precision of the trained document recognition model, resulting in a high verification error rate, requiring manual re-verification. Great waste of cost and low efficiency.
发明内容Summary of the invention
本申请提供一种OCR图像样本生成、印刷体验证方法、装置、计算机设备及存储介质,实现了自动生成与图像样本一样的纹理风格的OCR图像样本,并自动标注OCR图像样本标签,减少了人工成本和时间,以及能够提升OCR识别结果和印刷体验证的准确率和可靠性。This application provides an OCR image sample generation, print verification method, device, computer equipment and storage medium, which realizes the automatic generation of OCR image samples with the same texture style as the image samples, and automatically annotates the OCR image sample tags, reducing labor Cost and time, and can improve the accuracy and reliability of OCR recognition results and print verification.
An OCR image sample generation method, comprising:
receiving an image generation instruction and acquiring an image sample, wherein the image sample is associated with an image sample label, and the image sample label comprises a first font label, a first typesetting label, and a first texture style label;
inputting the image sample into a preset font typesetting generation model, performing text detection and character recognition on the image sample to obtain first annotation information of the image sample recognized by the font typesetting generation model, and obtaining a simulation result generated by the font typesetting generation model through reconstruction according to the first font label, the first typesetting label, and the first annotation information, wherein the simulation result comprises a simulated image, a second font label, a second typesetting label, and second annotation information;
inputting the image sample and the simulated image into a preset style synthesis model, wherein the style synthesis model extracts a style feature and a content feature and performs style transfer and synthesis on the simulated image according to the style feature and the content feature to generate a synthesis result, the synthesis result comprising a synthesized image and the first texture style label; and
marking the second font label, the second typesetting label, the second annotation information, and the first texture style label as an OCR image sample label, recording the synthesized image as an OCR image sample corresponding to the image sample, and associating the OCR image sample with the OCR image sample label.
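The four claimed steps can be sketched as a single pipeline. The model objects, their method names (`annotate`, `reconstruct`, `synthesize`), and the demo stand-ins below are hypothetical placeholders for the font typesetting generation model and the style synthesis model; they only illustrate the claimed data flow, not any actual implementation.

```python
def generate_ocr_sample(image, sample_label, typeset_model, style_model):
    """Hedged sketch of the four claimed steps; model APIs are hypothetical."""
    # Text detection + character recognition -> first annotation information,
    # then reconstruction -> simulated image and second labels/annotation.
    annotation1 = typeset_model.annotate(image)
    simulated, font2, typeset2, annotation2 = typeset_model.reconstruct(
        sample_label["font"], sample_label["typesetting"], annotation1)
    # Style transfer and synthesis onto the simulated image.
    synthesized = style_model.synthesize(image, simulated)
    # Bundle the OCR image sample with its automatically produced label.
    ocr_label = {"font": font2, "typesetting": typeset2,
                 "annotation": annotation2,
                 "texture": sample_label["texture"]}
    return synthesized, ocr_label


class DemoTypesetModel:
    """Trivial stand-in for the font typesetting generation model."""
    def annotate(self, image):
        return [{"content": "demo text"}]

    def reconstruct(self, font, typesetting, annotation):
        return "simulated-image", font, typesetting, annotation


class DemoStyleModel:
    """Trivial stand-in for the style synthesis model."""
    def synthesize(self, image, simulated):
        return "styled:" + simulated


sample, label = generate_ocr_sample(
    "scanned-document",
    {"font": "SimSun 5", "typesetting": "one column", "texture": "wrinkled"},
    DemoTypesetModel(), DemoStyleModel())
```

The returned pair corresponds to the OCR image sample and its associated OCR image sample label.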
A print verification method, comprising:
receiving a document verification instruction, and acquiring a printed document to be verified and verification information;
inputting the printed document to be verified into a trained document recognition model, wherein the document recognition model is trained with OCR image samples generated by the above OCR image sample generation method;
performing OCR recognition on the printed document to be verified through the document recognition model to obtain an OCR recognition result output by the document recognition model, wherein the OCR recognition result contains the document-related text information in the printed document to be verified;
comparing the OCR recognition result with the verification information to determine whether the printed document to be verified conforms to the verification information;
if the printed document to be verified conforms to the verification information, determining that the verification is passed; and
if the printed document to be verified does not conform to the verification information, determining that the verification fails, and displaying a prompt on a display interface.
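A minimal sketch of the comparison step, assuming both the OCR recognition result and the verification information are reduced to field/value dictionaries; the field names and values are illustrative, not from the application:

```python
def verify_document(ocr_result: dict, verification_info: dict) -> bool:
    """Pass only if every expected field matches the recognized text."""
    for field, expected in verification_info.items():
        if ocr_result.get(field) != expected:
            return False
    return True


# Example: a recognized document compared against submitted data.
recognized = {"name": "Zhang San", "id": "110101199001011234"}
assert verify_document(recognized, {"name": "Zhang San"})
assert not verify_document(recognized, {"name": "Li Si"})
```

In a real system the failing branch would additionally raise the prompt on the display interface.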
An OCR image sample generation apparatus, comprising:
a receiving module, configured to receive an image generation instruction and acquire an image sample, wherein the image sample is associated with an image sample label, and the image sample label comprises a first font label, a first typesetting label, and a first texture style label;
an input module, configured to input the image sample into a preset font typesetting generation model, perform text detection and character recognition on the image sample to obtain first annotation information of the image sample recognized by the font typesetting generation model, and obtain a simulation result generated by the font typesetting generation model through reconstruction according to the first font label, the first typesetting label, and the first annotation information, wherein the simulation result comprises a simulated image, a second font label, a second typesetting label, and second annotation information;
a synthesis module, configured to input the image sample and the simulated image into a preset style synthesis model, wherein the style synthesis model extracts a style feature and a content feature and performs style transfer and synthesis on the simulated image according to the style feature and the content feature to generate a synthesis result, the synthesis result comprising a synthesized image and the first texture style label; and
a generation module, configured to mark the second font label, the second typesetting label, the second annotation information, and the first texture style label as an OCR image sample label, record the synthesized image as an OCR image sample corresponding to the image sample, and associate the OCR image sample with the OCR image sample label.
A print verification apparatus, comprising:
an acquisition module, configured to receive a document verification instruction and acquire a printed document to be verified and verification information;
a training module, configured to input the printed document to be verified into a trained document recognition model, wherein the document recognition model is trained with OCR image samples generated by the above OCR image sample generation method;
a recognition module, configured to perform OCR recognition on the printed document to be verified through the document recognition model to obtain an OCR recognition result output by the document recognition model, wherein the OCR recognition result contains the document-related text information in the printed document to be verified;
a comparison module, configured to compare the OCR recognition result with the verification information to determine whether the printed document to be verified conforms to the verification information;
a first determination module, configured to determine that the verification is passed if the printed document to be verified conforms to the verification information; and
a second determination module, configured to determine that the verification fails if the printed document to be verified does not conform to the verification information, and display a prompt on a display interface.
A computer device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the following steps:
receiving an image generation instruction and acquiring an image sample, wherein the image sample is associated with an image sample label, and the image sample label comprises a first font label, a first typesetting label, and a first texture style label;
inputting the image sample into a preset font typesetting generation model, performing text detection and character recognition on the image sample to obtain first annotation information of the image sample recognized by the font typesetting generation model, and obtaining a simulation result generated by the font typesetting generation model through reconstruction according to the first font label, the first typesetting label, and the first annotation information, wherein the simulation result comprises a simulated image, a second font label, a second typesetting label, and second annotation information;
inputting the image sample and the simulated image into a preset style synthesis model, wherein the style synthesis model extracts a style feature and a content feature and performs style transfer and synthesis on the simulated image according to the style feature and the content feature to generate a synthesis result, the synthesis result comprising a synthesized image and the first texture style label; and
marking the second font label, the second typesetting label, the second annotation information, and the first texture style label as an OCR image sample label, recording the synthesized image as an OCR image sample corresponding to the image sample, and associating the OCR image sample with the OCR image sample label.
A computer device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the following steps:
receiving a document verification instruction, and acquiring a printed document to be verified and verification information;
inputting the printed document to be verified into a trained document recognition model, wherein the document recognition model is trained with OCR image samples generated by the above OCR image sample generation method;
performing OCR recognition on the printed document to be verified through the document recognition model to obtain an OCR recognition result output by the document recognition model, wherein the OCR recognition result contains the document-related text information in the printed document to be verified;
comparing the OCR recognition result with the verification information to determine whether the printed document to be verified conforms to the verification information;
if the printed document to be verified conforms to the verification information, determining that the verification is passed; and
if the printed document to be verified does not conform to the verification information, determining that the verification fails, and displaying a prompt on a display interface.
A computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the following steps:
receiving an image generation instruction and acquiring an image sample, wherein the image sample is associated with an image sample label, and the image sample label comprises a first font label, a first typesetting label, and a first texture style label;
inputting the image sample into a preset font typesetting generation model, performing text detection and character recognition on the image sample to obtain first annotation information of the image sample recognized by the font typesetting generation model, and obtaining a simulation result generated by the font typesetting generation model through reconstruction according to the first font label, the first typesetting label, and the first annotation information, wherein the simulation result comprises a simulated image, a second font label, a second typesetting label, and second annotation information;
inputting the image sample and the simulated image into a preset style synthesis model, wherein the style synthesis model extracts a style feature and a content feature and performs style transfer and synthesis on the simulated image according to the style feature and the content feature to generate a synthesis result, the synthesis result comprising a synthesized image and the first texture style label; and
marking the second font label, the second typesetting label, the second annotation information, and the first texture style label as an OCR image sample label, recording the synthesized image as an OCR image sample corresponding to the image sample, and associating the OCR image sample with the OCR image sample label.
A computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the following steps:
receiving a document verification instruction, and acquiring a printed document to be verified and verification information;
inputting the printed document to be verified into a trained document recognition model, wherein the document recognition model is trained with OCR image samples generated by the above OCR image sample generation method;
performing OCR recognition on the printed document to be verified through the document recognition model to obtain an OCR recognition result output by the document recognition model, wherein the OCR recognition result contains the document-related text information in the printed document to be verified;
comparing the OCR recognition result with the verification information to determine whether the printed document to be verified conforms to the verification information;
if the printed document to be verified conforms to the verification information, determining that the verification is passed; and
if the printed document to be verified does not conform to the verification information, determining that the verification fails, and displaying a prompt on a display interface.
According to the OCR image sample generation method, apparatus, computer device, and storage medium provided by the present application, an image sample carrying a first font label, a first typesetting label, and a first texture style label is acquired; the image sample is input into a font typesetting generation model, which performs text detection and character recognition on the image sample to obtain recognized first annotation information and performs reconstruction according to the first font label, the first typesetting label, and the first annotation information to generate a simulation result; the image sample and the simulated image are input into a style synthesis model, which performs style transfer and synthesis on the simulated image according to the extracted style feature and content feature to generate a synthesis result comprising a synthesized image and the first texture style label; and the second font label, the second typesetting label, the second annotation information, and the first texture style label are marked as an OCR image sample label, the synthesized image is recorded as an OCR image sample corresponding to the image sample, and the OCR image sample is associated with the OCR image sample label. The present application thus automatically generates OCR image samples with the same texture style as the original image samples and accurately annotates them with OCR image sample labels, which reduces the labor cost and time of collecting image samples, makes it possible to quickly obtain targeted OCR image samples for the required scenario, improves the accuracy and reliability of subsequent model training, reduces the labor cost of labeling the OCR image samples, avoids errors introduced by manual labeling, and improves labeling accuracy.
According to the print verification method, apparatus, computer device, and storage medium provided by the present application, a document verification instruction is received, and a printed document to be verified and verification information are acquired; the printed document to be verified is input into a document recognition model trained with OCR image samples generated by the above OCR image sample generation method; OCR recognition is performed on the printed document to be verified through the document recognition model to obtain the OCR recognition result output by the model; the OCR recognition result is compared with the verification information to determine whether the printed document to be verified conforms to the verification information; if it conforms, the verification is determined to be passed; if it does not conform, the verification is determined to have failed, and a prompt is displayed on a display interface. The present application thus uses the OCR image samples generated by the OCR image sample generation method to train a document recognition model for a specific scenario and performs automatic recognition and automatic verification through that model, realizing automatic, highly targeted verification of printed documents for specific scenarios, improving recognition accuracy, efficiency, and reliability, improving user experience, and saving labor cost.
Brief Description of the Drawings
To describe the technical solutions in the embodiments of the present application more clearly, the following briefly introduces the accompanying drawings required for describing the embodiments. Apparently, the accompanying drawings in the following description show only some embodiments of the present application, and a person of ordinary skill in the art may still derive other drawings from these accompanying drawings without creative effort.
FIG. 1 is a schematic diagram of an application environment of the OCR image sample generation method or the print verification method in an embodiment of the present application;
FIG. 2 is a flowchart of the OCR image sample generation method in an embodiment of the present application;
FIG. 3 is a flowchart of step S20 of the OCR image sample generation method in an embodiment of the present application;
FIG. 4 is a flowchart of step S20 of the OCR image sample generation method in another embodiment of the present application;
FIG. 5 is a flowchart of step S30 of the OCR image sample generation method in an embodiment of the present application;
FIG. 6 is a flowchart of the print verification method in an embodiment of the present application;
FIG. 7 is a flowchart of step S200 of the print verification method in an embodiment of the present application;
FIG. 8 is a functional block diagram of the OCR image sample generation apparatus in an embodiment of the present application;
FIG. 9 is a functional block diagram of the print verification apparatus in an embodiment of the present application;
FIG. 10 is a schematic diagram of a computer device in an embodiment of the present application.
Detailed Description of the Embodiments
The technical solutions in the embodiments of the present application are described below clearly and completely with reference to the accompanying drawings in the embodiments of the present application. Apparently, the described embodiments are only some rather than all of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present application without creative effort shall fall within the protection scope of the present application.
The OCR image sample generation method provided by the present application can be applied in an application environment as shown in FIG. 1, in which a client (computer device) communicates with a server through a network. The client (computer device) includes, but is not limited to, personal computers, notebook computers, smartphones, tablet computers, cameras, and portable wearable devices. The server can be implemented as an independent server or as a server cluster composed of multiple servers.
In an embodiment, as shown in FIG. 2, an OCR image sample generation method is provided, and its technical solution mainly includes the following steps S10-S40:
S10: receiving an image generation instruction and acquiring an image sample, wherein the image sample is associated with an image sample label, and the image sample label comprises a first font label, a first typesetting label, and a first texture style label.
Understandably, the image generation instruction is a request triggered after the image sample to be generated is selected and confirmed. The triggering manner can be set as required; for example, the application platform interface may provide a trigger button activated by clicking, sliding, or the like. The manner of acquiring the image sample can likewise be set as required; for example, the image sample may be carried directly in the image generation instruction, or acquired according to a storage path of the image sample contained in the image generation instruction.
The image sample is an image that contains printed text and is related to a document, that is, an image file obtained after a document is copied; it may be a photo file or a scanned image file. The image sample is associated with the image sample label, which is assigned by annotating the content of the image sample, and the image sample label includes a first font label, a first typesetting label, and a first texture style label. The first font label records information such as the font style and font size of the text content in the image sample, for example SimSun size 5 or LiShu small 4. The first typesetting label records the layout of the text content in the image sample, for example a single-column arrangement, a two-column equal-width arrangement, or a two-column unequal-width arrangement. The first texture style label records the texture introduced when the image sample was produced by scanning, copying, or similar operations, such as wrinkles or deformation.
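As a rough illustration, the three parts of an image sample label can be held in a simple record; the field names and example values below are hypothetical, not defined by the application:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ImageSampleLabel:
    font: str         # first font label, e.g. "SimSun, size 5"
    typesetting: str  # first typesetting label, e.g. "two-column, equal width"
    texture: str      # first texture style label, e.g. "wrinkled photocopy"


label = ImageSampleLabel(
    font="SimSun, size 5",
    typesetting="two-column, equal width",
    texture="wrinkled photocopy")
```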
S20: inputting the image sample into a preset font typesetting generation model, performing text detection and character recognition on the image sample to obtain first annotation information of the image sample recognized by the font typesetting generation model, and obtaining a simulation result generated by the font typesetting generation model through reconstruction according to the first font label, the first typesetting label, and the first annotation information, wherein the simulation result comprises a simulated image, a second font label, a second typesetting label, and second annotation information.
Understandably, the font typesetting generation model performs text detection and character recognition on the input image sample to obtain the first annotation information related to the text in the image sample. Text detection scans the image sample to detect the text regions containing text and the region coordinates of each text region; the scanning method can be set as required, for example by extracting text features to determine the text regions, or by identifying multiple character boundaries through edge detection. A text region is a quadrilateral area containing text content, and its region coordinates are the coordinates of the region's four points in the image sample, so the region coordinates contain four coordinate values (each coordinate value comprising an abscissa and an ordinate). Character recognition identifies the text content of each text region in the image sample. Each text region, the region coordinates associated with it, and its text content are recorded as one piece of first information of the image sample, and all the first information is marked as the first annotation information.
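The "first information" records described above can be sketched as follows, storing each quadrilateral text region as its four (x, y) corner points together with the recognized content; all names here are illustrative, not from the application:

```python
def make_first_annotation(detections):
    """Build first annotation information from (corners, text) pairs,
    where corners are the four (x, y) points of a quadrilateral region."""
    first_information = []
    for corners, content in detections:
        if len(corners) != 4:
            raise ValueError("a text region is bounded by four points")
        first_information.append({"region": tuple(corners),
                                  "content": content})
    return first_information


annotation = make_first_annotation([
    ([(12, 8), (208, 8), (208, 30), (12, 30)], "Name: Zhang San"),
    ([(12, 40), (240, 40), (240, 62), (12, 62)], "No. 110101199001011234"),
])
```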
The font typesetting generation model further includes a reconstruction model. The reconstruction model performs reconstruction according to the first font label, the first typesetting label, and the first annotation information to generate a simulation result containing a simulated image, a second font label, a second typesetting label, and second annotation information. The reconstruction model may be any neural network model set as required; preferably, it is a neural network model trained on the basis of a GAN (Generative Adversarial Network). The simulated image is an image file containing text content corresponding to the second font label, the second typesetting label, and the second annotation information. The second font label may be identical to the first font label, or may be the font label with the highest approximation to the first font label, where a font label denotes one of all the font styles together with its font size, such as SimSun size 5 or KaiTi size 6. The highest approximation is, among all font labels other than the first font label itself, the highest value of the font style approximation to the first font label plus the font size approximation to the first font label. The font style approximation is a metric of the difference between each font style and a given font style: the more similar two font styles are, the larger the metric value, and the range of the metric may be set to 0 to 100. For example, if the font style of the first font label is "SimSun", the style most similar to "SimSun" is "FangSong" (the similarity may be obtained by encoding the font style of the first font label with the GAN-based reconstruction model and decoding the resulting code, or defined by preset rules on the differences between printed and calligraphic typefaces), and its metric value is 100. The font size approximation is a metric of the difference between each font size and a given font size: the closer two font sizes are, the larger the metric value. For example, if the font size of the first font label is "size 5", then "11 pt" is the size most similar to "size 5" (again obtainable by encoding the first font label with the GAN-based reconstruction model and decoding the resulting code, or by taking the smallest difference from "size 5", where a negative difference is replaced by its absolute value plus one), and its metric value is 100. The second typesetting label may be identical to the first typesetting label, or may be the typesetting label with the highest approximation to the first typesetting label, the approximation being a layout value measuring the difference between two typesetting labels: the closer two typesetting labels are, the larger the layout value. For example, if the first typesetting label is a two-column equal-width arrangement, the layout value between a three-column equal-width arrangement and the two-column equal-width arrangement is 1, and the layout value between a two-column unequal-width arrangement and the two-column equal-width arrangement is 0 (the closest). The second annotation information is generated by encoding the first annotation information with the reconstruction model, introducing random noise into the encoded first annotation information, and decoding it, so as to produce information close to the first annotation information; for example, if the first annotation information is "I like juice" with its coordinates, the second annotation information may be "I love fruit" with corresponding coordinates.
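A toy version of the "highest approximation" selection for the second font label, assuming precomputed 0-100 similarity tables for font styles and font sizes; the tables, candidate labels, and values are made-up illustrations, not taken from the application:

```python
# Hypothetical symmetric similarity tables (0 = dissimilar, 100 = closest).
STYLE_SIM = {("SimSun", "FangSong"): 100, ("SimSun", "KaiTi"): 60}
SIZE_SIM = {("size 5", "11pt"): 100, ("size 5", "size 6"): 70}


def similarity(table, a, b):
    """Look up a symmetric similarity value, defaulting to 0."""
    return table.get((a, b), table.get((b, a), 0))


def second_font_label(first_style, first_size, candidates):
    """Pick the candidate (style, size) whose combined style + size
    approximation to the first font label is highest."""
    return max(
        (c for c in candidates if c != (first_style, first_size)),
        key=lambda c: similarity(STYLE_SIM, first_style, c[0])
                      + similarity(SIZE_SIM, first_size, c[1]))


best = second_font_label("SimSun", "size 5",
                         [("FangSong", "11pt"), ("KaiTi", "size 6")])
# best == ("FangSong", "11pt")
```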
In one embodiment, as shown in FIG. 3, step S20, namely inputting the image sample into the preset font-typesetting generation model and obtaining, through text detection and character recognition on the image sample, the first annotation information that the font-typesetting generation model recognizes from the image sample, includes:
S201: Perform text detection on the image sample through the font-typesetting generation model while extracting the text features of the image sample, and obtain the region result that the font-typesetting generation model recognizes from the extracted text features; the region result includes several text regions containing text and the region coordinates associated with each text region.
Understandably, the font-typesetting generation model extracts the text features of the image sample, the text features being features such as Chinese words and phrases or alphabetic words. Based on these text features, the model identifies several text regions containing text content in the image sample, together with the region coordinates associated with each text region. A text region is a quadrilateral area containing text content, and its region coordinates are the coordinates of the four corner points of the text region within the image sample; the region coordinates therefore comprise four coordinate values, each consisting of an abscissa and an ordinate. All text regions and their associated region coordinates are determined as the region result.
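The region result can be represented as a simple data structure; the type and field names below are illustrative, not taken from the application:

```python
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[int, int]  # (abscissa, ordinate) in image-sample coordinates

@dataclass
class TextRegion:
    # Four corner points of the quadrilateral text region.
    corners: Tuple[Point, Point, Point, Point]
    text: str = ""  # filled in later by character recognition (S202)

# A region result is all detected regions with their associated coordinates.
region_result: List[TextRegion] = [
    TextRegion(corners=((10, 20), (200, 20), (200, 60), (10, 60))),
]
```

Step S203 would then pair each region with its recognized text to form one piece of first information.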
S202: Extract the character features of each text region through the font-typesetting generation model, and obtain the text content of each text region that the font-typesetting generation model recognizes from the extracted character features.
Understandably, the font-typesetting generation model extracts the character features of each text region, the character features being features such as Chinese characters and letters. Based on the extracted character features of each text region, the model outputs the text content of that region, and all the text content is obtained.
S203: Record each text region, the region coordinates associated with it, and its text content as first information of the image sample, and mark all the first information as the first annotation information.
Understandably, each text region has its own one-to-one associated region coordinates and one-to-one associated text content. A text region, its associated region coordinates, and its associated text content are together marked as one piece of first information of the image sample; the image sample may contain several pieces of first information, and all the first information is then marked as the first annotation information corresponding to the image sample.
In this way, by performing text detection and character recognition on the image sample through the font-typesetting generation model, the first annotation information in the image sample can be obtained accurately.
In one embodiment, as shown in FIG. 4, step S20, namely obtaining the simulation result that the font-typesetting generation model generates by reconstruction from the first font label, the first typesetting label, and the first annotation information, includes:
S204: Input the first font label, the first typesetting label, and the first annotation information into a reconstruction model within the font-typesetting generation model; the reconstruction model has been trained on the basis of a GAN model.
Understandably, the GAN (Generative Adversarial Networks) model is a deep neural network model of the generative adversarial type, and the reconstruction model is a deep neural network model whose training, based on the GAN model, has been completed. The first font label, the first typesetting label, and the first annotation information are input into the reconstruction model.
S205: Perform combination and reconstruction through the generator in the reconstruction model, and obtain the simulated image, a second font label, a second typesetting label, and second annotation information output by the reconstruction model.
Understandably, the first font label, the first typesetting label, and the first annotation information are combined into a transition image of the same size as the image sample; the transition image preferably has a blank background, and its content includes the first font label, the first typesetting label, and the first annotation information. The reconstruction model includes the generator, whose main task is to learn the transition image by encoding it and then decoding the resulting code, thereby generating an image file very close to the transition image. The reconstruction model also includes a discriminator, whose main task is to distinguish the image files generated by the generator from the transition image, that is, to judge real from fake. During the iterative training of the reconstruction model, the generator continually strives to make its generated image files closer and closer to the transition image, while the discriminator continually strives to identify whether a given image file is real or fake. Through this game between the generator and the discriminator, after repeated iterations, the two eventually reach an equilibrium: the generator produces a simulated image close to the transition image, and the discriminator can hardly tell the simulated image apart from the image sample. In this way, the reconstruction model reconstructs the simulated image.
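The generator-discriminator game described above can be made concrete with the standard GAN objectives. This is a generic sketch of those losses (following the original GAN formulation), not the application's actual network:

```python
import math

def discriminator_loss(d_real: float, d_fake: float) -> float:
    """Discriminator maximizes log D(x) + log(1 - D(G(z))); we minimize the negative.
    d_real / d_fake are the discriminator's outputs on a real and a generated image."""
    return -(math.log(d_real) + math.log(1.0 - d_fake))

def generator_loss(d_fake: float) -> float:
    """Non-saturating generator objective: maximize log D(G(z))."""
    return -math.log(d_fake)

# As the generator improves, the discriminator's output on generated images rises
# toward 0.5 (equilibrium), and the generator's loss falls accordingly.
early, late = generator_loss(0.1), generator_loss(0.5)
```

At the equilibrium output of 0.5 on both real and fake inputs, the discriminator can no longer tell the reconstructed image apart from the transition image, matching the description above.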
The simulated image is an image file containing text content corresponding to the second font label, the second typesetting label, and the second annotation information. Preferably, the size of the simulated image may be the same as that of the image sample. The second font label is close to the first font label, for example, the first font label is SimSun size 5 and the second font label is FangSong 11 pt. The second typesetting label is close to the first typesetting label, for example, the first typesetting label is a two-column equal-width arrangement and the second typesetting label is a two-column unequal-width arrangement. The second annotation information is generated by encoding the first annotation information through the reconstruction model, introducing random noise into the encoded first annotation information, and decoding it, so as to generate information close to the first annotation information; for example, the first annotation information is "I like juice" together with its coordinates, and the second annotation information is "I like fruits" together with the corresponding coordinates.
S206: Record the simulated image, the second font label, the second typesetting label, and the second annotation information as the simulation result output by the font-typesetting generation model.
Understandably, the simulation result includes the simulated image, the second font label, the second typesetting label, and the second annotation information, and these four items are associated with one another.
In this way, through the reconstruction model trained on the basis of the GAN model within the font-typesetting generation model, a simulation result containing the simulated image, the second font label, the second typesetting label, and the second annotation information is generated by reconstruction. This automatically generates a simulated image associated with the second font label, the second typesetting label, and the second annotation information, that is, the simulated image is annotated automatically, so that a simulated image related to the first font label and the first typesetting label of the image sample is generated automatically.
S30: Input the image sample and the simulated image into a preset style synthesis model; the style synthesis model extracts style features and content features and, based on the style features and the content features, performs style transfer on the simulated image and synthesizes it to generate a synthesis result; the synthesis result includes a synthesized image and the first texture style label.
Understandably, the image sample and the simulated image are input into the style synthesis model, which extracts style features and content features. The style features are features such as wrinkles, background gray levels, and stripes, and the content features are features related to the first font label, the first typesetting label, the second font label, and the second typesetting label. Style transfer is performed on the simulated image based on the style features and the content features: the simulated image is taken as an initial image, all pixel values of the initial image are obtained, and a total loss value is computed from the style features and the content features. All pixel values are then updated iteratively by gradient descent until the total loss value reaches a preset condition; the preset condition can be set as required, for example, it can be set as the total loss value no longer decreasing. The updated initial image is determined as the synthesized image, which is the optimal image obtained by transferring the texture information of the image sample onto the simulated image; the texture information is the information formed when a document is converted into an image file during operations such as scanning or copying. The texture information of the synthesized image is therefore consistent with the first texture style label, so the synthesized image is associated with the first texture style label, and the synthesized image and the first texture style label are determined as the synthesis result.
In one embodiment, as shown in FIG. 5, step S30, namely inputting the image sample and the simulated image into the preset style synthesis model, which extracts style features and content features and, based on them, performs style transfer on the simulated image and synthesizes it to generate the synthesis result, includes:
S301: Take the simulated image as an initial image, and obtain all pixel values of the initial image.
Understandably, the simulated image is taken as the initial image; that is, when the image sample and the simulated image are input into the preset style synthesis model, the initial image is identical to the simulated image. The initial image contains a number of pixels, each of which corresponds to one pixel value; a pixel value is the value assigned to a pixel by measuring its color, and the range of pixel values can be set as required.
S302: Extract the style features of the image sample and the style features of the initial image through the style synthesis model, and compute a style loss value from the style features of the image sample and the style features of the initial image.
Understandably, the style synthesis model is a deep neural network model obtained through transfer learning; its network structure is obtained by transfer learning, for example by transferring the network structure of the VGG19 model. The style features are features such as wrinkles, background gray levels, and stripes, and the style loss value is obtained by applying a style loss function to the style features of the image sample and the style features of the initial image, measuring the style difference between the image sample and the initial image.
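The application does not disclose the exact style loss function; a common choice in VGG-based style transfer is to compare Gram matrices of feature maps. The sketch below applies that idea to toy feature vectors and is an assumption, not the application's formula:

```python
def gram(features):
    """Gram matrix of a list of channel activation vectors: G[i][j] = <f_i, f_j>."""
    return [[sum(a * b for a, b in zip(fi, fj)) for fj in features] for fi in features]

def style_loss(sample_features, image_features):
    """Mean squared difference between the two Gram matrices: the style gap."""
    gs, gi = gram(sample_features), gram(image_features)
    n = len(gs)
    return sum((gs[i][j] - gi[i][j]) ** 2 for i in range(n) for j in range(n)) / (n * n)

# Identical feature statistics give zero style loss; differing textures give a positive loss.
zero_gap = style_loss([[1.0, 2.0], [0.5, 1.5]], [[1.0, 2.0], [0.5, 1.5]])
```

In a real pipeline the feature vectors would be VGG19 activations of the image sample and the initial image rather than hand-written lists.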
S303: Extract the content features of the image sample and the content features of the initial image through the style synthesis model, and compute a content loss value from the content features of the image sample and the content features of the initial image.
Understandably, the content features are features related to the first font label, the first typesetting label, the second font label, and the second typesetting label, and the content loss value is obtained by applying a content loss function to the content features of the image sample and the content features of the initial image, measuring the content difference between the image sample and the initial image.
S304: Weight the style loss value and the content loss value to obtain a total loss value.
Understandably, the weighting process inputs the style loss value and the content loss value into a loss weighting function, through which the total loss value is computed. The loss weighting function is:
L = w1 × L1 + w2 × L2

where:
L1 is the style loss value;
L2 is the content loss value;
w1 is the loss-function weight of the style loss value;
w2 is the loss-function weight of the content loss value;
L is the total loss value.
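The loss weighting function above translates directly into code; the default weight values are illustrative, since the application leaves them unspecified:

```python
def total_loss(style_loss_value: float, content_loss_value: float,
               w1: float = 0.5, w2: float = 0.5) -> float:
    """L = w1 * L1 + w2 * L2, with L1 the style loss and L2 the content loss.
    The default weights are placeholders; they would be tuned in practice."""
    return w1 * style_loss_value + w2 * content_loss_value
```

Raising w1 relative to w2 pushes the synthesized image toward the sample's texture at the expense of content fidelity, and vice versa.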
S305: Perform gradient descent using the L-BFGS algorithm; while the total loss value has not reached the preset condition, iteratively update all pixel values in the initial image, and when the total loss value reaches the preset condition, determine the updated initial image as the synthesized image.
Understandably, gradient descent is performed through the L-BFGS algorithm, that is, the total loss value is continually reduced through the L-BFGS algorithm, which is a method for solving unconstrained nonlinear optimization problems. The preset condition can be set as required, for example, it can be set as the total loss value no longer decreasing. While the total loss value has not reached the preset condition (i.e., it is still decreasing), all pixel values in the initial image are updated iteratively; once the total loss value reaches the preset condition (it no longer decreases), the updated initial image is determined as the synthesized image.
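The iteration in S305 can be sketched with plain gradient descent standing in for L-BFGS (a quasi-Newton method); the toy quadratic loss, the learning rate, and the stopping tolerance below are all illustrative assumptions:

```python
def transfer_step(pixels, content, style_mean, lr=0.1, w1=1.0, w2=1.0):
    """One gradient-descent update of every pixel value toward lower total loss."""
    n = len(pixels)
    mean = sum(pixels) / n
    # Analytic gradient of w1*(mean - style_mean)^2 + w2*mean_i((p_i - c_i)^2).
    return [p - lr * (2 * w1 * (mean - style_mean) / n + 2 * w2 * (p - c) / n)
            for p, c in zip(pixels, content)]

def run_style_transfer(simulated, content, style_mean, tol=1e-9, max_iter=10000):
    """Update pixel values until the total loss no longer decreases (the preset condition)."""
    def loss(px):
        m = sum(px) / len(px)
        return (m - style_mean) ** 2 + sum((p - c) ** 2 for p, c in zip(px, content)) / len(px)

    pixels, prev = list(simulated), float("inf")
    for _ in range(max_iter):
        cur = loss(pixels)
        if prev - cur < tol:  # total loss has stopped decreasing
            break
        prev = cur
        pixels = transfer_step(pixels, content, style_mean)
    return pixels  # the updated initial image, i.e. the synthesized image

result = run_style_transfer(simulated=[0.0, 0.0], content=[1.0, 1.0], style_mean=1.0)
```

L-BFGS would converge in far fewer iterations by approximating second-order curvature, but the stopping rule, updating all pixel values until the loss plateaus, is the same.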
S306: Associate the first texture style label with the synthesized image, and determine the synthesized image and the first texture style label as the synthesis result.
Understandably, the synthesized image is associated with the first texture style label, so that the synthesized image and the first texture style label are determined as the synthesis result.
In this way, the style synthesis model transfers the style of the image sample onto the simulated image and automatically generates a synthesized image consistent with the first texture style label associated with the image sample. Synthesized images are thus generated automatically from image samples, providing effective samples for the training of subsequent models, shortening sample collection time, reducing collection cost, and improving efficiency.
S40: Mark the second font label, the second typesetting label, the second annotation information, and the first texture style label as an OCR image sample label, record the synthesized image as the OCR image sample corresponding to the image sample, and associate the OCR image sample with the OCR image sample label.
Understandably, the OCR image sample is annotated with the OCR image sample label; that is, the second font label, the second typesetting label, the second annotation information, and the first texture style label are marked as the OCR image sample label corresponding to the OCR image sample. This reduces the labor cost of labeling OCR image samples, avoids errors introduced by manual labeling, and improves labeling accuracy.
In this application, an image sample is obtained by receiving an image generation instruction; the image sample is associated with an image sample label, which includes a first font label, a first typesetting label, and a first texture style label. The image sample is input into a preset font-typesetting generation model; through text detection and character recognition on the image sample, the first annotation information that the model recognizes from the image sample is obtained, and the simulation result that the model generates by reconstruction from the first font label, the first typesetting label, and the first annotation information is obtained; the simulation result includes a simulated image, a second font label, a second typesetting label, and second annotation information. The image sample and the simulated image are input into a preset style synthesis model, which extracts style features and content features and, based on them, performs style transfer on the simulated image and synthesizes it to generate a synthesis result; the synthesis result includes a synthesized image and the first texture style label. The second font label, the second typesetting label, the second annotation information, and the first texture style label are marked as an OCR image sample label, the synthesized image is recorded as the OCR image sample corresponding to the image sample, and the OCR image sample is associated with the OCR image sample label. This application can be applied in the field of smart security, thereby promoting the construction of smart cities.
Therefore, this application obtains an image sample carrying a first font label, a first typesetting label, and a first texture style label; inputs it into the font-typesetting generation model and obtains the recognized first annotation information through text detection and character recognition; reconstructs from the first font label, the first typesetting label, and the first annotation information to generate a simulation result; inputs the image sample and the simulated image into the style synthesis model, which performs style transfer and synthesis on the simulated image according to the extracted style features and content features to generate a synthesis result including a synthesized image and the first texture style label; marks the second font label, the second typesetting label, the second annotation information, and the first texture style label as an OCR image sample label; records the synthesized image as the OCR image sample corresponding to the image sample; and associates the OCR image sample with the OCR image sample label. This application thus automatically generates OCR image samples with the same texture style as the image samples and labels them accurately, reducing the labor cost and time of collecting image samples, quickly producing OCR image samples for the required scenarios in a targeted manner, improving the accuracy and reliability of subsequent model training, reducing the labor cost of labeling OCR image samples, avoiding the errors of manual labeling, and improving labeling accuracy.
The image recognition method provided in this application can be applied in the application environment shown in FIG. 1, in which a client (computer device) communicates with a server over a network. The client (computer device) includes, but is not limited to, personal computers, notebook computers, smartphones, tablet computers, cameras, and portable wearable devices. The server can be implemented as an independent server or as a server cluster composed of multiple servers.
In one embodiment, as shown in FIG. 6, a print font verification method is provided, whose technical solution mainly includes the following steps S100 to S600:
S100: Receive a certificate verification instruction, and obtain a to-be-verified certificate print and verification information.
Understandably, the certificate verification instruction is a request triggered after the to-be-verified certificate print and the verification information that need to be verified have been selected and confirmed. The to-be-verified certificate print is the image file obtained after the certificate to be verified has been scanned or copied, and the verification information is the reference against which the to-be-verified certificate print is checked. The verification information can be obtained from information related to the to-be-verified certificate print that the user enters at the client, such as an entered bank card number or bank name, or it can be obtained by querying a database for information related to the to-be-verified certificate print.
S200: Input the to-be-verified certificate print into a trained certificate recognition model; the certificate recognition model has been trained with OCR image samples generated by the above OCR image sample generation method.
Understandably, the to-be-verified certificate print is input into the certificate recognition model, which is a neural network model trained with the OCR image samples generated by the above OCR image sample generation method together with the image samples.
S300: Perform OCR recognition on the to-be-verified certificate print through the certificate recognition model, and obtain the OCR recognition result output by the model; the OCR recognition result contains the certificate-related text information in the to-be-verified certificate print.
Understandably, OCR recognition reads out the text of the to-be-verified certificate print through OCR (Optical Character Recognition) technology, and the OCR recognition result contains the certificate-related text information in the to-be-verified certificate print.
S400: Compare the OCR recognition result with the verification information, and determine whether the to-be-verified certificate print conforms to the verification information.
Understandably, the OCR recognition result is checked against the verification information to determine whether the to-be-verified certificate print passes the check.
S500: If the to-be-verified certificate print conforms to the verification information, determine that the verification passes.
Understandably, if the OCR recognition result corresponding to the to-be-verified certificate print is consistent with the verification information, the to-be-verified certificate print is determined to have passed the verification.
S600: If the to-be-verified certificate print does not conform to the verification information, determine that the verification fails, and prompt on a display interface.
Understandably, if the OCR recognition result corresponding to the to-be-verified certificate print is inconsistent with the verification information, the to-be-verified certificate print is determined to have failed the verification, and a prompt is shown on the display interface, which is the display interface of the terminal device corresponding to the customer. The content of the prompt can be set as required; for example, the prompt may read "The verification information is wrong, please re-enter the verification information!".
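Steps S400 to S600 amount to a simple comparison. The function name and return shape below are illustrative; only the prompt text comes from the example above:

```python
def verify_certificate(ocr_result: str, verification_info: str):
    """S400-S600: check the OCR recognition result against the verification info.
    Returns (passed, prompt); the prompt is shown on the client's display interface."""
    if ocr_result == verification_info:
        return True, ""  # S500: verification passes
    # S600: verification fails
    return False, "The verification information is wrong, please re-enter the verification information!"
```

In practice the comparison might normalize whitespace or match individual fields rather than the full string; the application leaves the matching rule open.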
In one embodiment, as shown in FIG. 7, before step S200, that is, before inputting the to-be-verified certificate print into the trained certificate recognition model, the method includes:
S2001: Obtain a certificate sample set; the certificate sample set contains several certificate samples, each associated with one sample label. When a certificate sample is an image sample, its sample label is an image sample label; when a certificate sample is an OCR image sample, its sample label is an OCR image sample label. The OCR image samples are generated by the above OCR image sample generation method, and the number of image samples in the certificate sample set is smaller than the number of OCR image samples.
Preferably, the OCR image samples are images generated from the image samples in the certificate sample set by the above OCR image sample generation method, and they have already been annotated with OCR image sample labels by that method. One image sample can yield multiple OCR image samples through the OCR image sample generation method, and the number of image samples in the certificate sample set is smaller than the number of OCR image samples. This saves the time needed to collect the certificate sample set and the manual time needed to annotate the certificate samples, while the OCR image samples can be annotated with OCR image sample labels accurately.
S2002,将所述证件样本集输入含有初始参数的深度学习的OCR模型。S2002: Input the document sample set into a deep learning OCR model containing initial parameters.
可理解地,所述初始参数可以根据需求进行设置,比如所述初始参数为随机赋予的参数值、或者所述初始参数为预设的参数值等等。Understandably, the initial parameters can be set according to requirements, for example, the initial parameters are randomly assigned parameter values, or the initial parameters are preset parameter values, and so on.
S2003,通过所述深度学习的OCR模型对所述证件样本进行OCR识别,获取所述深度学习的OCR模型输出的所述证件样本的训练识别结果。S2003: Perform OCR recognition on the credential sample through the deep learning OCR model, and obtain a training recognition result of the credential sample output by the deep learning OCR model.
可理解地，所述OCR识别为通过OCR(Optical Character Recognition,光学字符识别)技术把所述待证件印刷体的文字读取出来，所述训练识别结果包含所述证件样本中与证件相关的文本信息。Understandably, the OCR recognition is to read out the text of the to-be-verified certificate print through OCR (Optical Character Recognition) technology, and the training recognition result includes the certificate-related text information in the certificate sample.
S2004,将所述训练识别结果与所述样本标签进行匹配,获得所述证件样本的损失值。S2004: Match the training recognition result with the sample label to obtain the loss value of the certificate sample.
可理解地，将所述训练识别结果与所述样本标签输入所述深度学习的OCR模型中的损失函数中，通过所述损失函数计算出所述证件样本的所述损失值，所述损失值表明了所述训练识别结果与所述样本标签的差距，所述损失值越来越小，说明所述训练识别结果越来越靠近所述样本标签。Understandably, the training recognition result and the sample label are input into the loss function in the deep learning OCR model, and the loss value of the certificate sample is calculated by the loss function; the loss value indicates the gap between the training recognition result and the sample label, and a decreasing loss value indicates that the training recognition result is getting closer and closer to the sample label.
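The passage does not fix a specific loss function for S2004. As a hedged illustration only, a normalized character-level edit distance is one simple way to score how far a training recognition result is from its sample label (0 means an exact match; values near 1 mean a large gap):

```python
def levenshtein(a, b):
    """Edit distance between two strings (single-row dynamic programming)."""
    dp = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        prev, dp[0] = dp[0], i
        for j, cb in enumerate(b, 1):
            # deletion, insertion, substitution (or match when ca == cb)
            prev, dp[j] = dp[j], min(dp[j] + 1, dp[j - 1] + 1, prev + (ca != cb))
    return dp[len(b)]

def sample_loss(prediction, label):
    """Normalized edit distance in [0, 1]: 0 means the training
    recognition result exactly matches the sample label."""
    if not label:
        return float(bool(prediction))
    return levenshtein(prediction, label) / max(len(prediction), len(label))

print(sample_loss("张三", "张三"))  # 0.0 — prediction matches the label
print(sample_loss("张王", "张三"))  # 0.5 — one of two characters differs
```

In a real deep learning OCR model the loss would more likely be a differentiable sequence loss (e.g. CTC), but the monotone behavior is the same: the smaller the value, the closer the recognition result is to the label.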
S2005,在所述损失值达到预设的收敛条件时,将收敛之后的所述深度学习的OCR模型记录为训练完成的证件识别模型。S2005: When the loss value reaches a preset convergence condition, record the deep learning OCR model after convergence as a certificate recognition model completed by training.
可理解地，在所述损失值达到预设的收敛条件时，说明所述损失值已经达到最优的结果，即所述训练识别结果已经十分接近所述样本标签，此时所述深度学习的OCR模型已经收敛，将收敛之后的所述深度学习的OCR模型记录为训练完成的证件识别模型。Understandably, when the loss value reaches the preset convergence condition, it indicates that the loss value has reached the optimal result, that is, the training recognition result is already very close to the sample label; at this point the deep learning OCR model has converged, and the converged deep learning OCR model is recorded as the trained certificate recognition model.
如此，根据所述证件样本集中的证件样本的样本标签，通过不断训练获得训练完成的所述证件识别模型，能够提升OCR识别结果的准确率和可靠性。In this way, by continuously training the certificate recognition model according to the sample labels of the certificate samples in the certificate sample set, the accuracy and reliability of the OCR recognition result can be improved.
在一实施例中,所述步骤S2004之后,即所述将所述训练识别结果与所述样本标签进行匹配,获得所述证件样本的损失值之后,还包括:In one embodiment, after step S2004, that is, after matching the training recognition result with the sample label, and obtaining the loss value of the certificate sample, the method further includes:
S2006，在所述损失值未达到预设的收敛条件时，迭代更新所述深度学习的OCR模型的初始参数，直至所述损失值达到所述预设的收敛条件时，将收敛之后的所述深度学习的OCR模型记录为训练完成的证件识别模型。S2006: When the loss value does not reach the preset convergence condition, iteratively update the initial parameters of the deep learning OCR model until the loss value reaches the preset convergence condition, and record the converged deep learning OCR model as the trained certificate recognition model.
其中，所述收敛条件可以为所述损失值经过了8000次计算后值为很小且不会再下降的条件，即在所述损失值经过8000次计算后值为很小且不会再下降时，停止训练，并将收敛之后的所述深度学习的OCR模型记录为训练完成的证件识别模型；所述收敛条件也可以为所述损失值小于设定阈值的条件，即在所述损失值小于设定阈值时，停止训练，并将收敛之后的所述深度学习的OCR模型记录为训练完成的证件识别模型。The convergence condition may be a condition that, after 8000 calculations, the loss value is very small and no longer decreases; that is, when the loss value is very small and no longer decreases after 8000 calculations, the training is stopped, and the converged deep learning OCR model is recorded as the trained certificate recognition model. The convergence condition may also be a condition that the loss value is less than a set threshold; that is, when the loss value is less than the set threshold, the training is stopped, and the converged deep learning OCR model is recorded as the trained certificate recognition model.
如此，在所述损失值未达到预设的收敛条件时，不断更新迭代所述神经网络模型的初始参数，可以不断向准确的识别结果靠拢，让识别结果的准确率越来越高。In this way, when the loss value does not reach the preset convergence condition, the initial parameters of the neural network model are iteratively updated, continuously moving closer to an accurate recognition result, so that the accuracy of the recognition result becomes higher and higher.
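The two convergence conditions above — a loss below a set threshold, or a loss that no longer decreases after many (e.g. 8000) updates — can be sketched with a toy training loop. The quadratic loss and scalar parameter are stand-ins for the deep learning OCR model, which is not reproduced here:

```python
def train(initial_param, lr=0.1, threshold=1e-4, max_steps=8000):
    """Toy training loop illustrating the two convergence conditions in the
    text: stop when the loss falls below a set threshold, or when the loss
    has stopped decreasing within the step budget."""
    param = initial_param
    prev_loss = float("inf")
    for step in range(1, max_steps + 1):
        loss = param ** 2                    # stand-in loss function
        if loss < threshold:                 # condition 1: below the set threshold
            return param, loss, step
        if abs(prev_loss - loss) < 1e-12:    # condition 2: no longer decreasing
            return param, loss, step
        prev_loss = loss
        param -= lr * 2 * param              # iteratively update the parameters
    return param, param ** 2, max_steps

param, loss, steps = train(initial_param=3.0)
print(loss < 1e-4)  # True — training stopped once the loss converged
```

With either stopping rule, the parameters recorded at exit play the role of the converged deep learning OCR model, i.e. the trained certificate recognition model.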
本申请通过将所述待检测图像输入至已训练完成的所述翻拍识别模型，输出所述待检测图像的识别结果，如此，本申请实现了快速地、准确地识别出翻拍图像，提高了识别的准确率和命中率，提升了识别效率和可靠性，节省了成本。In the present application, the to-be-detected image is input into the trained remake recognition model, and the recognition result of the to-be-detected image is output. In this way, the present application realizes rapid and accurate recognition of remade images, improves the accuracy and hit rate of the recognition, enhances the recognition efficiency and reliability, and saves costs.
在一实施例中,提供一种OCR图像样本生成装置,该OCR图像样本生成装置与上述实施例中OCR图像样本生成方法一一对应。如图8所示,该OCR图像样本生成装置包括接收模块11、输入模块12、合成模块13和生成模块14。各功能模块详细说明如下:In one embodiment, an OCR image sample generating device is provided, and the OCR image sample generating device corresponds to the OCR image sample generating method in the above-mentioned embodiment in a one-to-one correspondence. As shown in FIG. 8, the OCR image sample generating device includes a receiving module 11, an input module 12, a synthesis module 13 and a generating module 14. The detailed description of each functional module is as follows:
接收模块11,用于接收图像生成指令,获取图像样本;所述图像样本与图像样本标签关联,所述图像样本标签包括第一字体标签、第一排版标签和第一纹理风格标签;The receiving module 11 is configured to receive an image generation instruction and obtain an image sample; the image sample is associated with an image sample label, and the image sample label includes a first font label, a first typesetting label, and a first texture style label;
输入模块12，用于将所述图像样本输入预设的字体排版生成模型，通过对所述图像样本进行文本检测和文字识别，获取所述字体排版生成模型识别出所述图像样本的第一标注信息，并且获取所述字体排版生成模型根据所述第一字体标签、所述第一排版标签和所述第一标注信息进行重构生成的模拟结果；所述模拟结果包括模拟图像、第二字体标签、第二排版标签和第二标注信息；The input module 12 is configured to input the image sample into a preset font typesetting generation model, obtain, by performing text detection and character recognition on the image sample, the first annotation information of the image sample recognized by the font typesetting generation model, and obtain a simulation result generated by the font typesetting generation model through reconstruction according to the first font label, the first typesetting label, and the first annotation information; the simulation result includes a simulated image, a second font label, a second typesetting label, and second annotation information;
合成模块13，用于将所述图像样本和所述模拟图像输入预设的风格合成模型，所述风格合成模型提取出风格特征和内容特征，所述风格合成模型根据所述风格特征和所述内容特征对所述模拟图像进行风格迁移及合成，生成合成结果；所述合成结果包括合成图像和第一纹理风格标签；The synthesis module 13 is configured to input the image sample and the simulated image into a preset style synthesis model; the style synthesis model extracts style features and content features, and performs style transfer and synthesis on the simulated image according to the style features and the content features to generate a synthesis result; the synthesis result includes a synthesized image and the first texture style label;
生成模块14，用于将所述第二字体标签、所述第二排版标签、所述第二标注信息和所述第一纹理风格标签标记为OCR图像样本标签，同时将所述合成图像记录为与所述图像样本对应的OCR图像样本，并将所述OCR图像样本与所述OCR图像样本标签关联。The generating module 14 is configured to mark the second font label, the second typesetting label, the second annotation information, and the first texture style label as OCR image sample labels, record the synthesized image as an OCR image sample corresponding to the image sample, and associate the OCR image sample with the OCR image sample label.
在一实施例中,所述输入模块12包括:In an embodiment, the input module 12 includes:
第一提取单元，用于通过所述字体排版生成模型对所述图像样本进行文本检测，同时提取出所述图像样本的文本特征，获取所述字体排版生成模型根据提取出的所述文本特征识别出的所述图像样本的区域结果；所述区域结果包括若干个含有文本的文本区域以及与每个所述文本区域关联的区域坐标；The first extraction unit is configured to perform text detection on the image sample through the font typesetting generation model while extracting text features of the image sample, and obtain the area result of the image sample recognized by the font typesetting generation model according to the extracted text features; the area result includes a number of text areas containing text and area coordinates associated with each of the text areas;
第二提取单元，用于通过所述字体排版生成模型提取出每个所述文本区域的文字特征，获取所述字体排版生成模型根据提取出的每个所述文本区域的所述文字特征识别出的每个所述文本区域的文本内容；The second extraction unit is configured to extract character features of each text area through the font typesetting generation model, and obtain the text content of each text area recognized by the font typesetting generation model according to the extracted character features of each text area;
标记单元，用于将所述文本区域、与所述文本区域关联的所述区域坐标以及所述文本区域的所述文本内容记录为所述图像样本的第一信息，将所有所述第一信息标记为所述第一标注信息。The marking unit is configured to record the text area, the area coordinates associated with the text area, and the text content of the text area as the first information of the image sample, and mark all the first information as the first annotation information.
在一实施例中,所述输入模块12还包括:In an embodiment, the input module 12 further includes:
输入单元，用于将所述第一字体标签、所述第一排版标签和所述第一标注信息输入所述字体排版生成模型中的重构模型；所述重构模型为基于GAN模型进行训练完成；An input unit, configured to input the first font label, the first typesetting label, and the first annotation information into a reconstruction model in the font typesetting generation model; the reconstruction model has been trained based on a GAN model;
重构单元,用于通过所述重构模型中的生成器进行组合重构,获取所述重构模型输出的所述模拟图像、第二字体标签、第二排版标签和第二标注信息;A reconstruction unit, configured to perform combined reconstruction through a generator in the reconstruction model to obtain the simulated image, the second font label, the second typesetting label, and the second annotation information output by the reconstruction model;
输出单元,用于将所述模拟图像、所述第二字体标签、所述第二排版标签和所述第二标注信息记录为所述字体排版生成模型输出的模拟结果。The output unit is configured to record the simulated image, the second font label, the second typesetting label, and the second annotation information as a simulation result output by the font typesetting generation model.
在一实施例中,所述合成模块13包括:In an embodiment, the synthesis module 13 includes:
获取单元,用于将所述模拟图像作为初始图像,获取所述初始图像的所有像素值;An acquiring unit, configured to use the simulated image as an initial image and acquire all pixel values of the initial image;
第一计算单元，用于通过所述风格合成模型提取出所述图像样本的风格特征以及所述初始图像的风格特征，并根据所述图像样本的风格特征和所述初始图像的风格特征计算得出风格损失值；The first calculation unit is configured to extract the style feature of the image sample and the style feature of the initial image through the style synthesis model, and calculate a style loss value according to the style feature of the image sample and the style feature of the initial image;
第二计算单元，用于通过所述风格合成模型提取出所述图像样本的内容特征以及所述初始图像的内容特征，并根据所述图像样本的内容特征和所述初始图像的内容特征计算得出内容损失值；The second calculation unit is configured to extract the content feature of the image sample and the content feature of the initial image through the style synthesis model, and calculate a content loss value according to the content feature of the image sample and the content feature of the initial image;
损失单元,用于将所述风格损失值和所述内容损失值进行加权处理得到总损失值;A loss unit for weighting the style loss value and the content loss value to obtain a total loss value;
训练单元，用于通过L-BFGS算法进行梯度下降，在所述总损失值未达到预设条件时，迭代更新所述初始图像中的所有所述像素值，直至所述总损失值达到所述预设条件时，将更新后的所述初始图像确定为合成图像；The training unit is configured to perform gradient descent through the L-BFGS algorithm, and when the total loss value does not reach a preset condition, iteratively update all the pixel values in the initial image until the total loss value reaches the preset condition, at which point the updated initial image is determined as the synthesized image;
关联单元,用于将所述第一纹理风格标签与所述合成图像关联,并将所述合成图像和所述第一纹理风格标签确定为所述合成结果。The associating unit is configured to associate the first texture style label with the synthesized image, and determine the synthesized image and the first texture style label as the synthesis result.
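The flow through the loss unit and training unit above can be sketched numerically. Real style and content features would come from a CNN, and the optimizer would be L-BFGS as the text states; this self-contained sketch uses identity "features" and plain gradient descent, so every name and constant in it is an illustrative assumption:

```python
def total_loss(pixels, style_target, content_target, style_w=1e3, content_w=1.0):
    """Weighted sum of a style loss and a content loss (the loss unit)."""
    style_loss = sum((p - s) ** 2 for p, s in zip(pixels, style_target))
    content_loss = sum((p - c) ** 2 for p, c in zip(pixels, content_target))
    return style_w * style_loss + content_w * content_loss

def synthesize(style_target, content_target, steps=500, lr=4e-4):
    """The training unit: iteratively update the pixel values of the initial
    image (here initialized from the simulated/content image) until the
    total loss is small."""
    pixels = list(content_target)
    for _ in range(steps):
        # analytic gradient of the quadratic total loss w.r.t. each pixel
        grads = [2 * 1e3 * (p - s) + 2 * (p - c)
                 for p, s, c in zip(pixels, style_target, content_target)]
        pixels = [p - lr * g for p, g in zip(pixels, grads)]
    return pixels

style = [0.9, 0.1, 0.8]     # stand-in style features of the image sample
content = [0.2, 0.6, 0.4]   # stand-in content features of the simulated image
out = synthesize(style, content)
# with a heavy style weight, pixels are pulled close to the style targets
print(all(abs(p - s) < 0.01 for p, s in zip(out, style)))              # True
print(total_loss(out, style, content) < total_loss(content, style, content))  # True
```

The weighting between the two terms controls how strongly the synthesized image adopts the texture style of the image sample versus preserving the content of the simulated image.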
关于OCR图像样本生成装置的具体限定可以参见上文中对于OCR图像样本生成方法的限定,在此不再赘述。上述OCR图像样本生成装置中的各个模块可全部或部分通过软件、硬件及其组合来实现。上述各模块可以硬件形式内嵌于或独立于计算机设备中的处理器中,也可以以软件形式存储于计算机设备中的存储器中,以便于处理器调用执行以上各个模块对应的操作。For the specific limitation of the OCR image sample generating device, please refer to the above limitation on the OCR image sample generating method, which will not be repeated here. Each module in the above-mentioned OCR image sample generating device can be implemented in whole or in part by software, hardware, and a combination thereof. The above-mentioned modules may be embedded in the form of hardware or independent of the processor in the computer equipment, or may be stored in the memory of the computer equipment in the form of software, so that the processor can call and execute the operations corresponding to the above-mentioned modules.
在一实施例中,提供一种印刷体验证装置,该印刷体验证装置与上述实施例中印刷体验证方法一一对应。如图9所示,该印刷体验证装置包括获取模块101、训练模块102、识别模块103、比对模块104、第一确定模块105和第二确定模块106。各功能模块详细说明如下:In one embodiment, a printed matter verification device is provided, and the printed matter verification device corresponds to the printed matter verification method in the above-mentioned embodiment one-to-one. As shown in FIG. 9, the printed body verification device includes an acquisition module 101, a training module 102, an identification module 103, a comparison module 104, a first determination module 105 and a second determination module 106. The detailed description of each functional module is as follows:
获取模块101,用于接收证件验证指令,获取待证件印刷体和验证信息;The obtaining module 101 is configured to receive a certificate verification instruction, and obtain the printed form and verification information of the to-be-certified certificate;
训练模块102，用于将所述待证件印刷体输入已训练完成的证件识别模型；所述证件识别模型通过如权利要求1至4任一项所述OCR图像样本生成方法生成的OCR图像样本训练完成；The training module 102 is configured to input the to-be-verified certificate print into a trained certificate recognition model; the certificate recognition model has been trained with OCR image samples generated by the OCR image sample generation method according to any one of claims 1 to 4;
识别模块103，用于通过所述证件识别模型对所述待证件印刷体进行OCR识别，获取所述证件识别模型输出的OCR识别结果，所述OCR识别结果包含所述待证件印刷体中与证件相关的文本信息；The recognition module 103 is configured to perform OCR recognition on the to-be-verified certificate print through the certificate recognition model, and obtain the OCR recognition result output by the certificate recognition model; the OCR recognition result includes the certificate-related text information in the to-be-verified certificate print;
比对模块104,用于将所述OCR识别结果与所述验证信息进行比对,确定所述待证件印刷体是否符合所述验证信息;The comparison module 104 is configured to compare the OCR recognition result with the verification information, and determine whether the printed document to be issued meets the verification information;
第一确定模块105,用于若所述待证件印刷体符合所述验证信息,确定验证通过;The first determining module 105 is configured to determine that the verification is passed if the printed document to be document meets the verification information;
第二确定模块106,用于若所述待证件印刷体不符合所述验证信息,确认验证不通过,并在显示界面提示。The second determining module 106 is configured to confirm that the verification is not passed if the printed document to be document does not meet the verification information, and prompt on the display interface.
关于印刷体验证装置的具体限定可以参见上文中对于印刷体验证方法的限定,在此不再赘述。上述印刷体验证装置中的各个模块可全部或部分通过软件、硬件及其组合来实现。上述各模块可以硬件形式内嵌于或独立于计算机设备中的处理器中,也可以以软件形式存储于计算机设备中的存储器中,以便于处理器调用执行以上各个模块对应的操作。For the specific limitation of the printed body verification device, please refer to the above-mentioned limitation on the printed body verification method, which will not be repeated here. Each module in the above-mentioned printed matter verification device can be implemented in whole or in part by software, hardware, and a combination thereof. The above-mentioned modules may be embedded in the form of hardware or independent of the processor in the computer equipment, or may be stored in the memory of the computer equipment in the form of software, so that the processor can call and execute the operations corresponding to the above-mentioned modules.
在一个实施例中,提供了一种计算机设备,该计算机设备可以是服务器,其内部结构图可以如图10所示。该计算机设备包括通过系统总线连接的处理器、存储器、网络接口和数据库。其中,该计算机设备的处理器用于提供计算和控制能力。该计算机设备的存储器包括非易失性存储介质、内存储器。该非易失性存储介质存储有操作系统、计算机程序和数据库。该内存储器为非易失性存储介质中的操作系统和计算机程序的运行提供环境。该计算机设备的网络接口用于与外部的终端通过网络连接通信。该计算机程序被处理器执行时以实现一种OCR图像样本生成方法,或者印刷体验证方法。In one embodiment, a computer device is provided. The computer device may be a server, and its internal structure diagram may be as shown in FIG. 10. The computer equipment includes a processor, a memory, a network interface, and a database connected through a system bus. Among them, the processor of the computer device is used to provide calculation and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program, and a database. The internal memory provides an environment for the operation of the operating system and computer programs in the non-volatile storage medium. The network interface of the computer device is used to communicate with an external terminal through a network connection. The computer program is executed by the processor to realize an OCR image sample generation method or a printed body verification method.
在一个实施例中，提供了一种计算机设备，包括存储器、处理器及存储在存储器上并可在处理器上运行的计算机程序，处理器执行计算机程序时实现上述实施例中OCR图像样本生成方法，或者处理器执行计算机程序时实现上述实施例中印刷体验证方法。In one embodiment, a computer device is provided, including a memory, a processor, and a computer program stored in the memory and executable on the processor; when executing the computer program, the processor implements the OCR image sample generation method in the above embodiment, or implements the printed body verification method in the above embodiment.
在一个实施例中，提供了一种计算机可读存储介质，所述计算机可读存储介质可以是非易失性，也可以是易失性，其上存储有计算机程序，计算机程序被处理器执行时实现上述实施例中OCR图像样本生成方法，或者计算机程序被处理器执行时实现上述实施例中印刷体验证方法。In one embodiment, a computer-readable storage medium is provided; the computer-readable storage medium may be non-volatile or volatile, and a computer program is stored thereon. When the computer program is executed by a processor, the OCR image sample generation method in the foregoing embodiment is implemented, or the printed body verification method in the foregoing embodiment is implemented.
进一步地，所述计算机可读存储介质可主要包括存储程序区和存储数据区，其中，存储程序区可存储操作系统、至少一个功能所需的应用程序等；存储数据区可存储根据区块链节点的使用所创建的数据等。Further, the computer-readable storage medium may mainly include a storage program area and a storage data area, where the storage program area may store an operating system, an application program required by at least one function, and the like; the storage data area may store data created according to the use of blockchain nodes, and the like.
本领域普通技术人员可以理解实现上述实施例方法中的全部或部分流程，是可以通过计算机程序来指令相关的硬件来完成，所述的计算机程序可存储于一非易失性计算机可读取存储介质中，该计算机程序在执行时，可包括如上述各方法的实施例的流程。其中，本申请所提供的各实施例中所使用的对存储器、存储、数据库或其它介质的任何引用，均可包括非易失性和/或易失性存储器。非易失性存储器可包括只读存储器(ROM)、可编程ROM(PROM)、电可编程ROM(EPROM)、电可擦除可编程ROM(EEPROM)或闪存。易失性存储器可包括随机存取存储器(RAM)或者外部高速缓冲存储器。作为说明而非局限，RAM以多种形式可得，诸如静态RAM(SRAM)、动态RAM(DRAM)、同步DRAM(SDRAM)、双数据率SDRAM(DDRSDRAM)、增强型SDRAM(ESDRAM)、同步链路(Synchlink)DRAM(SLDRAM)、存储器总线(Rambus)直接RAM(RDRAM)、直接存储器总线动态RAM(DRDRAM)、以及存储器总线动态RAM(RDRAM)等。A person of ordinary skill in the art can understand that all or part of the processes in the methods of the above embodiments can be implemented by instructing relevant hardware through a computer program; the computer program can be stored in a non-volatile computer-readable storage medium, and when executed, may include the processes of the embodiments of the above methods. Any reference to memory, storage, database, or other media used in the embodiments provided in this application may include non-volatile and/or volatile memory. Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory may include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM), and the like.
本申请所指区块链是分布式数据存储、点对点传输、共识机制、加密算法等计算机技术的新型应用模式。区块链(Blockchain)，本质上是一个去中心化的数据库，是一串使用密码学方法相关联产生的数据块，每一个数据块中包含了一批次网络交易的信息，用于验证其信息的有效性(防伪)和生成下一个区块。区块链可以包括区块链底层平台、平台产品服务层以及应用服务层等。The blockchain referred to in this application is a new application mode of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanisms, and encryption algorithms. A blockchain is essentially a decentralized database, a chain of data blocks generated in association using cryptographic methods; each data block contains a batch of network transaction information, used to verify the validity of its information (anti-counterfeiting) and to generate the next block. A blockchain may include an underlying blockchain platform, a platform product service layer, and an application service layer.
所属领域的技术人员可以清楚地了解到，为了描述的方便和简洁，仅以上述各功能单元、模块的划分进行举例说明，实际应用中，可以根据需要而将上述功能分配由不同的功能单元、模块完成，即将所述装置的内部结构划分成不同的功能单元或模块，以完成以上描述的全部或者部分功能。Those skilled in the art can clearly understand that, for convenience and conciseness of description, only the division of the above functional units and modules is used as an example; in practical applications, the above functions can be allocated to different functional units and modules as required, that is, the internal structure of the device is divided into different functional units or modules to complete all or part of the functions described above.
以上所述实施例仅用以说明本申请的技术方案，而非对其限制；尽管参照前述实施例对本申请进行了详细的说明，本领域的普通技术人员应当理解：其依然可以对前述各实施例所记载的技术方案进行修改，或者对其中部分技术特征进行等同替换；而这些修改或者替换，并不使相应技术方案的本质脱离本申请各实施例技术方案的精神和范围，均应包含在本申请的保护范围之内。The above-mentioned embodiments are only used to illustrate the technical solutions of the present application, not to limit them. Although the present application has been described in detail with reference to the foregoing embodiments, a person of ordinary skill in the art should understand that the technical solutions recorded in the foregoing embodiments can still be modified, or some of the technical features can be equivalently replaced; such modifications or replacements do not cause the essence of the corresponding technical solutions to deviate from the spirit and scope of the technical solutions of the embodiments of the present application, and shall all be included within the scope of protection of the present application.

Claims (20)

  1. 一种OCR图像样本生成方法,其中,包括:An OCR image sample generation method, which includes:
    接收图像生成指令,获取图像样本;所述图像样本与图像样本标签关联,所述图像样本标签包括第一字体标签、第一排版标签和第一纹理风格标签;Receiving an image generation instruction to obtain an image sample; the image sample is associated with an image sample label, and the image sample label includes a first font label, a first typesetting label, and a first texture style label;
    将所述图像样本输入预设的字体排版生成模型，通过对所述图像样本进行文本检测和文字识别，获取所述字体排版生成模型识别出所述图像样本的第一标注信息，并且获取所述字体排版生成模型根据所述第一字体标签、所述第一排版标签和所述第一标注信息进行重构生成的模拟结果；所述模拟结果包括模拟图像、第二字体标签、第二排版标签和第二标注信息；Inputting the image sample into a preset font typesetting generation model, obtaining, by performing text detection and character recognition on the image sample, the first annotation information of the image sample recognized by the font typesetting generation model, and obtaining a simulation result generated by the font typesetting generation model through reconstruction according to the first font label, the first typesetting label, and the first annotation information; the simulation result includes a simulated image, a second font label, a second typesetting label, and second annotation information;
    将所述图像样本和所述模拟图像输入预设的风格合成模型，所述风格合成模型提取出风格特征和内容特征，所述风格合成模型根据所述风格特征和所述内容特征对所述模拟图像进行风格迁移及合成，生成合成结果；所述合成结果包括合成图像和第一纹理风格标签；Inputting the image sample and the simulated image into a preset style synthesis model; the style synthesis model extracts style features and content features, and performs style transfer and synthesis on the simulated image according to the style features and the content features to generate a synthesis result; the synthesis result includes a synthesized image and the first texture style label;
    将所述第二字体标签、所述第二排版标签、所述第二标注信息和所述第一纹理风格标签标记为OCR图像样本标签，同时将所述合成图像记录为与所述图像样本对应的OCR图像样本，并将所述OCR图像样本与所述OCR图像样本标签关联。Marking the second font label, the second typesetting label, the second annotation information, and the first texture style label as OCR image sample labels, recording the synthesized image as an OCR image sample corresponding to the image sample, and associating the OCR image sample with the OCR image sample label.
  2. 如权利要求1所述的OCR图像样本生成方法，其中，所述将所述图像样本输入预设的字体排版生成模型，通过对所述图像样本进行文本检测和文字识别，获取所述字体排版生成模型识别出所述图像样本的第一标注信息，包括：The OCR image sample generation method according to claim 1, wherein the inputting the image sample into a preset font typesetting generation model and obtaining, by performing text detection and character recognition on the image sample, the first annotation information of the image sample recognized by the font typesetting generation model includes:
    通过所述字体排版生成模型对所述图像样本进行文本检测，同时提取出所述图像样本的文本特征，获取所述字体排版生成模型根据提取出的所述文本特征识别出的所述图像样本的区域结果；所述区域结果包括若干个含有文本的文本区域以及与每个所述文本区域关联的区域坐标；Performing text detection on the image sample through the font typesetting generation model while extracting text features of the image sample, and obtaining the area result of the image sample recognized by the font typesetting generation model according to the extracted text features; the area result includes a number of text areas containing text and area coordinates associated with each of the text areas;
    通过所述字体排版生成模型提取出每个所述文本区域的文字特征，获取所述字体排版生成模型根据提取出的每个所述文本区域的所述文字特征识别出的每个所述文本区域的文本内容；Extracting character features of each text area through the font typesetting generation model, and obtaining the text content of each text area recognized by the font typesetting generation model according to the extracted character features of each text area;
    将所述文本区域、与所述文本区域关联的所述区域坐标以及所述文本区域的所述文本内容记录为所述图像样本的第一信息，将所有所述第一信息标记为所述第一标注信息。Recording the text area, the area coordinates associated with the text area, and the text content of the text area as the first information of the image sample, and marking all the first information as the first annotation information.
  3. 如权利要求1所述的OCR图像样本生成方法，其中，所述获取所述字体排版生成模型根据所述第一字体标签、所述第一排版标签和所述第一标注信息进行重构生成的模拟结果，包括：The OCR image sample generation method according to claim 1, wherein the obtaining the simulation result generated by the font typesetting generation model through reconstruction according to the first font label, the first typesetting label, and the first annotation information includes:
    将所述第一字体标签、所述第一排版标签和所述第一标注信息输入所述字体排版生成模型中的重构模型;所述重构模型为基于GAN模型进行训练完成;Inputting the first font label, the first typesetting label, and the first annotation information into a reconstruction model in the font typesetting generation model; the reconstruction model is trained based on a GAN model;
    通过所述重构模型中的生成器进行组合重构,获取所述重构模型输出的所述模拟图像、第二字体标签、第二排版标签和第二标注信息;Performing combined reconstruction through a generator in the reconstruction model to obtain the simulated image, the second font label, the second typesetting label, and the second annotation information output by the reconstruction model;
    将所述模拟图像、所述第二字体标签、所述第二排版标签和所述第二标注信息记录为所述字体排版生成模型输出的模拟结果。The simulation image, the second font label, the second typesetting label, and the second annotation information are recorded as a simulation result output by the font typesetting generation model.
  4. 如权利要求1所述的OCR图像样本生成方法，其中，将所述图像样本和所述模拟图像输入预设的风格合成模型，所述风格合成模型提取出风格特征和内容特征，所述风格合成模型根据所述风格特征和所述内容特征对所述模拟图像进行风格迁移及合成，生成合成结果，包括：The OCR image sample generation method according to claim 1, wherein the inputting the image sample and the simulated image into a preset style synthesis model, the style synthesis model extracting style features and content features, and the style synthesis model performing style transfer and synthesis on the simulated image according to the style features and the content features to generate a synthesis result includes:
    将所述模拟图像作为初始图像,获取所述初始图像的所有像素值;Using the simulated image as an initial image, and acquiring all pixel values of the initial image;
    通过所述风格合成模型提取出所述图像样本的风格特征以及所述初始图像的风格特征,并根据所述图像样本的风格特征和所述初始图像的风格特征计算得出风格损失值;Extracting the style feature of the image sample and the style feature of the initial image through the style synthesis model, and calculating a style loss value according to the style feature of the image sample and the style feature of the initial image;
    通过所述风格合成模型提取出所述图像样本的内容特征以及所述初始图像的内容特征，并根据所述图像样本的内容特征和所述初始图像的内容特征计算得出内容损失值；Extracting the content feature of the image sample and the content feature of the initial image through the style synthesis model, and calculating a content loss value based on the content feature of the image sample and the content feature of the initial image;
    将所述风格损失值和所述内容损失值进行加权处理得到总损失值;Weighting the style loss value and the content loss value to obtain a total loss value;
    通过L-BFGS算法进行梯度下降，在所述总损失值未达到预设条件时，迭代更新所述初始图像中的所有所述像素值，直至所述总损失值达到所述预设条件时，将更新后的所述初始图像确定为合成图像；Performing gradient descent through the L-BFGS algorithm; when the total loss value does not reach a preset condition, iteratively updating all the pixel values in the initial image until the total loss value reaches the preset condition, and determining the updated initial image as the synthesized image;
    将所述第一纹理风格标签与所述合成图像关联,并将所述合成图像和所述第一纹理风格标签确定为所述合成结果。Associating the first texture style label with the synthesized image, and determining the synthesized image and the first texture style label as the synthesis result.
  5. 一种印刷体验证方法,其中,包括:A print verification method, which includes:
    接收证件验证指令,获取待证件印刷体和验证信息;Receive certificate verification instructions, obtain the printed version and verification information of the pending certificate;
    将所述待证件印刷体输入已训练完成的证件识别模型;所述证件识别模型通过如权利要求1至4任一项所述OCR图像样本生成方法生成的OCR图像样本训练完成;Input the printed document to be trained into the document recognition model that has been trained; the document recognition model is trained by the OCR image sample generated by the OCR image sample generation method according to any one of claims 1 to 4;
    通过所述证件识别模型对所述待证件印刷体进行OCR识别,获取所述证件识别模型输出的OCR识别结果,所述OCR识别结果包含所述待证件印刷体中与证件相关的文本信息;Perform OCR recognition on the printed document to be documented by the document recognition model, and obtain an OCR recognition result output by the document recognition model, where the OCR recognition result includes the document-related text information in the printed document to be documented;
    将所述OCR识别结果与所述验证信息进行比对,确定所述待证件印刷体是否符合所述验证信息;Comparing the OCR recognition result with the verification information to determine whether the printed document to be certified meets the verification information;
    若所述待证件印刷体符合所述验证信息,确定验证通过;If the printed version of the certificate to be issued meets the verification information, it is determined that the verification is passed;
    若所述待证件印刷体不符合所述验证信息,确认验证不通过,并在显示界面提示。If the printed body of the document to be issued does not conform to the verification information, confirm that the verification is not passed, and prompt on the display interface.
  6. The print verification method according to claim 5, wherein before the inputting the to-be-verified certificate print into the trained certificate recognition model, the method comprises:
    obtaining a certificate sample set, wherein the certificate sample set contains a number of certificate samples, each certificate sample being associated with one sample label; when the certificate sample is an image sample, the sample label is an image sample label; when the certificate sample is an OCR image sample, the sample label is an OCR image sample label; the OCR image samples are generated by the OCR image sample generation method according to any one of claims 1 to 4; and the number of image samples in the certificate sample set is smaller than the number of OCR image samples;
    inputting the certificate sample set into a deep-learning OCR model containing initial parameters;
    performing OCR recognition on the certificate samples through the deep-learning OCR model, and obtaining training recognition results of the certificate samples output by the deep-learning OCR model;
    matching the training recognition results against the sample labels to obtain loss values of the certificate samples;
    when the loss value reaches a preset convergence condition, recording the converged deep-learning OCR model as the trained certificate recognition model.
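The training procedure of this claim — iterate until the loss value reaches a preset convergence condition, then record the converged model as trained — can be sketched generically. This is an illustrative example with a toy one-parameter model standing in for the deep-learning OCR network; in the claim the loss is computed by matching recognition results against the sample labels:

```python
def train_until_converged(samples, labels, loss_fn, update_fn, params,
                          convergence_threshold=0.01, max_epochs=1000):
    """Generic training loop: stop and 'record' the model once the mean
    loss over the sample set meets the preset convergence condition."""
    for epoch in range(max_epochs):
        total_loss = 0.0
        for x, y in zip(samples, labels):
            total_loss += loss_fn(params, x, y)   # match prediction against label
            params = update_fn(params, x, y)      # one optimization step
        if total_loss / len(samples) < convergence_threshold:
            return params, epoch                  # converged: trained model
    return params, max_epochs

# Toy stand-in for the OCR model: fit y = w * x by per-sample gradient steps.
samples, labels = [1.0, 2.0, 3.0], [2.0, 4.0, 6.0]
sq_loss = lambda w, x, y: (w * x - y) ** 2
sgd_step = lambda w, x, y: w - 0.05 * 2.0 * (w * x - y) * x
trained_w, epochs_used = train_until_converged(samples, labels, sq_loss, sgd_step, params=0.0)
```

The same loop shape applies whether `params` is one scalar or the weights of a deep network; only `loss_fn` and `update_fn` change.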
  7. An OCR image sample generation apparatus, comprising:
    a receiving module, configured to receive an image generation instruction and obtain an image sample, wherein the image sample is associated with an image sample label, and the image sample label includes a first font label, a first typesetting label, and a first texture style label;
    an input module, configured to input the image sample into a preset font typesetting generation model, obtain, through text detection and character recognition performed on the image sample, first annotation information of the image sample recognized by the font typesetting generation model, and obtain a simulation result generated by the font typesetting generation model through reconstruction according to the first font label, the first typesetting label, and the first annotation information, wherein the simulation result includes a simulated image, a second font label, a second typesetting label, and second annotation information;
    a synthesis module, configured to input the image sample and the simulated image into a preset style synthesis model, wherein the style synthesis model extracts style features and content features, and performs style transfer and synthesis on the simulated image according to the style features and the content features to generate a synthesis result, the synthesis result including a synthesized image and the first texture style label;
    a generation module, configured to mark the second font label, the second typesetting label, the second annotation information, and the first texture style label as an OCR image sample label, record the synthesized image as the OCR image sample corresponding to the image sample, and associate the OCR image sample with the OCR image sample label.
  8. A print verification apparatus, comprising:
    an obtaining module, configured to receive a certificate verification instruction and obtain a to-be-verified certificate print and verification information;
    a training module, configured to input the to-be-verified certificate print into a trained certificate recognition model, wherein the certificate recognition model has been trained with OCR image samples generated by the OCR image sample generation method according to any one of claims 1 to 4;
    a recognition module, configured to perform OCR recognition on the to-be-verified certificate print through the certificate recognition model and obtain an OCR recognition result output by the certificate recognition model, wherein the OCR recognition result contains certificate-related text information in the to-be-verified certificate print;
    a comparison module, configured to compare the OCR recognition result with the verification information to determine whether the to-be-verified certificate print matches the verification information;
    a first determination module, configured to determine that the verification passes if the to-be-verified certificate print matches the verification information;
    a second determination module, configured to determine that the verification fails, and display a prompt on the display interface, if the to-be-verified certificate print does not match the verification information.
  9. A computer device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements the following steps when executing the computer program:
    receiving an image generation instruction and obtaining an image sample, wherein the image sample is associated with an image sample label, and the image sample label includes a first font label, a first typesetting label, and a first texture style label;
    inputting the image sample into a preset font typesetting generation model, obtaining, through text detection and character recognition performed on the image sample, first annotation information of the image sample recognized by the font typesetting generation model, and obtaining a simulation result generated by the font typesetting generation model through reconstruction according to the first font label, the first typesetting label, and the first annotation information, wherein the simulation result includes a simulated image, a second font label, a second typesetting label, and second annotation information;
    inputting the image sample and the simulated image into a preset style synthesis model, wherein the style synthesis model extracts style features and content features, and performs style transfer and synthesis on the simulated image according to the style features and the content features to generate a synthesis result, the synthesis result including a synthesized image and the first texture style label;
    marking the second font label, the second typesetting label, the second annotation information, and the first texture style label as an OCR image sample label, recording the synthesized image as the OCR image sample corresponding to the image sample, and associating the OCR image sample with the OCR image sample label.
  10. The computer device according to claim 9, wherein the inputting the image sample into a preset font typesetting generation model and obtaining, through text detection and character recognition performed on the image sample, the first annotation information of the image sample recognized by the font typesetting generation model comprises:
    performing text detection on the image sample through the font typesetting generation model while extracting text features of the image sample, and obtaining a region result of the image sample recognized by the font typesetting generation model according to the extracted text features, wherein the region result includes a number of text regions containing text and region coordinates associated with each text region;
    extracting character features of each text region through the font typesetting generation model, and obtaining the text content of each text region recognized by the font typesetting generation model according to the extracted character features of that text region;
    recording each text region, the region coordinates associated with the text region, and the text content of the text region as first information of the image sample, and marking all the first information as the first annotation information.
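The "first information" recited above — a text region, its region coordinates, and its recognized text content — can be represented, for illustration, as follows (the structure and field names are hypothetical; the claim does not fix any concrete serialization):

```python
# Hypothetical detection/recognition output for one image sample.
region_results = [
    {"region_coords": (40, 12, 220, 38), "text_content": "姓名: 张三"},
    {"region_coords": (40, 52, 360, 78), "text_content": "证件号码: 110101XXXX"},
]

def to_first_annotation_info(region_results):
    """Record each region, its coordinates, and its text content as one
    'first information' entry; all entries together form the first
    annotation information of the image sample."""
    return [(r["region_coords"], r["text_content"]) for r in region_results]

first_annotation_info = to_first_annotation_info(region_results)
```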
  11. The computer device according to claim 9, wherein the obtaining a simulation result generated by the font typesetting generation model through reconstruction according to the first font label, the first typesetting label, and the first annotation information comprises:
    inputting the first font label, the first typesetting label, and the first annotation information into a reconstruction model in the font typesetting generation model, wherein the reconstruction model has been trained on the basis of a GAN model;
    performing combined reconstruction through a generator in the reconstruction model, and obtaining the simulated image, the second font label, the second typesetting label, and the second annotation information output by the reconstruction model;
    recording the simulated image, the second font label, the second typesetting label, and the second annotation information as the simulation result output by the font typesetting generation model.
  12. The computer device according to claim 9, wherein the inputting the image sample and the simulated image into a preset style synthesis model, the style synthesis model extracting style features and content features and performing style transfer and synthesis on the simulated image according to the style features and the content features to generate a synthesis result, comprises:
    taking the simulated image as an initial image, and obtaining all pixel values of the initial image;
    extracting the style features of the image sample and the style features of the initial image through the style synthesis model, and calculating a style loss value from the style features of the image sample and the style features of the initial image;
    extracting the content features of the image sample and the content features of the initial image through the style synthesis model, and calculating a content loss value from the content features of the image sample and the content features of the initial image;
    weighting the style loss value and the content loss value to obtain a total loss value;
    performing gradient descent through the L-BFGS algorithm, iteratively updating all the pixel values of the initial image while the total loss value has not reached a preset condition, and determining the updated initial image as the synthesized image once the total loss value reaches the preset condition;
    associating the first texture style label with the synthesized image, and determining the synthesized image and the first texture style label as the synthesis result.
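The loss computation in this claim follows the classical neural-style-transfer formulation: a style loss (conventionally over Gram matrices of feature maps), a content loss, and a weighted total. A minimal sketch, assuming the feature maps have already been extracted as `(channels, positions)` arrays and leaving the L-BFGS pixel updates to an external optimizer; the weight values are tuning choices, not values fixed by the claim:

```python
import numpy as np

def gram_matrix(features):
    """Gram matrix of a (channels, positions) feature map: channel
    co-activation statistics, the usual carrier of texture/style."""
    return features @ features.T / features.shape[1]

def style_loss(sample_feats, init_feats):
    return float(np.mean((gram_matrix(sample_feats) - gram_matrix(init_feats)) ** 2))

def content_loss(sample_feats, init_feats):
    return float(np.mean((sample_feats - init_feats) ** 2))

def total_loss(sample_feats, init_feats, style_weight=1e3, content_weight=1.0):
    """Weighted combination of the style and content losses."""
    return (style_weight * style_loss(sample_feats, init_feats)
            + content_weight * content_loss(sample_feats, init_feats))
```

In practice the feature maps would come from a pretrained CNN, and the pixel values of the initial image would be iterated with an L-BFGS routine such as `scipy.optimize.minimize(..., method="L-BFGS-B")` until `total_loss` meets the preset condition.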
  13. A computer device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements the following steps when executing the computer program:
    receiving a certificate verification instruction, and obtaining a to-be-verified certificate print and verification information;
    inputting the to-be-verified certificate print into a trained certificate recognition model, wherein the certificate recognition model has been trained with OCR image samples generated by the OCR image sample generation method according to any one of claims 1 to 4;
    performing OCR recognition on the to-be-verified certificate print through the certificate recognition model, and obtaining an OCR recognition result output by the certificate recognition model, wherein the OCR recognition result contains certificate-related text information in the to-be-verified certificate print;
    comparing the OCR recognition result with the verification information to determine whether the to-be-verified certificate print matches the verification information;
    if the to-be-verified certificate print matches the verification information, determining that the verification passes;
    if the to-be-verified certificate print does not match the verification information, determining that the verification fails, and displaying a prompt on the display interface.
  14. The computer device according to claim 13, wherein before the inputting the to-be-verified certificate print into the trained certificate recognition model, the processor further implements the following steps when executing the computer program:
    obtaining a certificate sample set, wherein the certificate sample set contains a number of certificate samples, each certificate sample being associated with one sample label; when the certificate sample is an image sample, the sample label is an image sample label; when the certificate sample is an OCR image sample, the sample label is an OCR image sample label; the OCR image samples are generated by the OCR image sample generation method according to any one of claims 1 to 4; and the number of image samples in the certificate sample set is smaller than the number of OCR image samples;
    inputting the certificate sample set into a deep-learning OCR model containing initial parameters;
    performing OCR recognition on the certificate samples through the deep-learning OCR model, and obtaining training recognition results of the certificate samples output by the deep-learning OCR model;
    matching the training recognition results against the sample labels to obtain loss values of the certificate samples;
    when the loss value reaches a preset convergence condition, recording the converged deep-learning OCR model as the trained certificate recognition model.
  15. A computer-readable storage medium having a computer program stored thereon, wherein the computer program, when executed by a processor, implements the following steps:
    receiving an image generation instruction and obtaining an image sample, wherein the image sample is associated with an image sample label, and the image sample label includes a first font label, a first typesetting label, and a first texture style label;
    inputting the image sample into a preset font typesetting generation model, obtaining, through text detection and character recognition performed on the image sample, first annotation information of the image sample recognized by the font typesetting generation model, and obtaining a simulation result generated by the font typesetting generation model through reconstruction according to the first font label, the first typesetting label, and the first annotation information, wherein the simulation result includes a simulated image, a second font label, a second typesetting label, and second annotation information;
    inputting the image sample and the simulated image into a preset style synthesis model, wherein the style synthesis model extracts style features and content features, and performs style transfer and synthesis on the simulated image according to the style features and the content features to generate a synthesis result, the synthesis result including a synthesized image and the first texture style label;
    marking the second font label, the second typesetting label, the second annotation information, and the first texture style label as an OCR image sample label, recording the synthesized image as the OCR image sample corresponding to the image sample, and associating the OCR image sample with the OCR image sample label.
  16. The computer-readable storage medium according to claim 15, wherein the inputting the image sample into a preset font typesetting generation model and obtaining, through text detection and character recognition performed on the image sample, the first annotation information of the image sample recognized by the font typesetting generation model comprises:
    performing text detection on the image sample through the font typesetting generation model while extracting text features of the image sample, and obtaining a region result of the image sample recognized by the font typesetting generation model according to the extracted text features, wherein the region result includes a number of text regions containing text and region coordinates associated with each text region;
    extracting character features of each text region through the font typesetting generation model, and obtaining the text content of each text region recognized by the font typesetting generation model according to the extracted character features of that text region;
    recording each text region, the region coordinates associated with the text region, and the text content of the text region as first information of the image sample, and marking all the first information as the first annotation information.
  17. The computer-readable storage medium according to claim 15, wherein the obtaining a simulation result generated by the font typesetting generation model through reconstruction according to the first font label, the first typesetting label, and the first annotation information comprises:
    inputting the first font label, the first typesetting label, and the first annotation information into a reconstruction model in the font typesetting generation model, wherein the reconstruction model has been trained on the basis of a GAN model;
    performing combined reconstruction through a generator in the reconstruction model, and obtaining the simulated image, the second font label, the second typesetting label, and the second annotation information output by the reconstruction model;
    recording the simulated image, the second font label, the second typesetting label, and the second annotation information as the simulation result output by the font typesetting generation model.
  18. The computer-readable storage medium according to claim 15, wherein the inputting the image sample and the simulated image into a preset style synthesis model, the style synthesis model extracting style features and content features and performing style transfer and synthesis on the simulated image according to the style features and the content features to generate a synthesis result, comprises:
    taking the simulated image as an initial image, and obtaining all pixel values of the initial image;
    extracting the style features of the image sample and the style features of the initial image through the style synthesis model, and calculating a style loss value from the style features of the image sample and the style features of the initial image;
    extracting the content features of the image sample and the content features of the initial image through the style synthesis model, and calculating a content loss value from the content features of the image sample and the content features of the initial image;
    weighting the style loss value and the content loss value to obtain a total loss value;
    performing gradient descent through the L-BFGS algorithm, iteratively updating all the pixel values of the initial image while the total loss value has not reached a preset condition, and determining the updated initial image as the synthesized image once the total loss value reaches the preset condition;
    associating the first texture style label with the synthesized image, and determining the synthesized image and the first texture style label as the synthesis result.
  19. A computer-readable storage medium having a computer program stored thereon, wherein the computer program, when executed by a processor, implements the following steps:
    receiving a certificate verification instruction, and obtaining a to-be-verified certificate print and verification information;
    inputting the to-be-verified certificate print into a trained certificate recognition model, wherein the certificate recognition model has been trained with OCR image samples generated by the OCR image sample generation method according to any one of claims 1 to 4;
    performing OCR recognition on the to-be-verified certificate print through the certificate recognition model, and obtaining an OCR recognition result output by the certificate recognition model, wherein the OCR recognition result contains certificate-related text information in the to-be-verified certificate print;
    comparing the OCR recognition result with the verification information to determine whether the to-be-verified certificate print matches the verification information;
    if the to-be-verified certificate print matches the verification information, determining that the verification passes;
    if the to-be-verified certificate print does not match the verification information, determining that the verification fails, and displaying a prompt on the display interface.
  20. The computer-readable storage medium according to claim 19, wherein before the inputting the to-be-verified certificate print into the trained certificate recognition model, the computer program, when executed by the processor, further implements the following steps:
    obtaining a certificate sample set, wherein the certificate sample set contains a number of certificate samples, each certificate sample being associated with one sample label; when the certificate sample is an image sample, the sample label is an image sample label; when the certificate sample is an OCR image sample, the sample label is an OCR image sample label; the OCR image samples are generated by the OCR image sample generation method according to any one of claims 1 to 4; and the number of image samples in the certificate sample set is smaller than the number of OCR image samples;
    inputting the certificate sample set into a deep-learning OCR model containing initial parameters;
    performing OCR recognition on the certificate samples through the deep-learning OCR model, and obtaining training recognition results of the certificate samples output by the deep-learning OCR model;
    matching the training recognition results against the sample labels to obtain loss values of the certificate samples;
    when the loss value reaches a preset convergence condition, recording the converged deep-learning OCR model as the trained certificate recognition model.
PCT/CN2020/099064 2020-04-24 2020-06-30 OCR image sample generation method and apparatus, print font verification method and apparatus, and device and medium WO2021212658A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010333257.4 2020-04-24
CN202010333257.4A CN111626124A (en) 2020-04-24 2020-04-24 OCR image sample generation method and apparatus, print verification device, and medium

Publications (1)

Publication Number Publication Date
WO2021212658A1 true WO2021212658A1 (en) 2021-10-28

Family

ID=72270828

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/099064 WO2021212658A1 (en) 2020-04-24 2020-06-30 Ocr image sample generation method and apparatus, print font verification method and apparatus, and device and medium

Country Status (2)

Country Link
CN (1) CN111626124A (en)
WO (1) WO2021212658A1 (en)


Families Citing this family (5)

Publication number Priority date Publication date Assignee Title
CN112508000B (en) * 2020-11-26 2023-04-07 上海展湾信息科技有限公司 Method and equipment for generating OCR image recognition model training data
CN112613572B (en) * 2020-12-30 2024-01-23 北京奇艺世纪科技有限公司 Sample data obtaining method and device, electronic equipment and storage medium
CN112528998B (en) * 2021-02-18 2021-06-01 成都新希望金融信息有限公司 Certificate image processing method and device, electronic equipment and readable storage medium
CN112766268A (en) * 2021-03-02 2021-05-07 阳光财产保险股份有限公司 Text label generation method and device, electronic equipment and storage medium
CN112966685B (en) * 2021-03-23 2024-04-19 深圳赛安特技术服务有限公司 Attack network training method and device for scene text recognition and related equipment

Citations (5)

Publication number Priority date Publication date Assignee Title
US20060056697A1 (en) * 2004-08-13 2006-03-16 Fujitsu Limited Degraded character image generation method and apparatus
CN109241894A (en) * 2018-08-28 2019-01-18 南京安链数据科技有限公司 A kind of specific aim ticket contents identifying system and method based on form locating and deep learning
CN109272043A (en) * 2018-09-21 2019-01-25 北京京东金融科技控股有限公司 Training data generation method, system and electronic equipment for optical character identification
CN109711396A (en) * 2018-11-12 2019-05-03 平安科技(深圳)有限公司 Generation method, device, equipment and the readable storage medium storing program for executing of OCR training sample
CN109948549A (en) * 2019-03-20 2019-06-28 深圳市华付信息技术有限公司 OCR data creation method, device, computer equipment and storage medium

Family Cites Families (7)

Publication number Priority date Publication date Assignee Title
CN109492643B (en) * 2018-10-11 2023-12-19 平安科技(深圳)有限公司 Certificate identification method and device based on OCR, computer equipment and storage medium
CN109902678A (en) * 2019-02-12 2019-06-18 北京奇艺世纪科技有限公司 Model training method, character recognition method, device, electronic equipment and computer-readable medium
CN110246198B (en) * 2019-05-21 2023-05-02 北京奇艺世纪科技有限公司 Method and device for generating character selection verification code, electronic equipment and storage medium
CN110458906B (en) * 2019-06-26 2024-03-15 广州大鱼创福科技有限公司 Medical image coloring method based on depth color migration
CN110659646A (en) * 2019-08-21 2020-01-07 北京三快在线科技有限公司 Automatic multitask certificate image processing method, device, equipment and readable storage medium
CN110796583A (en) * 2019-10-25 2020-02-14 南京航空航天大学 Stylized visible watermark adding method
CN110942062B (en) * 2019-11-21 2022-12-23 杭州网易智企科技有限公司 Image verification code generation method, medium, device and computing equipment

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060056697A1 (en) * 2004-08-13 2006-03-16 Fujitsu Limited Degraded character image generation method and apparatus
CN109241894A (en) * 2018-08-28 2019-01-18 南京安链数据科技有限公司 Targeted bill content recognition system and method based on table positioning and deep learning
CN109272043A (en) * 2018-09-21 2019-01-25 北京京东金融科技控股有限公司 Training data generation method and system for optical character recognition, and electronic device
CN109711396A (en) * 2018-11-12 2019-05-03 平安科技(深圳)有限公司 OCR training sample generation method, apparatus, device, and readable storage medium
CN109948549A (en) * 2019-03-20 2019-06-28 深圳市华付信息技术有限公司 OCR data generation method, apparatus, computer device, and storage medium

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115035360A (en) * 2021-11-22 2022-09-09 荣耀终端有限公司 Character recognition method for image, electronic device and storage medium
CN114332514A (en) * 2022-03-17 2022-04-12 北京许先网科技发展有限公司 Font evaluation method and system
CN114332514B (en) * 2022-03-17 2022-06-07 北京许先网科技发展有限公司 Font evaluation method and system
CN115297106A (en) * 2022-07-22 2022-11-04 江西五十铃汽车有限公司 Method and system for printing motor vehicle onboard certificates and uploading their information
CN115297106B (en) * 2022-07-22 2024-03-01 江西五十铃汽车有限公司 Method and system for printing motor vehicle onboard certificates and uploading their information

Also Published As

Publication number Publication date
CN111626124A (en) 2020-09-04

Similar Documents

Publication Publication Date Title
WO2021212658A1 (en) OCR image sample generation method and apparatus, print font verification method and apparatus, and device and medium
WO2021135499A1 (en) Damage detection model training and vehicle damage detection methods, device, apparatus, and medium
US11210510B2 (en) Storing anonymized identifiers instead of personally identifiable information
CN109492643B (en) Certificate identification method and device based on OCR, computer equipment and storage medium
CN111914597B (en) Document comparison identification method and device, electronic equipment and readable storage medium
CN110728687B (en) File image segmentation method and device, computer equipment and storage medium
CN112183296B (en) Simulated bill image generation and bill image recognition method and device
CN113111880B (en) Certificate image correction method, device, electronic equipment and storage medium
JP2019079347A (en) Character estimation system, character estimation method, and character estimation program
CN113837151A (en) Table image processing method and device, computer equipment and readable storage medium
CN116229494A (en) License key information extraction method based on small sample data
CN113159013A (en) Paragraph identification method and device based on machine learning, computer equipment and medium
CN112396047B (en) Training sample generation method and device, computer equipment and storage medium
CN117115823A (en) Tamper identification method and device, computer equipment and storage medium
US20200294410A1 (en) Methods, systems, apparatuses and devices for facilitating grading of handwritten sheets
CN112801099A (en) Image processing method, device, terminal equipment and medium
CN112989820B (en) Legal document positioning method, device, equipment and storage medium
CN111612045B (en) Universal method for acquiring target detection data set
CN114550189A (en) Bill recognition method, device, equipment, computer storage medium and program product
CN113705749A (en) Two-dimensional code identification method, device and equipment based on deep learning and storage medium
CN113807218A (en) Layout analysis method, layout analysis device, computer equipment and storage medium
TWI807467B (en) Key-item detection model building method, business-oriented key-value identification system and method
Das et al. Enhancement of identification accuracy by handling outlier feature values within a signature case base
JP2019101647A (en) Information processing device, control method therefor, and program
JP6926279B1 (en) Learning device, recognition device, learning method, recognition method, program, and recurrent neural network

Legal Events

Date Code Title Description
121 Ep: the EPO has been informed by WIPO that EP was designated in this application

Ref document number: 20932264

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the EP bulletin as address of the addressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 17/02/2023)

122 Ep: pct application non-entry in european phase

Ref document number: 20932264

Country of ref document: EP

Kind code of ref document: A1