CN116959004A - Handwritten signature recognition method, handwritten signature recognition device, electronic equipment and computer program product - Google Patents
- Publication number
- CN116959004A (application number CN202310864798.3A)
- Authority
- CN
- China
- Prior art keywords
- path
- signature
- handwritten signature
- vector image
- file
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
- G06V30/22—Character recognition characterised by the type of writing
- G06V30/226—Character recognition characterised by the type of writing of cursive writing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
- G06V30/18—Extraction of features or characteristics of the image
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
- G06V30/18—Extraction of features or characteristics of the image
- G06V30/1801—Detecting partial patterns, e.g. edges or contours, or configurations, e.g. loops, corners, strokes or intersections
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Landscapes
- Engineering & Computer Science (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Multimedia (AREA)
- Theoretical Computer Science (AREA)
- Collating Specific Patterns (AREA)
Abstract
The application relates to the technical field of data processing and provides a handwritten signature recognition method, a handwritten signature recognition device, electronic equipment and a computer program product. The method comprises the following steps: extracting a vector image object of a file to be identified based on a file format of the file to be identified, wherein the vector image object comprises characteristic information of a handwritten signature; constructing a path object based on the vector image object, wherein the path object is used for representing a path or a contour of an image; and drawing the path object, and clipping the drawn path object to obtain a handwritten signature picture. In the embodiment of the application, the vector image object is extracted, and a path object is constructed, drawn and clipped based on it to obtain the handwritten signature picture. Because the picture is derived from the path object, it remains resistant to interference such as a watermark in the signature background or other text in the document overlapping the signature, so that the auditing accuracy and the handwritten signature recognition accuracy are improved.
Description
Technical Field
The present application relates to the field of data processing technologies, and in particular, to a method and apparatus for identifying a handwritten signature, an electronic device, and a computer program product.
Background
In the auditing of paperless electronic bills for operator services, locating, extracting and storing the customer's handwritten electronic signature in a service bill bears on authenticating the identity of the customer handling the service. It is the entry point of service auditing and is therefore critical to the operator's entire paperless bill auditing business.
At present, there are three main ways to extract and store handwritten signatures from electronic bills: cropping the signature by coordinates, extracting the signature by picture element, and using AI to identify and crop the signature area.
Cropping by coordinates: when the signature background contains a watermark, or other text in the document overlaps the signature, the cropped signature picture interferes to some degree with both manual and programmatic recognition. Extracting by picture element: a PDF bill file contains many picture elements, and the signature picture cannot be reliably distinguished from them by size or color; moreover, if the signature is not stored in the PDF bill file as a picture, it cannot be obtained at all. Using AI to identify and crop the signature area: a large number of samples and a large amount of labeling work are required for training, and the accuracy still falls short of 100%.
Based on the above, existing methods for recognizing handwritten signatures in electronic bills are inaccurate.
Disclosure of Invention
The embodiment of the application provides a handwritten signature recognition method, a device, electronic equipment and a computer program product, which are used for solving the technical problem of inaccurate handwritten signature recognition.
In a first aspect, an embodiment of the present application provides a handwritten signature recognition method, including:
extracting a vector image object of a file to be identified based on a file format of the file to be identified, wherein the vector image object comprises characteristic information of a handwritten signature;
constructing a path object based on the vector image object, wherein the path object is used for representing a path or a contour of an image;
and drawing the path object, and cutting the drawn path object to obtain a handwritten signature picture.
In one embodiment, the constructing a path object based on the vector image object includes:
and constructing the path object based on the characteristic information of the handwritten signature of the vector image object and at least one path construction operator.
In one embodiment, the drawing the path object, clipping the drawn path object to obtain a handwritten signature picture includes:
drawing and filling the path object by adopting a path drawing operator so as to draw the path object on a canvas;
and adopting a path clipping operator to intersect the drawn path object with a clipping region, so that the part of the path object outside the clipping region is clipped away, thereby obtaining the handwritten signature picture.
In one embodiment, the extracting the vector image object of the file to be identified based on the file format of the file to be identified includes:
determining a signature page of the file to be identified based on the signature page identification;
identifying content of the signature page based on the file format;
and extracting the vector image object from the content of the signature page obtained by recognition.
In one embodiment, the identifying the content of the signature page based on the file format includes:
and analyzing and rendering the signature page based on the file format to obtain the content of the signature page.
In one embodiment, said extracting said vector image object from the identified content of said signature page comprises: extracting target elements from the content of the signature page, and performing text processing on the target elements to obtain target texts; extracting the vector image object in the target text based on the characteristics of the vector image object.
In one embodiment, after the drawing the path object and clipping the drawn path object to obtain the handwritten signature picture, the method further includes: converting the handwritten signature picture into a raster picture, and storing the raster picture.
In a second aspect, an embodiment of the present application provides a handwritten signature recognition apparatus, including:
the extraction module is used for extracting a vector image object of the file to be identified based on the file format of the file to be identified, wherein the vector image object comprises characteristic information of a handwritten signature;
a construction module for constructing a path object based on the vector image object, the path object being used for characterizing a path or a contour of an image;
and the obtaining module is used for drawing the path object, cutting the drawn path object, and obtaining a handwritten signature picture.
In a third aspect, an embodiment of the present application provides an electronic device, including a processor and a memory storing a computer program, where the processor implements the steps of the handwritten signature recognition method according to the first aspect when executing the program.
In a fourth aspect, embodiments of the present application provide a computer program product comprising a computer program which, when executed by a processor, implements the steps of the handwritten signature recognition method of the first aspect.
According to the handwritten signature recognition method, the handwritten signature recognition device, the electronic equipment and the computer program product, the vector image object of the file to be identified is extracted based on the file format of the file to be identified, and the vector image object comprises characteristic information of the handwritten signature; a path object is constructed based on the vector image object, wherein the path object is used for representing a path or a contour of an image; and the path object is drawn, and the drawn path object is clipped to obtain a handwritten signature picture. In the embodiment of the application, the vector image object is extracted, and a path object is constructed, drawn and clipped based on it to obtain the handwritten signature picture. Because the picture is derived from the path object, it remains resistant to interference such as a watermark in the signature background or other text in the document overlapping the signature, so that the auditing accuracy and the handwritten signature recognition accuracy are improved.
Drawings
In order to more clearly illustrate the application or the technical solutions of the prior art, the following description will briefly explain the drawings used in the embodiments or the description of the prior art, and it is obvious that the drawings in the following description are some embodiments of the application, and other drawings can be obtained according to the drawings without inventive effort for a person skilled in the art.
Fig. 1 is a flow chart of a handwritten signature recognition method provided by an embodiment of the present application;
Fig. 2 is a schematic structural diagram of a handwritten signature recognition apparatus according to an embodiment of the present application;
Fig. 3 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the present application more apparent, the technical solutions of the present application will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present application, and it is apparent that the described embodiments are some embodiments of the present application, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.
Fig. 1 is a flow chart of a handwritten signature recognition method according to an embodiment of the application. Referring to fig. 1, an embodiment of the present application provides a handwritten signature recognition method, which may include:
S100, extracting a vector image object of a file to be identified based on a file format of the file to be identified, wherein the vector image object comprises characteristic information of a handwritten signature;
The file to be identified is an electronic bill of an operator service that carries the customer's handwritten signature.
The electronic bill of the embodiment of the application is in PDF format, and the file format comprises the PDF structure of the file. The PDF structure comprises a file header, a cross-reference table, a catalog, objects, pages, fonts, annotations, images and the like; different PDF files may have different structures and contents.
The vector image object is a vector image containing characteristic information of a handwritten signature, such as the signature itself. A vector image is an image type described using mathematical formulas and geometric elements, in contrast to a pixel image (bitmap). A vector image represents an image by defining attributes such as geometry, paths, curves and colors, rather than by an array of pixels.
Based on the PDF structure of the file to be identified, the target element in the file is located by parsing the PDF structure, and the vector image object is then extracted from the target element, so that the handwritten signature of the customer is obtained.
S200, constructing a path object based on the vector image object, wherein the path object is used for representing the path or the outline of the image;
Key parameters of the vector image object in the PDF file to be identified are determined. Based on these key parameters, a mapping relation between the graphic elements of the handwritten signature in the PDF structure and a raster image is established, and the logic for converting the graphic elements of the handwritten signature into the raster image is determined.
Constructing a path object of the vector image object according to the key parameters of the vector image object; for example, the line shape and the number of the vector image objects are extracted, and the path objects having the same line shape and number are constructed based on the line shape and the number of the vector image objects.
S300, drawing the path object, and clipping the drawn path object to obtain a handwritten signature picture.
The constructed path object is drawn on a canvas, and the path object area in the canvas is clipped to obtain the handwritten signature picture. For example, a clipping shape such as a rectangular frame or a circular frame is preset, the drawn path object is placed in the clipping shape, and the path object and its surrounding area are clipped according to the clipping shape, so that the handwritten signature picture is obtained.
According to the embodiment of the application, the vector image object of the file to be identified is extracted based on the file format of the file to be identified, and the vector image object comprises the characteristic information of the handwritten signature; a path object is constructed based on the vector image object, wherein the path object is used for representing a path or a contour of the image; and the path object is drawn, and the drawn path object is clipped to obtain a handwritten signature picture. Because the handwritten signature picture is derived from the path object, it remains resistant to interference such as a watermark in the signature background or other text in the document overlapping the signature, so that the auditing accuracy and the handwritten signature recognition accuracy are improved.
Based on the above embodiment, constructing a path object based on a vector image object includes:
S210, constructing a path object based on characteristic information of a handwritten signature of the vector image object and at least one path construction operator.
The characteristic information includes the geometry of the handwritten signature, such as straight lines, curved lines, oblique lines, etc. The path construction operator is an operator for building a path object.
The path object is constructed from the geometry of the handwritten signature using at least one path construction operator, for example Move To, Line To, Quadratic Bezier Curve To, etc., which together build the shape and outline of the path object. Paths are created with these operators according to the pattern and shape of the handwritten signature; for example, when the geometry of the handwritten signature is a curve, a cubic Bezier curve is used, where the expression of the cubic Bezier curve is:
$$R(t) = (1-t)^3 P_0 + 3t(1-t)^2 P_1 + 3t^2(1-t) P_2 + t^3 P_3$$
where $P_0$ is the start point, $P_1$ and $P_2$ are the control points, $P_3$ is the end point, $R(t)$ is the point on the cubic Bezier curve, and $t$ is the parameter identifying a specific point on the curve.
By evaluating the expression at different values of $t$, the positions of different points on the curve can be calculated, and the whole cubic Bezier curve can thus be drawn. The parameter $t$ gives the position of a point along the curve, while the control points determine the shape of the curve between the start point and the end point. As $t$ runs from 0 to 1, the point moves from the start of the curve to its end: when $t$ approaches 0, the point on the curve approaches the start point $P_0$; when $t$ approaches 1, the point on the curve approaches the end point $P_3$.
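The evaluation of the cubic Bezier expression can be illustrated with a minimal Python sketch. The function names `cubic_bezier` and `sample_curve` and the number of sampling steps are assumptions made for illustration; they do not come from the application.

```python
def cubic_bezier(p0, p1, p2, p3, t):
    """Evaluate R(t) for 2D points p0..p3 (hypothetical helper, not from the source)."""
    u = 1.0 - t
    # Bernstein weights of the cubic Bezier expression.
    b0, b1, b2, b3 = u ** 3, 3 * t * u ** 2, 3 * t ** 2 * u, t ** 3
    x = b0 * p0[0] + b1 * p1[0] + b2 * p2[0] + b3 * p3[0]
    y = b0 * p0[1] + b1 * p1[1] + b2 * p2[1] + b3 * p3[1]
    return (x, y)


def sample_curve(p0, p1, p2, p3, steps=16):
    """Return steps + 1 points on the curve for t = 0, 1/steps, ..., 1."""
    return [cubic_bezier(p0, p1, p2, p3, i / steps) for i in range(steps + 1)]


points = sample_curve((0, 0), (0, 1), (1, 1), (1, 0))
```

The first sampled point coincides with the start point and the last with the end point, which matches the behaviour of t near 0 and near 1 described above.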
According to the embodiment of the application, the path object is constructed through the characteristic information and the path construction operator, so that the accuracy of constructing the path object is improved.
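As a hedged illustration of building a path object from path construction operators, the sketch below parses a fragment of content-stream text. It assumes the PDF path construction operators `m` (move to), `l` (line to), `c` (cubic Bezier curve to) and `h` (close path); the function name `build_path` and the tuple layout of the result are hypothetical and are not taken from the application.

```python
# Number of numeric operands each path construction operator consumes.
ARITY = {"m": 2, "l": 2, "c": 6, "h": 0}


def build_path(content):
    """Turn content-stream text into a list of (operator, [points]) tuples."""
    operands = []
    path = []
    for token in content.split():
        try:
            operands.append(float(token))
            continue  # numeric operand: keep collecting
        except ValueError:
            pass  # not a number, so treat the token as an operator
        count = ARITY.get(token)
        if count is not None and len(operands) >= count:
            values = operands[len(operands) - count:]
            # Group the flat operand list into (x, y) pairs.
            points = [(values[i], values[i + 1]) for i in range(0, count, 2)]
            path.append((token, points))
        operands.clear()  # operands never carry over to the next operator
    return path


path = build_path("10 20 m 30 40 l 1 2 3 4 5 6 c h")
```

Operators outside the table (for example colour or line-width instructions) are skipped in this sketch; a fuller implementation would record them as drawing attributes.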
Based on the above embodiment, drawing a path object, cutting the drawn path object to obtain a handwritten signature picture, including:
S310, adopting a path drawing operator to stroke and fill the path object so as to draw the path object on a canvas;
S320, adopting a path clipping operator to intersect the drawn path object with the clipping region, so that the part of the path object outside the clipping region is clipped away, and a handwritten signature picture is obtained.
Path drawing operators are operators that describe how paths are constructed and drawn, where paths include straight lines, curves, closed paths, and the like.
In the embodiment of the present application, there may be a plurality of path objects, and their number is determined by the number of characters in the handwritten signature. For example, if the handwritten signature is "Zhang Xiaoming", the signature has 3 characters and there are 3 corresponding path objects.
Each path object is drawn in order from first to last: for example, the path object for "Zhang" is found and drawn first; then the path object for "Xiao" is found and drawn; finally the path object for "Ming" is found and drawn.
Drawing each path object includes stroking and filling the path object on the canvas, where stroking adds color and line style to the path outline and filling adds color and texture to the path interior. When the next path object is drawn, the drawing of the previous path object, such as its coordinate position and color settings, is taken into account.
When a path object needs to be clipped to a specific shape, a path clipping operator is used. By intersecting the path object with the clipping region, only the part of the path object in the clipping region is reserved, and the clipping effect of the path object is realized.
According to the embodiment of the application, the path drawing operator and the path clipping operator are used for drawing and clipping the path object, so that the handwritten signature is obtained, and the accuracy of recognition of the handwritten signature is improved.
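The intersection of a drawn path with a clipping region can be sketched in plain Python on a pixel grid. This is a simplified illustration, not the PDF clipping operator itself: the helper names `rasterize_segment` and `clip_to_rect`, the sampling density and the rectangular clip region are all assumptions.

```python
def rasterize_segment(p0, p1, samples=64):
    """Return the set of integer pixels touched by sampling the segment p0 -> p1."""
    (x0, y0), (x1, y1) = p0, p1
    pixels = set()
    for i in range(samples + 1):
        t = i / samples
        pixels.add((round(x0 + (x1 - x0) * t), round(y0 + (y1 - y0) * t)))
    return pixels


def clip_to_rect(pixels, rect):
    """Keep only the drawn pixels inside rect = (left, top, right, bottom), inclusive."""
    left, top, right, bottom = rect
    region = {(x, y) for x in range(left, right + 1) for y in range(top, bottom + 1)}
    # Set intersection plays the role of intersecting the path with the clip region.
    return pixels & region


drawn = rasterize_segment((0, 0), (10, 0))
clipped = clip_to_rect(drawn, (3, 0, 6, 5))
```

Only the part of the horizontal stroke that lies inside the rectangle survives, which is the retention behaviour described for the path clipping operator.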
Based on the above embodiment, extracting a vector image object of a file to be identified based on a file format of the file to be identified includes:
S110, determining a signature page of a file to be identified based on the signature page identification;
S120, identifying the content of the signature page based on the file format;
S130, extracting a vector image object from the content of the signature page obtained through recognition.
The signature page identification is a mark in a document used to identify and locate the position of a signature, giving the user a clearly defined area in which to sign, such as "Party A or Party A's guardian", "principal signature", "sponsor signature", and the like. The signature page identification may take one of several forms:
(1) Signature row: a row of space is reserved in the table or file for the signature, typically with the word "Signature" noted above or below it.
(2) Signature box: a rectangular area is defined in the document for placing the signature. This rectangle is typically identified by a dashed line or other special pattern.
(3) Signature tag: the locations in the document where a signature is required are marked using text or icons, for example the words "Sign here" or an indicating arrow.
(4) Handwritten signature area: a blank handwriting area is reserved in the electronic document, and a user can sign directly in this area using an input device such as a touch screen, a digital tablet or a mouse.
When an electronic bill of the operator service has many pages, for example more than 20, the signature page identification allows the signature page to be located quickly. The content of each page of the PDF file to be identified is traversed, and the signature page is identified based on the signature page identification. Target elements are identified and extracted from the content of the signature page, and vector image objects are identified based on the target elements.
Alternatively, the signature page is not identified and the content of every page is identified directly; for example, the PDF page elements of each page in the file to be identified are traversed directly to obtain the content of the pages. Target elements are identified and extracted from that content, and vector image objects are identified based on the target elements.
According to the embodiment of the application, the content of the signature page is identified, so that the vector image object is extracted, and the identification efficiency of the handwritten signature is improved.
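Locating the signature page from its identification can be sketched as a text search over per-page text. The sketch assumes the text of each page has already been extracted into a list of strings; the function name `find_signature_pages` and the marker tuple are hypothetical, with the marker phrases borrowed from the examples above.

```python
# Marker phrases are illustrative; a real deployment would use the wording of its own bills.
SIGNATURE_MARKERS = ("principal signature", "sponsor signature", "sign here")


def find_signature_pages(page_texts, markers=SIGNATURE_MARKERS):
    """Return the zero-based indexes of pages whose text contains any marker."""
    hits = []
    for index, text in enumerate(page_texts):
        lowered = text.lower()
        if any(marker.lower() in lowered for marker in markers):
            hits.append(index)
    return hits


pages = ["Terms and conditions", "Service details", "Principal signature: ____"]
signature_pages = find_signature_pages(pages)
```

When no page matches, the caller can fall back to traversing every page, as in the alternative described above.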
Based on the above embodiment, identifying the content of the signature page based on the file format includes:
S121, parsing and rendering the signature page based on the file format to obtain the content of the signature page.
Parsing and rendering engines include PyMuPDF, MuPDF and the like. The embodiments of the present application are described with reference to PyMuPDF.
PyMuPDF is a powerful and flexible PDF processing library that can read, write and process PDF files. PyMuPDF can parse a PDF file and extract text, images, fonts and other objects.
PyMuPDF is used to traverse the PDF signature page and acquire the page content; the text and images in the page are detected, and the content of the signature page is thereby identified.
Optionally, PyMuPDF is used to traverse all pages of the file to be identified to obtain the page contents; the text and images in the pages are detected, and the content of the file to be identified is thereby identified.
According to the embodiment of the application, the content of the signature page is identified through the analysis and rendering engine, so that the identification efficiency of the handwritten signature is improved.
Based on the above embodiment, extracting a vector image object from the content of the signature page obtained by recognition includes:
S131, extracting target elements from the content of the signature page, and performing text processing on the target elements to obtain target texts;
S132, extracting the vector image object in the target text based on the characteristics of the vector image object.
The target element is a complete element of the stream (Stream) type. In a PDF file, a stream-type element is an object type for storing and transmitting data. Stream-type elements are used to store various types of data, such as text content, image data, font data, and the like.
The features of the vector image object include features of the storage location and structure of the vector image.
Listing the content of the identified signature page results in an identified content list. Optionally, the content of the identified file to be identified is listed to obtain an identified content list.
In the embodiment of the application, the identified content list comprises a cross-reference table (Cross-Reference Table, XRef). The XRef list records the location and number of each item of identified content. The complete stream-type elements are found from the XRef list. Text processing is performed on each external object (External Object, XObject) among the stream-type elements to obtain the target text, where an XObject is a type used to represent elements such as embedded images, forms and vector graphics. The instruction sequence in part of the bytes of the target text of each XObject, for example the first 40 bytes, is detected and analyzed to obtain the storage location and structure of the XObject. The storage location and structure of the analyzed XObject are then compared with those of a vector image object (handwritten signature) to determine whether the XObject is a vector image object.
For example, if the first 40 bytes of the target text of an XObject start with [ 1J/DeviceRGB CS 0.00 0.00 0.00SCN ] or [ q\nq\n00rg ], the graphics state operation instructions, the color space instructions, and the instructions for saving and restoring the graphics state of the XObject are looked up, and the storage location and structure of the XObject are analyzed based on these instructions to determine whether the XObject is a vector image object.
Electronic bills of various operator services that contain handwritten signatures are collected in advance, the handwritten signatures in them are analyzed, and the storage location and structure of the vector image object (handwritten signature) are summarized as the basis for identifying vector image objects.
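The prefix check can be sketched as follows. This is a rough illustration under stated assumptions: the operator set (the PDF operators `q`, `Q`, `J`, `w`, `CS`, `SCN`, `rg`, `RG`), the decision rule and the function names are hypothetical; the application only states that the instruction sequence in the first 40 bytes is inspected and does not give the exact rule.

```python
import re

# Graphics-state and colour operators whose presence at the start of a stream is
# treated here as a hint of vector drawing instructions (an assumption).
STATE_OPERATORS = {b"q", b"Q", b"J", b"w"}
COLOR_OPERATORS = {b"CS", b"SCN", b"rg", b"RG"}


def leading_operators(stream, window=40):
    """Return the known operators found in the first `window` bytes of a stream."""
    head = stream[:window]
    tokens = re.findall(rb"/?[A-Za-z*']+|[-+]?\d*\.?\d+", head)
    return [tok for tok in tokens if tok in STATE_OPERATORS | COLOR_OPERATORS]


def looks_like_vector_signature(stream, window=40):
    """Hypothetical rule: need a graphics-state operator and a colour operator."""
    ops = set(leading_operators(stream, window))
    return bool(ops & STATE_OPERATORS) and bool(ops & COLOR_OPERATORS)


vector_like = looks_like_vector_signature(b"q\nq\n0 0 0 rg\n10 10 m 20 20 l S\nQ\nQ")
image_like = looks_like_vector_signature(b"\xff\xd8\xff\xe0\x00\x10JFIF\x00\x01")
```

In practice the rule would be tuned against the collected sample bills, since the leading instructions depend on the software that produced the signature.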
Based on the above method, all vector image objects are extracted.
According to the embodiment of the application, the target element is extracted, and text processing is carried out on the target element, so that the vector image object is obtained, and the recognition efficiency of the handwritten signature is improved.
Based on the above embodiment, after drawing the path object and clipping the drawn path object to obtain the handwritten signature picture, the method further includes:
S330, converting the handwritten signature picture into a raster picture and storing the raster picture.
Raster pictures include bitmaps or pixel images, that is, images made up of pixels. A raster image is represented in matrix form, where each pixel has its own color value and position. The color value of each pixel represents the color and brightness information of the image at that location. Raster pictures may be saved in a variety of file formats, such as JPEG (Joint Photographic Experts Group, JPG), portable network graphics (Portable Network Graphics, PNG), and the like. Different file formats have different properties and uses; for example, JPG suits photographs, while PNG suits images that require transparency.
Saving the handwritten signature picture as PNG format: and saving path information in the handwritten signature picture, leaving out non-path information in the handwritten signature picture, and obtaining a transparent signature track diagram of the handwritten signature. The PNG format has lossless compression characteristics, can preserve the quality and detail of an image, and supports transparency.
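Saving only the path information against a transparent background can be illustrated with a small PNG writer built on the Python standard library. The sketch assumes the clipped signature is available as a boolean mask (True where the path was drawn); the function names `_chunk` and `mask_to_png` are hypothetical, and a production system would more likely rely on an imaging library.

```python
import struct
import zlib


def _chunk(kind, data):
    """Assemble one PNG chunk: length, type, data, CRC over type + data."""
    crc = zlib.crc32(kind + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + kind + data + struct.pack(">I", crc)


def mask_to_png(mask):
    """Encode a rectangular boolean mask as an 8-bit RGBA PNG with a transparent background."""
    height, width = len(mask), len(mask[0])
    raw = bytearray()
    for row in mask:
        raw.append(0)  # filter type 0 (no filtering) for this scanline
        for on_path in row:
            # Path pixels are opaque black; everything else is fully transparent.
            raw += b"\x00\x00\x00\xff" if on_path else b"\x00\x00\x00\x00"
    header = struct.pack(">IIBBBBB", width, height, 8, 6, 0, 0, 0)
    return (b"\x89PNG\r\n\x1a\n"
            + _chunk(b"IHDR", header)
            + _chunk(b"IDAT", zlib.compress(bytes(raw)))
            + _chunk(b"IEND", b""))


png_bytes = mask_to_png([[True, False], [False, True]])
```

The resulting bytes can be written to a file opened in binary mode; the compression used by PNG is lossless, so the stroke pixels are preserved exactly.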
Optionally, the signature trace map in PNG format is saved as JPG format.
Optionally, the handwritten signature picture is directly saved as JPG format.
In the embodiment of the application, the handwritten signature picture is first saved in PNG format and then converted to JPG format. The PNG stage preserves quality, transparency, and the sharpness of lines and characters, and the JPG conversion allows the file size and compression rate to be controlled, which further improves the accuracy of handwritten signature recognition.
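Because baseline JPG files do not carry an alpha channel, a transparent PNG has to be composited onto an opaque background before it is saved as JPG. The following sketch shows that compositing step only; the white background and the function name `flatten_on_white` are assumptions made for this example, and the actual JPG encoding would be left to an imaging library.

```python
def flatten_on_white(rgba_pixels):
    """Composite RGBA pixels over opaque white and return RGB tuples."""
    rgb_pixels = []
    for r, g, b, a in rgba_pixels:
        alpha = a / 255.0
        # Standard "over" blend of the pixel color onto a white background.
        rgb_pixels.append(tuple(
            round(channel * alpha + 255 * (1.0 - alpha))
            for channel in (r, g, b)
        ))
    return rgb_pixels


# Usage: a transparent pixel becomes white, an opaque ink pixel stays black.
flattened = flatten_on_white([(0, 0, 0, 0), (0, 0, 0, 255)])
```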
To further explain the handwritten signature recognition method provided by the embodiment of the application, a specific embodiment is described below:
Step 1: inputting a PDF file, traversing the content of each page of the PDF file, and finding the signature page based on the signature page identification;
Step 2: loading the content of the PDF signature page by using PyMuPDF to obtain the XRef list of the signature page;
Step 3: traversing the XRef list; for each XRef index in the XRef list, looking up the complete element of stream type, performing text processing on the XObject object in the stream-type element to obtain a target text, and detecting and analyzing the instruction sequence in the first 40 bytes of the target text of each XObject object; judging each XObject object against the characteristics of a vector image object, and thereby finding the vector image object;
Step 4: analyzing the path elements of the vector image object, and constructing the path elements of the vector image by using path construction operators to obtain the path object; the path construction operators specifically include a set-line-width instruction, a draw-straight-line instruction, and a draw-Bézier-curve instruction;
Step 5: drawing the path object: stroking and filling the path object on the canvas by using a path drawing operator;
Step 6: clipping the path object: cropping the drawn path object by using a path clipping operator to obtain a handwritten signature picture;
Step 7: saving the handwritten signature picture in PNG format: only the path information is stored and the non-path information is left blank, which yields a transparent signature trace map;
Step 8: saving the handwritten signature picture in PNG format as JPG format.
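Step 4 above can be sketched as a small interpreter that turns the text of a content stream into a path object. The sketch assumes the standard PDF operators `w` (set line width), `m` (move to), `l` (straight line), and `c` (cubic Bézier curve); the class `PathObject` and the function `build_path` are hypothetical names, and all other operators are simply ignored.

```python
from dataclasses import dataclass, field


@dataclass
class PathObject:
    line_width: float = 1.0
    # Each entry is ("line", p0, p1) or ("curve", p0, p1, p2, p3).
    segments: list = field(default_factory=list)


def build_path(content: str) -> PathObject:
    """Build a path object from whitespace-separated operands and operators."""
    path = PathObject()
    operands = []
    current = None  # current point; set by "m" and advanced by "l" and "c"
    for token in content.split():
        try:
            operands.append(float(token))
            continue  # numeric operand: keep collecting
        except ValueError:
            pass      # not a number, so treat the token as an operator
        if token == "w" and len(operands) >= 1:
            path.line_width = operands[-1]
        elif token == "m" and len(operands) >= 2:
            current = (operands[-2], operands[-1])
        elif token == "l" and len(operands) >= 2 and current is not None:
            end = (operands[-2], operands[-1])
            path.segments.append(("line", current, end))
            current = end
        elif token == "c" and len(operands) >= 6 and current is not None:
            o = operands[-6:]
            p1, p2, p3 = (o[0], o[1]), (o[2], o[3]), (o[4], o[5])
            path.segments.append(("curve", current, p1, p2, p3))
            current = p3
        operands.clear()  # operands belong to the operator just handled
    return path


# Usage: line width 2, one straight segment, then one Bezier segment.
demo_path = build_path("2 w 0 0 m 10 0 l 10 0 20 10 30 0 c S")
```

A production parser would also need to handle graphics-state operators, coordinate transforms, and the remaining path operators, which this sketch leaves out.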
In the embodiment of the application, the file structure of operator-service electronic documents in PDF format is analyzed. For a handwritten customer signature, the corresponding graphic element of vector path type in the PDF is located, the vector path element is extracted directly, and the vector path is then redrawn and stored as a raster picture. The greatest advantage of the embodiment is that the signature layer is obtained directly, free of interference: even when the signature background carries a watermark, or when text or other interfering content overlaps the signature in part of the document, the extracted signature picture is highly resistant to interference, and the accuracy of handwritten signature recognition is greatly improved.
The handwritten signature recognition device provided by the embodiment of the application is described below; the device described below and the handwritten signature recognition method described above may be referred to in correspondence with each other. Referring to fig. 2, fig. 2 is a schematic structural diagram of a handwritten signature recognition device according to an embodiment of the application. The handwritten signature recognition device comprises:
an extracting module 201, configured to extract a vector image object of a file to be identified based on a file format of the file to be identified, where the vector image object includes feature information of a handwritten signature;
a construction module 202 for constructing a path object based on the vector image object, the path object being used to characterize a path or contour of the image;
and the obtaining module 203 is configured to draw the path object, and crop the drawn path object to obtain a handwritten signature picture.
The handwritten signature recognition device provided by the embodiment of the application extracts the vector image object of the file to be recognized based on the file format of that file, the vector image object including the characteristic information of the handwritten signature; constructs a path object based on the vector image object, the path object being used to represent a path or contour of the image; and draws the path object and crops the drawn path object to obtain a handwritten signature picture. Because the handwritten signature picture is obtained from a path object that is constructed, drawn, and cropped from the extracted vector image object, the picture is highly resistant to interference even when the signature background carries a watermark or when text or other interfering content overlaps the signature in part of the document, which improves both auditing accuracy and handwritten signature recognition accuracy.
In one embodiment, the building module 202 is to: and constructing a path object based on the characteristic information of the handwritten signature of the vector image object and at least one path construction operator.
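One detail of path construction that benefits from an example is the Bézier curve segment: before a curve can be drawn on a pixel canvas it is commonly approximated by short straight segments. The sketch below evaluates the cubic Bernstein form at evenly spaced parameter values; the function names and the default of 16 steps are assumptions made for this example.

```python
def cubic_bezier_point(p0, p1, p2, p3, t):
    """Evaluate a cubic Bezier curve at parameter t in [0, 1]."""
    u = 1.0 - t
    # Bernstein weights for the four control points.
    w0, w1, w2, w3 = u * u * u, 3 * u * u * t, 3 * u * t * t, t * t * t
    x = w0 * p0[0] + w1 * p1[0] + w2 * p2[0] + w3 * p3[0]
    y = w0 * p0[1] + w1 * p1[1] + w2 * p2[1] + w3 * p3[1]
    return (x, y)


def flatten_curve(p0, p1, p2, p3, steps=16):
    """Approximate the curve by steps + 1 points joined by straight lines."""
    return [cubic_bezier_point(p0, p1, p2, p3, i / steps)
            for i in range(steps + 1)]


# Usage: a symmetric arch sampled at five points.
arch = flatten_curve((0, 0), (0, 10), (10, 10), (10, 0), steps=4)
```

More steps give a smoother stroke at the cost of more segments to draw; an adaptive subdivision would be a natural refinement.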
In one embodiment, the obtaining module 203 is configured to: stroke and fill the path object by using a path drawing operator, so as to draw the path object on a canvas; and intersect the drawn path object with a clipping region by using a path clipping operator, so as to clip away the part of the path object outside the clipping region and obtain a handwritten signature picture.
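The intersection with a clipping region can be illustrated on an already rasterized canvas. This is a simplified sketch, not the path clipping operator itself: it assumes a rectangular clipping region, represents the canvas as a list of rows in which non-zero values are ink, and uses the hypothetical helper names `intersect`, `ink_bounds`, and `clip_signature`.

```python
def intersect(a, b):
    """Intersect rectangles (x0, y0, x1, y1); return None if they are disjoint."""
    x0, y0 = max(a[0], b[0]), max(a[1], b[1])
    x1, y1 = min(a[2], b[2]), min(a[3], b[3])
    return (x0, y0, x1, y1) if x0 < x1 and y0 < y1 else None


def ink_bounds(canvas):
    """Half-open bounding box of all non-zero pixels, or None if blank."""
    points = [(x, y) for y, row in enumerate(canvas)
              for x, value in enumerate(row) if value]
    if not points:
        return None
    xs, ys = zip(*points)
    return (min(xs), min(ys), max(xs) + 1, max(ys) + 1)


def clip_signature(canvas, clip_rect):
    """Keep only the drawn pixels that fall inside the clipping rectangle."""
    bounds = ink_bounds(canvas)
    region = intersect(bounds, clip_rect) if bounds else None
    if region is None:
        return []
    x0, y0, x1, y1 = region
    return [row[x0:x1] for row in canvas[y0:y1]]


# Usage: a 4x4 canvas with a small stroke, clipped to the left three columns.
canvas = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 1, 1],
    [0, 0, 0, 0],
]
clipped = clip_signature(canvas, (0, 0, 3, 4))
```

Clipping to the ink bounding box as well as the clipping rectangle trims empty margins, so the resulting picture contains little more than the signature strokes.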
In one embodiment, the extraction module 201 is to: determining a signature page of the file to be identified based on the signature page identification; identifying the content of the signature page based on the file format; and extracting the vector image object from the content of the signature page obtained by recognition.
In one embodiment, the extraction module 201 is to: and analyzing and rendering the signature page based on the file format to obtain the content of the signature page.
In one embodiment, the extraction module 201 is to: extracting target elements from the content of the signature page, and performing text processing on the target elements to obtain target texts; based on the characteristics of the vector image object, the vector image object in the target text is extracted.
In one embodiment, the obtaining module 203 is further configured to: convert the handwritten signature picture into a raster picture and store the raster picture.
Fig. 3 illustrates a physical schematic diagram of an electronic device, as shown in fig. 3, where the electronic device may include: processor 310, communication interface (Communication Interface) 320, memory 330 and communication bus 340, wherein processor 310, communication interface 320, memory 330 accomplish communication with each other through communication bus 340. Processor 310 may invoke a computer program in memory 330 to perform the steps of a handwritten signature recognition method, including, for example:
extracting a vector image object of the file to be identified based on the file format of the file to be identified, wherein the vector image object comprises characteristic information of a handwritten signature; constructing a path object based on the vector image object, wherein the path object is used for representing a path or a contour of the image; and drawing the path object, and cutting the drawn path object to obtain a handwritten signature picture.
Further, the logic instructions in the memory 330 described above may be implemented in the form of software functional units and, when sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present application, in essence or in the part that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product stored in a storage medium, comprising several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the method according to the embodiments of the present application. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, or other various media capable of storing program code.
In another aspect, embodiments of the present application further provide a computer program product, where the computer program product includes a computer program that may be stored on a non-transitory computer readable storage medium and that, when executed by a processor, is capable of executing the steps of the handwritten signature recognition method provided in the foregoing embodiments, the method including, for example:
extracting a vector image object of the file to be identified based on the file format of the file to be identified, wherein the vector image object comprises characteristic information of a handwritten signature; constructing a path object based on the vector image object, wherein the path object is used for representing a path or a contour of the image; and drawing the path object, and cutting the drawn path object to obtain a handwritten signature picture.
In another aspect, an embodiment of the present application further provides a processor readable storage medium storing a computer program, where the computer program is configured to cause a processor to execute the steps of the handwritten signature recognition method provided in the foregoing embodiments, including, for example:
extracting a vector image object of the file to be identified based on the file format of the file to be identified, wherein the vector image object comprises characteristic information of a handwritten signature; constructing a path object based on the vector image object, wherein the path object is used for representing a path or a contour of the image; and drawing the path object, and cutting the drawn path object to obtain a handwritten signature picture.
The processor-readable storage medium may be any available medium or data storage device that can be accessed by a processor, including, but not limited to, magnetic storage (e.g., floppy disks, hard disks, magnetic tape, magneto-optical disks (MOs), etc.), optical storage (e.g., CD, DVD, BD, HVD, etc.), and semiconductor storage (e.g., ROM, EPROM, EEPROM, nonvolatile storage (NAND FLASH), solid state disk (SSD), etc.).
The apparatus embodiments described above are merely illustrative. The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; they may be located in one place or distributed over a plurality of network elements. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art can understand and implement the embodiments without creative effort.
From the above description of the embodiments, it will be apparent to those skilled in the art that the embodiments may be implemented by means of software plus necessary general hardware platforms, or of course may be implemented by means of hardware. Based on this understanding, the foregoing technical solution may be embodied essentially or in a part contributing to the prior art in the form of a software product, which may be stored in a computer readable storage medium, such as ROM/RAM, a magnetic disk, an optical disk, etc., including several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the method described in the respective embodiments or some parts of the embodiments.
Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present application, and are not limiting; although the application has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present application.
Claims (10)
1. A handwritten signature recognition method, comprising:
extracting a vector image object of a file to be identified based on a file format of the file to be identified, wherein the vector image object comprises characteristic information of a handwritten signature;
constructing a path object based on the vector image object, wherein the path object is used for representing a path or a contour of an image;
and drawing the path object, and cutting the drawn path object to obtain a handwritten signature picture.
2. The handwritten signature recognition method according to claim 1, wherein said constructing a path object based on said vector image object comprises:
and constructing the path object based on the characteristic information of the handwritten signature of the vector image object and at least one path construction operator.
3. The method for recognizing a handwritten signature according to claim 1, wherein the drawing the path object, cutting the drawn path object to obtain a handwritten signature picture, includes:
drawing and filling the path object by adopting a path drawing operator so as to draw the path object on a canvas;
and adopting a path clipping operator to intersect the drawn path object with a clipping region so as to clip away the part of the path object outside the clipping region, thereby obtaining the handwritten signature picture.
4. The handwritten signature recognition method according to claim 1, wherein the extracting a vector image object of a file to be recognized based on a file format of the file to be recognized includes:
determining a signature page of the file to be identified based on the signature page identification;
identifying content of the signature page based on the file format;
and extracting the vector image object from the content of the signature page obtained by recognition.
5. The handwritten signature recognition method according to claim 4, wherein the recognition of the content of the signature page based on the file format includes:
and analyzing and rendering the signature page based on the file format to obtain the content of the signature page.
6. The handwritten signature recognition method according to claim 4, wherein said extracting the vector image object from the content of the signature page obtained by the recognition includes:
extracting target elements from the content of the signature page, and performing text processing on the target elements to obtain target texts;
extracting the vector image object in the target text based on the characteristics of the vector image object.
7. The handwritten signature recognition method according to claim 1, wherein after the drawing the path object and cutting the drawn path object to obtain a handwritten signature picture, the method further comprises:
converting the handwritten signature picture into a raster picture, and storing the raster picture.
8. A handwritten signature recognition apparatus, comprising:
the extraction module is used for extracting a vector image object of the file to be identified based on the file format of the file to be identified, wherein the vector image object comprises characteristic information of a handwritten signature;
a construction module for constructing a path object based on the vector image object, the path object being used for characterizing a path or a contour of an image;
and the obtaining module is used for drawing the path object, cutting the drawn path object, and obtaining a handwritten signature picture.
9. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the steps of the handwritten signature recognition method according to any one of claims 1 to 7 when the program is executed.
10. A computer program product comprising a computer program, characterized in that the computer program, when being executed by a processor, implements the steps of the handwritten signature recognition method as claimed in any one of claims 1 to 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310864798.3A CN116959004A (en) | 2023-07-14 | 2023-07-14 | Handwritten signature recognition method, handwritten signature recognition device, electronic equipment and computer program product |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310864798.3A CN116959004A (en) | 2023-07-14 | 2023-07-14 | Handwritten signature recognition method, handwritten signature recognition device, electronic equipment and computer program product |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116959004A true CN116959004A (en) | 2023-10-27 |
Family
ID=88450511
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310864798.3A Pending CN116959004A (en) | 2023-07-14 | 2023-07-14 | Handwritten signature recognition method, handwritten signature recognition device, electronic equipment and computer program product |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116959004A (en) |
2023
- 2023-07-14 CN CN202310864798.3A patent/CN116959004A/en active Pending
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111507251B (en) | Method and device for positioning answering area in test question image, electronic equipment and computer storage medium | |
JP4829920B2 (en) | Form automatic embedding method and apparatus, graphical user interface apparatus | |
JP4676225B2 (en) | Method and apparatus for capturing electronic forms from scanned documents | |
US8718364B2 (en) | Apparatus and method for digitizing documents with extracted region data | |
US9613267B2 (en) | Method and system of extracting label:value data from a document | |
JP4347677B2 (en) | Form OCR program, method and apparatus | |
US8693790B2 (en) | Form template definition method and form template definition apparatus | |
US8213717B2 (en) | Document processing apparatus, document processing method, recording medium and data signal | |
US6614929B1 (en) | Apparatus and method of detecting character writing area in document, and document format generating apparatus | |
US9558433B2 (en) | Image processing apparatus generating partially erased image data and supplementary data supplementing partially erased image data | |
CN110796145B (en) | Multi-certificate segmentation association method and related equipment based on intelligent decision | |
CN107679442A (en) | Method, apparatus, computer equipment and the storage medium of document Data Enter | |
CN112560849B (en) | Neural network algorithm-based grammar segmentation method and system | |
CN104809099A (en) | Document file generating device and document file generation method | |
CN113806472A (en) | Method and equipment for realizing full-text retrieval of character, picture and image type scanning piece | |
CN113936187A (en) | Text image synthesis method and device, storage medium and electronic equipment | |
CN112365402B (en) | Intelligent winding method and device, storage medium and electronic equipment | |
KR102598210B1 (en) | Drawing information recognition method of engineering drawings, drawing information recognition system, computer program therefor | |
JPH10171922A (en) | Ruled line eraser and recording medium | |
CN116959004A (en) | Handwritten signature recognition method, handwritten signature recognition device, electronic equipment and computer program product | |
JP4347675B2 (en) | Form OCR program, method and apparatus | |
CN110909723B (en) | Information processing apparatus and computer-readable storage medium | |
CN113378526A (en) | PDF paragraph processing method, device, storage medium and equipment | |
CN112949514A (en) | Scanned document information processing method and device, electronic equipment and storage medium | |
JP5528410B2 (en) | Viewer device, server device, display control method, electronic comic editing method and program |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||