CN115130437A - Intelligent document filling method and device and storage medium - Google Patents

Intelligent document filling method and device and storage medium

Info

Publication number
CN115130437A
Authority
CN
China
Prior art keywords
slot position
slot
document
document file
filling
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202211050882.3A
Other languages
Chinese (zh)
Other versions
CN115130437B (en)
Inventor
王加伟
杜向阳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing Aegis Information Technology Co ltd
Original Assignee
Nanjing Aegis Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing Aegis Information Technology Co ltd
Priority to CN202211050882.3A
Publication of CN115130437A
Application granted
Publication of CN115130437B
Status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/166 Editing, e.g. inserting or deleting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition
    • G06V30/14 Image acquisition
    • G06V30/1444 Selective acquisition, locating or processing of specific regions, e.g. highlighted text, fiducial marks or predetermined fields

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Machine Translation (AREA)
  • Character Input (AREA)

Abstract

The embodiments of the application disclose an intelligent document filling method, device, and storage medium. The method comprises: acquiring a document file to be filled, wherein the type of the document file comprises a DOCX document format, a picture format and/or a PDF document format; locating the slots in the document file, and acquiring slot context information and slot position information; generating a slot tag for each slot according to its slot context information; and acquiring the filling content corresponding to each slot tag, and restoring the filling content to the document file based on the slot position information of the corresponding slot.

Description

Intelligent document filling method and device and storage medium
Technical Field
The application relates to the technical field of natural language processing, and in particular to an intelligent document filling method, an intelligent document filling device, and a storage medium.
Background
In recent years, artificial intelligence technology has developed rapidly and has gradually been integrated into many fields such as finance, the judiciary, and education. Contract processing sits at the intersection of the judicial and financial fields: a single contract often carries key information about processes such as commercial transactions and personnel changes, and legal staff drafting contracts face pressure no less than that of the business side. They must draft dozens of contracts at the same time, understand the transaction logic, know the relevant laws and regulations, maintain drafting speed, prevent errors and omissions, and strictly control risk, all of which places a heavy burden on them.
Existing contract drafting systems mainly focus on generating blank contracts or contract templates, and mainly parse the fixed, common contracts in a template library. They cannot handle the non-fixed templates that occur in real scenarios. For example, when a contract is drafted between a first party and a second party, the blank contract template may be issued by the first party, the second party, or even a third party. If batch drafting is then required, the content to fill into each slot can only be determined by legal staff reading and carefully reviewing the contract. Moreover, contracts may come in various formats, such as docx, or pdf and pictures that cannot be edited directly; in the latter case legal staff must retype the whole contract to turn it into an editable contract template.
Therefore, prior-art contract drafting systems can only parse text in a limited, fixed contract library. They cannot parse and draft contract templates in an open scenario, nor handle image and pdf contracts in a multi-modal scenario, which results in low contract drafting efficiency and high labor cost for legal staff.
Disclosure of Invention
An object of the embodiments of the present application is to provide an intelligent document filling method, device, and storage medium that solve the following problems: prior-art contract drafting systems can only parse text in a limited, fixed contract library, cannot parse and draft contract templates in an open scenario, and cannot handle image and pdf contracts in a multi-modal scenario, which results in low contract drafting efficiency and high labor cost for legal staff.
In order to achieve the above object, an embodiment of the present application provides an intelligent document filling method, including: acquiring a document file to be filled, wherein the type of the document file comprises a DOCX document format, a picture format and/or a PDF document format;
positioning a slot position in the document file, and acquiring slot position context information and slot position information;
generating a slot position label corresponding to the slot position according to the slot position context information;
and acquiring filling contents corresponding to the slot position tags based on the slot position tags corresponding to the slot positions, and restoring the filling contents corresponding to the slot positions to the document file based on the slot position information corresponding to the slot positions.
Optionally, when the type of the document file is a DOCX document format, the method for locating the slot in the document file includes:
analyzing the slot position in the document file through a regular expression, and determining the corresponding slot position information through the preceding text and the following text of the slot position.
Optionally, when the type of the document file is in a picture format or a PDF document format, the method for locating the slot in the document file includes:
obtaining the picture of a document file in a picture format, or the picture obtained by converting a document file in a PDF document format;
carrying out graying processing on the picture;
performing an opening operation on the picture, that is, erosion followed by dilation, to extract the underlines of the slot positions;
performing a re-dilation operation on the picture, wherein the re-dilation operation expands the outlines in the picture;
and detecting and extracting the underlines of the slot positions in the picture by using the Hough transform, and obtaining the coordinates of each slot position in the picture to obtain the slot position information.
Optionally, after obtaining the coordinates of each slot in the picture, the method further includes:
and acquiring the coordinates of each character in the document file by using OCR (optical character recognition) and, based on those coordinates, taking the character closest to the lower-left corner of the slot position as the preceding text of the slot position and the character closest to the upper-right corner of the slot position as the following text of the slot position, thereby obtaining the context information of the slot position.
Optionally, the method for generating the slot tag corresponding to the slot includes:
and generating the slot position label according to the slot position contextual information by utilizing a text generation model.
Optionally, the method for generating the slot tag corresponding to the slot includes:
generating a corresponding prompt mask at the slot position according to the contextual information of the slot position by using a prompt learning method based on a pre-trained language model;
and predicting and generating the slot position label from the prompt mask by using the pre-trained language model.
Optionally, the method for obtaining the filling content corresponding to the slot tag includes:
and integrating the slot position and the corresponding slot position label into a form, sending the form to a user, acquiring information input by the user, and obtaining the filling content corresponding to the slot position label.
Optionally, the method for restoring the filling content corresponding to the slot into the document file includes:
for the document file in the DOCX document format, directly replacing the blank of the corresponding slot position with the filling content;
and for the document files in the picture format and the PDF document format, overlaying an image of the filling content on the blank of the corresponding slot position as a covering layer.
In order to achieve the above object, the present application further provides an intelligent document filling device, including: a memory; and
a processor coupled to the memory, the processor configured to:
acquiring a document file to be filled, wherein the type of the document file comprises a DOCX document format, a picture format and/or a PDF document format;
positioning a slot position in the document file, and acquiring slot position context information and slot position information;
generating a slot position label corresponding to the slot position according to the slot position context information;
and acquiring filling contents corresponding to the slot position tags based on the slot position tags corresponding to the slot positions, and restoring the filling contents corresponding to the slot positions to the document file based on the slot position information corresponding to the slot positions.
To achieve the above object, the present application also provides a computer storage medium having a computer program stored thereon, wherein the computer program, when executed by a machine, implements the steps of the method as described above.
The embodiment of the application has the following advantages:
1. The embodiment of the application provides an intelligent document filling method, which comprises the following steps: acquiring a document file to be filled, wherein the type of the document file comprises a DOCX document format, a picture format and/or a PDF document format; positioning a slot position in the document file, and acquiring slot position context information and slot position information; generating a slot position label corresponding to the slot position according to the contextual information of the slot position; and acquiring filling contents corresponding to the slot position tags based on the slot position tags corresponding to the slot positions, and restoring the filling contents corresponding to the slot positions to the document file based on the slot position information corresponding to the slot positions.
By this method, documents in any carrier format and any domain can be filled intelligently. This solves the problem of drafting image and pdf contracts in a multi-modal scenario, improves contract drafting efficiency, and greatly reduces the labor cost of legal staff drafting contracts.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below. It should be apparent that the drawings in the following description are merely exemplary, and that other embodiments can be derived from the drawings provided by those of ordinary skill in the art without inventive effort.
Fig. 1 is a flowchart of an intelligent document filling method according to an embodiment of the present disclosure;
fig. 2a is a schematic view illustrating an effect of performing graying processing in a method for positioning a slot in an intelligent document filling method according to an embodiment of the present application;
fig. 2b is a schematic diagram illustrating an effect of performing an opening operation process in a method for positioning a slot in an intelligent document filling method according to an embodiment of the present application;
fig. 2c is a schematic diagram illustrating an effect of the re-dilation operation in a method for positioning the slot in the intelligent document filling method according to the embodiment of the present disclosure;
fig. 2d is a schematic diagram illustrating an effect of Hough transform processing in a method for positioning the slot in the intelligent document filling method according to the embodiment of the present disclosure;
fig. 3 is a schematic model diagram of a method for intelligently filling in a document according to an embodiment of the present application, where a slot tag corresponding to a slot is generated;
fig. 4 is a block diagram of modules of an intelligent document filling device according to an embodiment of the present disclosure.
Detailed Description
The present disclosure is not intended to be limited to the particular embodiments shown herein, but is to be accorded the widest scope consistent with the principles and novel features disclosed herein. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
In addition, the technical features mentioned in the different embodiments of the present application described below may be combined with each other as long as they do not conflict with each other.
An embodiment of the present application provides a document intelligent filling method, and referring to fig. 1, fig. 1 is a flowchart of a document intelligent filling method provided in an embodiment of the present application, it should be understood that the method may further include additional blocks not shown and/or may omit the shown blocks, and the scope of the present application is not limited in this respect.
In the embodiments of the present application, the filling of the contract template with the slot to be filled is taken as an example to describe the scheme of the present application, and it should be understood that the document file may also be other types of files with slots to be filled, and the scheme of the present application is also applicable.
At step 101, a document file to be filled in is obtained, wherein the type of the document file comprises a DOCX document format, a picture format and/or a PDF document format.
Specifically, in the present embodiment, this step reads the contract template. Because contract templates come in various carrier forms, the type of the contract template file must be determined first. For an editable docx file, the corresponding XML document is parsed to obtain the full text content and the XML tags corresponding to the slots (since the 2007 Microsoft Office system, Microsoft Office has used an XML-based file format, which is more extensible and lets developers read and modify the underlying markup). For a non-editable pdf or picture file, the structural information and text content of the original Word document cannot be obtained directly, so an OCR algorithm is needed to recognize the characters and an image recognition algorithm to identify all slots in the text. Other types of document files can be converted into a DOCX document format, a picture format and/or a PDF document format and then processed by the present scheme.
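The type check in this step could be sketched as follows. This is a minimal illustration rather than the implementation of the application: the function name `classify_document`, the list of picture extensions, and the route names are all assumptions, and a real system would probably inspect the file contents instead of trusting the extension.

```python
from pathlib import Path

# Hypothetical extension table; the source only names DOCX, picture, and PDF formats.
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".bmp", ".tif", ".tiff"}

def classify_document(path):
    """Return the processing route for a document file, judged by its extension."""
    ext = Path(path).suffix.lower()
    if ext == ".docx":
        return "docx"     # editable: parse the underlying XML
    if ext == ".pdf":
        return "pdf"      # not editable: render pages to pictures, then OCR
    if ext in IMAGE_EXTENSIONS:
        return "image"    # not editable: OCR plus image-based slot detection
    return "convert"      # other types: convert to DOCX, picture, or PDF first

print(classify_document("lease_contract.DOCX"))
```

The "convert" route corresponds to the last sentence of the paragraph above, where other file types are first converted into one of the three supported formats.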
At step 102, a slot position in the document file is located, and slot position context information and slot position information are obtained.
In some embodiments, when the type of the document file is a DOCX document format, the method of locating the slot in the document file includes: analyzing the slot position in the document file through a regular expression, and determining the corresponding slot position information through the preceding text and the following text of the slot position.
In some embodiments, when the type of the document file is a picture format or a PDF document format, the method of locating the slot in the document file includes: obtaining the picture of a document file in a picture format, or the picture obtained by converting a document file in a PDF document format; carrying out graying processing on the picture; performing an opening operation on the picture, that is, erosion followed by dilation, to extract the underlines of the slot positions; performing a re-dilation operation on the picture, wherein the re-dilation operation expands the outlines in the picture; and detecting and extracting the underlines of the slot positions in the picture by using the Hough transform, and obtaining the coordinates of each slot position in the picture to obtain the slot position information.
In some embodiments, obtaining the coordinates of each slot in the picture further comprises:
and acquiring the coordinates of each character in the document file by using OCR (optical character recognition) and, based on those coordinates, taking the character closest to the lower-left corner of the slot position as the preceding text of the slot position and the character closest to the upper-right corner of the slot position as the following text of the slot position, thereby obtaining the context information of the slot position.
Specifically, based on the contract template read in the preceding step, for a docx file the slots can be parsed with a regular expression. Each slot is located by its preceding text and following text, which together uniquely identify it; this yields the position information of the slot and the corresponding context information.
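A minimal sketch of regular-expression slot location is shown below. It works on plain text that has already been extracted from the docx file, and it assumes that a slot appears as a run of two or more underscores; the pattern, the function name `locate_slots`, and the `window` parameter are illustrative assumptions, not details given in the application.

```python
import re

# Assumption: a slot is a run of two or more underscores in the extracted text.
SLOT_PATTERN = re.compile(r"_{2,}")

def locate_slots(text, window=20):
    """Return each slot's character span plus up to `window` characters of context."""
    slots = []
    for match in SLOT_PATTERN.finditer(text):
        start, end = match.span()
        slots.append({
            "span": (start, end),                             # slot position information
            "preceding": text[max(0, start - window):start],  # text before the slot
            "following": text[end:end + window],              # text after the slot
        })
    return slots

clause = "Party A rents the house for ____ years, starting on ____."
for slot in locate_slots(clause):
    print(slot)
```

The character span plays the role of the slot position information, and the two context strings are what the later tag generation step consumes.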
For pdf or picture files, processing is based on OpenCV. OpenCV (Open Source Computer Vision Library) is an open-source computer vision library that provides many functions implementing computer vision algorithms efficiently, from basic filtering to advanced object detection. For a pdf, each page is first read and saved as a picture; the pictures are then morphologically processed with OpenCV, as shown in fig. 2a to 2d. The main steps are:
a. Graying: graying is the process of making the R, G, and B components of a color image equal. Since the contract document is black and white and the slot locating task does not depend on color, the 3-dimensional RGB matrix of the original color image can be reduced to a 2-dimensional matrix (img[height][width][channel] -> img[height][width]), which simplifies subsequent processing. The result is shown in fig. 2a.
b. Opening operation: an opening is erosion followed by dilation. It is mainly used to extract horizontal or vertical lines; it removes small objects and smooths the boundaries of larger objects without significantly changing their area. Based on the morphology of the characters and slots in a contract, the kernel size of the opening operation is set to 60 × 1. The result is shown in fig. 2b: the slot underlines are accurately extracted and all characters outside the slots are removed.
c. Re-dilation: the slot lines obtained by the opening operation in step b may be too thin and faint in places, which can prevent the line detection in step d from locating them accurately. The result of step b is therefore dilated again, as shown in fig. 2c. Dilation expands the outlines in the image; its formula is:
[Equation image: dilation formula]
d. Hough transform: the Hough transform is a feature extraction technique in image processing that detects objects of a particular shape by a voting algorithm. It computes local maxima of the accumulated votes in a parameter space and takes the corresponding set of shapes as the result. A straight line can be written as y = kx + b in a rectangular coordinate system. The main idea of the Hough transform is to exchange the roles of parameters and variables, that is, to treat x and y as known quantities and k and b as variable coordinates. The line y = kx + b in the rectangular coordinate system then becomes a point (k, b) in the parameter space, and a point (x1, y1) becomes the line y1 = x1 · k + b in the parameter space, where (k, b) is any point on that line. For ease of calculation, the parameter space is expressed in polar coordinates γ and θ. Because all points on the same straight line correspond to the same (γ, θ), edge detection can first be performed on the image, and each non-zero pixel is then converted into a curve in the parameter space; the curves of points lying on the same straight line in the rectangular coordinate system intersect at one point in the parameter space. This principle is used for line detection, and the detection result is shown in fig. 2d.
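Steps b to d can be illustrated on a toy binary grid in pure Python. This is a sketch under several assumptions and not the OpenCV-based processing of the application: the image is already binarized (1 = ink), the opening kernel is 6 pixels wide instead of 60 so that it fits the toy image, edge detection is skipped, and all function names are hypothetical. Dilation here sets a pixel when any pixel under the kernel is set, which is the usual definition for binary images, and the Hough vote uses the normal form γ = x·cos θ + y·sin θ.

```python
import math
from collections import Counter

def erode_h(img, k):
    """Erosion with a 1 x k horizontal kernel anchored at its left end."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w - k + 1):
            if all(img[y][x + i] for i in range(k)):
                out[y][x] = 1
    return out

def dilate_h(img, k):
    """Dilation with the same kernel; restores the runs that survived erosion."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if img[y][x]:
                for i in range(k):
                    if x + i < w:
                        out[y][x + i] = 1
    return out

def dilate_square(img, r=1):
    """Dilation with a (2r+1) x (2r+1) square, used to thicken thin lines."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if img[y][x]:
                for dy in range(-r, r + 1):
                    for dx in range(-r, r + 1):
                        if 0 <= y + dy < h and 0 <= x + dx < w:
                            out[y + dy][x + dx] = 1
    return out

def hough_votes(img, n_theta=180):
    """Vote in (gamma, theta) space; with n_theta=180 the index t is in degrees."""
    votes = Counter()
    for y, row in enumerate(img):
        for x, value in enumerate(row):
            if value:
                for t in range(n_theta):
                    theta = math.pi * t / n_theta
                    gamma = round(x * math.cos(theta) + y * math.sin(theta))
                    votes[(gamma, t)] += 1
    return votes

# Toy page: row 1 holds short strokes standing in for characters,
# row 4 holds a 10-pixel underline standing in for a slot.
WIDTH, HEIGHT, KERNEL = 16, 7, 6
page = [[0] * WIDTH for _ in range(HEIGHT)]
for x in (1, 2, 5, 6, 9):
    page[1][x] = 1
for x in range(3, 13):
    page[4][x] = 1

opened = dilate_h(erode_h(page, KERNEL), KERNEL)   # step b: opening
thick = dilate_square(opened)                      # step c: re-dilation
votes = hough_votes(thick)                         # step d: Hough voting
# Horizontal lines have theta = 90 degrees, so gamma equals the row index.
rows = sorted(g for (g, t), n in votes.items() if t == 90 and n >= KERNEL)
print(rows)
```

The opening erases the short strokes in row 1 because no run there is as wide as the kernel, while the underline survives. After re-dilation the underline is three pixels thick, so the horizontal Hough bins report rows 3, 4, and 5.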
Through the above operations, the coordinates (x1, y1, x2, y2) of each slot in the picture can be located, where (x1, y1) are the coordinates of the lower-left corner of the slot and (x2, y2) are the coordinates of the upper-right corner of the slot; this gives the slot position information. For picture files and pdf files, the coordinates of each character are obtained by OCR; the character closest to (x1, y1) is taken as the preceding text of the slot and the character closest to (x2, y2) as the following text of the slot, which gives the context information of the slot.
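The nearest-character rule can be sketched as below. The data layout is an assumption: each OCR result is reduced to a `(char, x, y)` tuple with a single reference point per character, whereas real OCR engines usually return bounding boxes, and the function name `slot_context` is hypothetical.

```python
import math

def slot_context(slot_box, ocr_chars):
    """slot_box is (x1, y1, x2, y2); ocr_chars is a list of (char, x, y) tuples."""
    x1, y1, x2, y2 = slot_box

    def nearest(px, py):
        # Character whose reference point has the smallest Euclidean distance.
        return min(ocr_chars, key=lambda c: math.dist((c[1], c[2]), (px, py)))[0]

    # Lower-left corner gives the preceding text, upper-right the following text.
    return nearest(x1, y1), nearest(x2, y2)

# Made-up coordinates in image space (y grows downward).
chars = [("r", 90, 52), ("y", 210, 50), ("s", 400, 50)]
preceding, following = slot_context((100, 50, 200, 46), chars)
print(preceding, following)
```

In practice the single nearest character would be extended to the surrounding words or sentence before being passed to the tag generation step; how far to extend is not specified in the text.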
At step 103, a slot tag corresponding to the slot is generated according to the contextual information of the slot.
In some embodiments, the method of generating the slot tag corresponding to the slot comprises: and generating the slot position label according to the slot position context information by utilizing a text generation model.
In other embodiments, the method of generating the slot tag corresponding to the slot includes: generating a corresponding prompt mask at the slot position according to the contextual information of the slot position by using a prompt learning method based on a pre-trained language model; and predicting and generating the slot position label from the prompt mask by using the pre-trained language model.
In particular, this stage performs sentence-level tag modeling since the slot positioning of the previous step can already give the required contextual sentences, i.e., contextual information.
The scheme of the application targets open-domain contract drafting, so slot tags cannot be restricted to a finite set of categories, and algorithms based on label classification are not applicable. A text generation model can therefore be chosen to generate the tag directly from the preceding and following semantic information, which avoids the problem of a very large label space.
However, generative models are generally unidirectional, autoregressive models. The drawback of the autoregressive approach is that it can use only the preceding text or only the following text, not both at once. It is commonly used for text summarization and machine translation, because such tasks naturally generate text from left to right, which matches an autoregressive language model.
However, generating a slot tag requires both preceding and following semantic information. For example, given "Party A rents the house for ____ years", an autoregressive model can attend only to the preceding text, "Party A rents the house for", and lacks the following text, so it cannot tell whether the slot tag is "lease duration" or "lease quantity". Once the following word "years" is taken into account, the slot tag can be determined to be "lease duration". The slot tag generation task therefore needs bidirectional context, and the embodiments of the present application further provide an autoencoding target slot generation algorithm based on prompt learning.
Prompt learning is a learning method based on a pre-trained language model: by adding a "prompt" to the input, it turns the downstream task into a text generation task without significantly changing the structure or parameters of the pre-trained language model. Take sentiment classification as an example. To judge the sentiment of "This trip to Beijing felt good.", the conventional approach uses a classification model to predict 0 or 1, with 0 representing positive and 1 representing negative. Prompt learning instead converts the task into an MLM (masked language model) task: the prompt "I feel ____" is appended to the sentence to be predicted, giving the input "This trip to Beijing felt good. I feel ____". The model then generates "satisfied" or "disappointed" at the slot; if "satisfied" is generated, the sentence is labeled as positive, otherwise as negative.
Prompt learning depends on a pre-trained language model P(x). First, the pre-trained language model P(x) is obtained. An appropriate template is then introduced to convert the input x into a cloze-style (fill-in-the-blank) input x' (that is, an initial slot in the document file is converted into a target slot). The converted input x' contains some empty slots (the generated target slots, at which prompt masks are generated). The pre-trained language model P then performs the MLM (mask prediction) task, predicting the actual character corresponding to each [MASK] (prompt mask) at the corresponding slot. Finally, the characters predicted at the mask positions are combined to form a complete tag word.
The advantages of prompt learning are:
a. Compared with prior approaches in which each task defines its own set of parameters, prompt learning adds task-specific information to the input and does not need to change the parameters of the whole model, which improves efficiency and saves storage space.
b. The traditional pretrain + finetune training paradigm must transfer from large-scale unsupervised pre-training to a downstream fine-tuning task; the prompt-based approach breaks away from this paradigm.
In the prompt-learning-based target slot generation algorithm of this embodiment, slot tag prediction is converted into the prompt form below. In the target slot generation task, the goal is to give the tag type of a slot according to its context. A prompt template must therefore be constructed first. The present application provides two prompt templates, Prompt1 and Prompt2:
Input: "Equipment arrival date: within ____ working days after the contract takes effect."
Prompt1: "Equipment arrival date: within ____([MASK][MASK]…) working days after the contract takes effect."
Prompt2: "Equipment arrival date: within ____(the label here is [MASK][MASK]…) working days after the contract takes effect."
The difference between Prompt2 and Prompt1 is that Prompt2 adds the prompt "the label here is", which indicates that the learning objective of the model is to predict the label at this position. The prompt in Prompt1 amounts to just "(", so the semantics expressed by the template are not clear enough, and Prompt2 achieves higher prediction accuracy.
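Building the two templates from a slot's context could look like the sketch below. The function name `build_prompt`, the default of ten masks, and the exact English wording of the hint are assumptions made for illustration.

```python
MASK = "[MASK]"

def build_prompt(preceding, following, n_masks=10, hint="the label here is "):
    """Insert a masked label span right after the slot blank."""
    return f"{preceding}____({hint}{MASK * n_masks}){following}"

before = "Equipment arrival date: within "
after = " working days after the contract takes effect."

prompt2 = build_prompt(before, after)            # Prompt2 style: with the hint
prompt1 = build_prompt(before, after, hint="")   # Prompt1 style: masks only
print(prompt2)
```

A masked language model would then be asked to fill the `[MASK]` positions, and the predicted tokens together form the slot tag.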
In addition, prompt learning requires the length of the part to be predicted to be fixed, whereas slot tags vary in length and the length of a slot tag cannot be known in advance at prediction time. To handle variable-length slot tags, the maximum slot tag length is set to 10 based on analysis of the data set. For a tag shorter than 10, the tag part of the input is masked with [MASK] and the tag is produced as the output, and the positions from the actual end of the tag up to length 10 are filled with [SEP] as an end mark.
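One way to read the fixed-length scheme is sketched below: training targets are padded to length 10 with [SEP], and a prediction is cut at the first [SEP]. This is an interpretation of the paragraph above, not a confirmed detail of the application; the helper names are hypothetical and the example tokens are English words for readability.

```python
MAX_LABEL_LEN = 10
SEP = "[SEP]"

def pad_label(tokens):
    """Pad a label to the fixed prediction length, using [SEP] as the end mark."""
    if len(tokens) > MAX_LABEL_LEN:
        raise ValueError("label longer than the fixed prediction length")
    return list(tokens) + [SEP] * (MAX_LABEL_LEN - len(tokens))

def decode_label(predicted):
    """Keep the predicted tokens up to, but not including, the first [SEP]."""
    label = []
    for token in predicted:
        if token == SEP:
            break
        label.append(token)
    return label

target = pad_label(["lease", "duration"])
print(target)
print(decode_label(target))
```

Padding lets every training example share the same number of mask positions, and decoding recovers a tag of any length up to the maximum.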
Then, according to the prompt mask, the pre-trained language model predicts and generates the corresponding slot tag. Denote the pre-trained language model and its dictionary as (M, V), where the mask token is written [mask], and denote the one-hot label set of the task as L.
For an input sequence x = (s_1, ..., s_k):
First, define a pattern P that converts the input x into a sequence P(x) ∈ V* containing a [mask] token, where V* denotes sequences whose elements all come from the dictionary V.
Second, define a verbalizer v: L → V that maps each label l ∈ L to a token v(l) in the dictionary.
Then P(x) is input to the model M, which performs the MLM task and predicts the original token v(l) at the [mask] position; the label l ∈ L of the text is then recovered through the verbalizer.
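The pattern and verbalizer can be made concrete with the sentiment example given earlier. In the sketch below, `fake_model` is a stand-in that imitates a masked language model with a keyword rule; it is purely an assumption for illustration and involves no real pre-trained model.

```python
def pattern(sentence):
    """P(x): append a cloze prompt containing a single [mask] token."""
    return f"{sentence} I feel [mask]"

# Verbalizer v: label -> token, and its inverse for recovering the label.
VERBALIZER = {"positive": "satisfied", "negative": "disappointed"}
TOKEN_TO_LABEL = {token: label for label, token in VERBALIZER.items()}

def classify(sentence, predict_mask):
    """Run the mask predictor on P(x) and map the predicted token back to a label."""
    token = predict_mask(pattern(sentence))
    return TOKEN_TO_LABEL[token]

def fake_model(prompt):
    """Hypothetical stand-in for a masked language model."""
    return "satisfied" if "good" in prompt else "disappointed"

print(classify("This trip to Beijing felt good.", fake_model))
```

For slot tag generation the label set is open, so the predicted tokens themselves form the tag and no fixed inverse table is needed; the closed-label case is shown here only because it makes the roles of P and v easy to see.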
The model diagram of the method is shown in Fig. 3. Experiments show that the scheme of the embodiments of the application can accurately predict the slot label from the [MASK] tokens in the input.
At step 104, the filling content corresponding to the slot label is acquired based on the slot label corresponding to the slot, and the filling content corresponding to the slot is restored to the document file based on the slot position information corresponding to the slot.
In some embodiments, the method of acquiring the filling content corresponding to the slot label includes: integrating the slots and their corresponding slot labels into a form, sending the form to a user, and acquiring the information entered by the user as the filling content corresponding to each slot label.
In some embodiments, the method of restoring the filling content corresponding to the slot to the document file includes: for a document file in the DOCX document format, directly replacing the blank of the corresponding slot with the filling content; and for document files in the picture format and the PDF document format, covering the blank of the corresponding slot with an image of the filling content as an overlay layer.
Specifically, the above steps extract and label contract slots in the open domain for carriers of any format. To improve contract drafting efficiency, in some embodiments of the present application this information is integrated into a form, and the user can complete drafting in batch by following the slot labels in the form. Check logic can be set in this process, for example whether the slot value entered under the "identity card number" label conforms to the required format. Finally, the values are restored to the document using the slot position information from the preceding steps: for editable docx files, the slot blanks are directly replaced with the target slot values; for non-editable pdf or picture files, the slot values are drawn at the corresponding coordinates as an overlay layer.
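A minimal sketch of the form check and the text replacement for an editable document is given below. It assumes the document text is already available as a plain string and that a slot is a run of two or more underscores; the validator table and function names are hypothetical, and the identity card rule only checks the 18-character shape (17 digits plus a digit or X), not the checksum.

```python
import re
from typing import Dict, List

SLOT_RE = re.compile(r"_{2,}")  # assumption: a slot is a run of >= 2 underscores

# Hypothetical per-label format rules used as check logic.
VALIDATORS = {"identity card number": re.compile(r"\d{17}[\dXx]")}


def validate(form: Dict[str, str]) -> List[str]:
    """Return the labels whose entered value fails its format rule."""
    bad = []
    for label, value in form.items():
        rule = VALIDATORS.get(label)
        if rule is not None and not rule.fullmatch(value):
            bad.append(label)
    return bad


def fill_text(text: str, values: List[str]) -> str:
    """Replace slot blanks from left to right with the given values.

    Blanks without a remaining value are left unchanged.
    """
    it = iter(values)
    return SLOT_RE.sub(lambda m: next(it, m.group(0)), text)
```

For picture and PDF files the same values would instead be drawn at the slot coordinates; that overlay step is not shown here.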
Compared with the prior art: existing schemes cannot systematically handle contract information parsing across multi-modal carriers at the same time. Most schemes restrict drafting to a specific field and contract templates to specific parts, and rely on manual parsing of contracts and manual summarization of contract labels.
The application divides the contract drafting task into the following key steps: 1. reading the contract template; 2. locating the slots in the contract template; 3. intelligently generating semantic labels for the slots; 4. automatically filling the slots through a form. The method integrates multi-modal information processing with intelligent natural language understanding algorithms based on pre-training, so that contracts can be drafted intelligently regardless of carrier or field, which greatly reduces labor cost.
With this method, documents can be filled in intelligently regardless of carrier or field. It solves the problem of drafting picture and pdf contracts in multi-modal scenarios, improves contract drafting efficiency, and reduces the legal labor cost of drafting contracts.
Fig. 4 is a block diagram of an intelligent document filling device according to an embodiment of the present disclosure. The device includes:
a memory 201; and a processor 202 coupled to the memory 201, the processor 202 configured to: acquire a document file to be filled, wherein the type of the document file comprises a DOCX document format, a picture format and/or a PDF document format;
locate a slot in the document file, and acquire context information and slot position information of the slot;
generate a slot label corresponding to the slot according to the context information of the slot;
and acquire filling content corresponding to the slot label based on the slot label corresponding to the slot, and restore the filling content corresponding to the slot to the document file based on the slot position information corresponding to the slot.
In some embodiments, the processor 202 is further configured such that, when the type of the document file is the DOCX document format, the method of locating the slot in the document file includes:
parsing the slot in the document file with a regular expression, and determining the corresponding slot position information from the preceding text and the following text of the slot.
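The regular expression step can be sketched as follows. The sketch assumes the paragraph text has already been extracted from the DOCX file into a plain string, that a slot is written as a run of two or more underscores, and that a fixed character window is used as context; the `Slot` type, the pattern, and the window size are all illustrative assumptions.

```python
import re
from typing import List, NamedTuple


class Slot(NamedTuple):
    start: int   # character offset where the blank begins
    end: int     # character offset just past the blank
    before: str  # preceding text within the window
    after: str   # following text within the window


SLOT_RE = re.compile(r"_{2,}")  # assumption: a slot is a run of >= 2 underscores


def locate_slots(text: str, window: int = 20) -> List[Slot]:
    """Find every blank and capture the text immediately around it."""
    slots = []
    for m in SLOT_RE.finditer(text):
        before = text[max(0, m.start() - window):m.start()]
        after = text[m.end():m.end() + window]
        slots.append(Slot(m.start(), m.end(), before, after))
    return slots


example = "Party A: ____ shall pay ____ yuan."
found = locate_slots(example)
```

The character offsets serve as the position of each slot in the text, and the `before` and `after` strings are what a label generator would receive as context.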
In some embodiments, the processor 202 is further configured such that, when the type of the document file is the picture format or the PDF document format, the method of locating the slot in the document file includes:
obtaining the picture of a document file in the picture format, or a picture obtained by converting a document file in the PDF document format;
converting the picture to grayscale;
performing an opening operation (erosion followed by dilation) on the picture to extract the underlines of the slots;
performing a further dilation operation on the picture, wherein the further dilation expands the contours in the picture;
and detecting and extracting the underlines of the slots in the picture using the Hough transform, and obtaining the coordinates of each slot in the picture as the slot position information.
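The effect of the opening operation can be shown on a tiny binary image in pure Python. This is only a sketch of morphological opening with a horizontal structuring element of width `k`: a real implementation would more likely use an image processing library, and the simple row scan in `find_runs` stands in for the Hough transform step rather than implementing it. All names are illustrative.

```python
from typing import List, Tuple

Grid = List[List[int]]  # 1 = ink, 0 = background


def erode_h(img: Grid, k: int) -> Grid:
    """Keep a pixel only if the k pixels starting at it are all ink."""
    return [
        [1 if x + k <= len(row) and all(row[x:x + k]) else 0 for x in range(len(row))]
        for row in img
    ]


def dilate_h(img: Grid, k: int) -> Grid:
    """Set a pixel if any of the k pixels ending at it is ink."""
    return [
        [1 if any(row[max(0, x - k + 1):x + 1]) else 0 for x in range(len(row))]
        for row in img
    ]


def open_h(img: Grid, k: int) -> Grid:
    """Opening: erosion followed by dilation; removes runs shorter than k."""
    return dilate_h(erode_h(img, k), k)


def find_runs(img: Grid) -> List[Tuple[int, int, int]]:
    """Return (row, x_start, x_end) for each horizontal run of ink."""
    runs = []
    for y, row in enumerate(img):
        x = 0
        while x < len(row):
            if row[x]:
                start = x
                while x < len(row) and row[x]:
                    x += 1
                runs.append((y, start, x - 1))
            else:
                x += 1
    return runs


page = [
    [1, 0, 1, 1, 0, 0, 0, 0],  # short strokes, e.g. parts of characters
    [0, 1, 1, 1, 1, 1, 1, 0],  # a long horizontal line, e.g. an underline
]
underlines = find_runs(open_h(page, 4))
```

Short strokes disappear under erosion and cannot be restored by dilation, while long horizontal lines survive with their original extent, which is why opening isolates underlines.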
In some embodiments, the processor 202 is further configured such that, after the coordinates of each slot in the picture are obtained, the method further includes:
acquiring the coordinates of each character in the document file using optical character recognition (OCR); and, based on these coordinates, taking the character closest to the lower-left corner of the slot as the preceding text of the slot and the character closest to the upper-right corner of the slot as the following text of the slot, thereby obtaining the context information of the slot.
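The nearest-character rule can be sketched as below. The sketch assumes OCR output has already been reduced to one reference point per character and that image coordinates are used (y grows downward, so the lower-left corner of a slot is `(left, bottom)`); the `Char` and `SlotBox` types are hypothetical.

```python
import math
from typing import List, NamedTuple, Tuple


class Char(NamedTuple):
    text: str
    x: float  # assumed reference point of the character in image coordinates
    y: float


class SlotBox(NamedTuple):
    left: float
    top: float
    right: float
    bottom: float


def nearest(chars: List[Char], point: Tuple[float, float]) -> Char:
    """Return the character whose reference point is closest to the given point."""
    return min(chars, key=lambda c: math.hypot(c.x - point[0], c.y - point[1]))


def slot_context(chars: List[Char], box: SlotBox) -> Tuple[str, str]:
    """Preceding text: nearest to the lower-left corner; following text: nearest to the upper-right."""
    before = nearest(chars, (box.left, box.bottom))
    after = nearest(chars, (box.right, box.top))
    return before.text, after.text


chars = [Char("A", 10, 50), Char(":", 20, 50), Char("d", 90, 50), Char("z", 200, 50)]
context = slot_context(chars, SlotBox(left=25, top=45, right=85, bottom=55))
```

In practice the single nearest character would be extended to the surrounding run of text before it is passed to the label generator; that extension is not shown.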
In some embodiments, the processor 202 is further configured such that the method of generating the slot label corresponding to the slot includes:
generating the slot label from the context information of the slot using a text generation model.
In some embodiments, the processor 202 is further configured such that the method of generating the slot label corresponding to the slot includes:
generating a corresponding prompt mask at the slot according to the context information of the slot, using a prompt learning method based on a pre-trained language model;
and predicting and generating the slot label with the pre-trained language model according to the prompt mask.
In some embodiments, the processor 202 is further configured such that the method of acquiring the filling content corresponding to the slot label includes:
integrating the slots and their corresponding slot labels into a form, sending the form to a user, and acquiring the information entered by the user as the filling content corresponding to each slot label.
In some embodiments, the processor 202 is further configured such that the method of restoring the filling content corresponding to the slot to the document file includes:
for a document file in the DOCX document format, directly replacing the blank of the corresponding slot with the filling content;
and for document files in the picture format and the PDF document format, covering the blank of the corresponding slot with an image of the filling content as an overlay layer.
For the specific implementation method, reference is made to the foregoing method embodiments, which are not described herein again.
The present application may be methods, apparatus, systems and/or computer program products. The computer program product may include a computer-readable storage medium having computer-readable program instructions embodied thereon for carrying out various aspects of the present application.
The computer-readable storage medium may be a tangible device that can hold and store the instructions for use by the instruction execution device. The computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, semiconductor memory device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a Static Random Access Memory (SRAM), a portable compact disc read-only memory (CD-ROM), a Digital Versatile Disc (DVD), a memory stick, a floppy disk, a mechanical coding device, such as punch cards or in-groove projection structures having instructions stored thereon, and any suitable combination of the foregoing. Computer-readable storage media as used herein is not to be construed as transitory signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through a waveguide or other transmission medium (e.g., optical pulses through a fiber optic cable), or electrical signals transmitted through electrical wires.
The computer-readable program instructions described herein may be downloaded from a computer-readable storage medium to a respective computing/processing device, or to an external computer or external storage device via a network, such as the internet, a local area network, a wide area network, and/or a wireless network. The network may include copper transmission cables, fiber optic transmission, wireless transmission, routers, firewalls, switches, gateway computers and/or edge servers. The network adapter card or network interface in each computing/processing device receives computer-readable program instructions from the network and forwards the computer-readable program instructions for storage in a computer-readable storage medium in the respective computing/processing device.
The computer program instructions for carrying out operations of the present application may be assembler instructions, Instruction Set Architecture (ISA) instructions, machine instructions, machine-dependent instructions, microcode, firmware instructions, state setting data, or source code or object code written in any combination of one or more programming languages, including an object oriented programming language such as Smalltalk, C++ or the like and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The computer-readable program instructions may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the latter scenario, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider). In some embodiments, electronic circuitry, such as a programmable logic circuit, a Field Programmable Gate Array (FPGA), or a Programmable Logic Array (PLA), can execute the computer-readable program instructions by utilizing state information of the computer-readable program instructions to personalize the electronic circuitry, in order to implement aspects of the present application.
Various aspects of the present application are described herein with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the application. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer readable program instructions.
These computer-readable program instructions may be provided to a processing unit of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processing unit of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks. These computer-readable program instructions may also be stored in a computer-readable storage medium that can direct a computer, programmable data processing apparatus, and/or other devices to function in a particular manner, such that the computer-readable medium storing the instructions comprises an article of manufacture including instructions which implement the function/act specified in the flowchart and/or block diagram block or blocks.
The computer readable program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other devices to cause a series of operational steps to be performed on the computer, other programmable apparatus or other devices to produce a computer implemented process such that the instructions which execute on the computer, other programmable apparatus or other devices implement the functions/acts specified in the flowchart and/or block diagram block or blocks.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of instructions, which comprises one or more executable instructions for implementing the specified logical function(s). In some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
It is noted that, unless expressly stated otherwise, all features disclosed in this specification (including any accompanying claims, abstract and drawings) may be replaced by alternative features serving the same, equivalent or similar purpose. Thus, unless expressly stated otherwise, each feature disclosed is one example only of a generic series of equivalent or similar features. Where the terms "further", "preferably", "still further" or "more preferably" are used, they briefly introduce another embodiment that builds on the foregoing embodiment: the content that follows such a term, combined with the foregoing embodiment, constitutes the complete other embodiment. Several such "further", "preferably", "still further" or "more preferably" arrangements following the same embodiment may be combined in any manner to form yet another embodiment.
Although the present application has been described in detail with respect to the general description and the specific embodiments, it will be apparent to those skilled in the art that some modifications or improvements may be made based on the present application. Accordingly, such modifications and improvements are intended to be within the scope of this invention as claimed.

Claims (10)

1. An intelligent document filling method, characterized by comprising the following steps:
acquiring a document file to be filled, wherein the type of the document file comprises a DOCX document format, a picture format and/or a PDF document format;
locating a slot in the document file, and acquiring context information and slot position information of the slot;
generating a slot label corresponding to the slot according to the context information of the slot;
and acquiring filling content corresponding to the slot label based on the slot label corresponding to the slot, and restoring the filling content corresponding to the slot to the document file based on the slot position information corresponding to the slot.
2. The intelligent document filling method according to claim 1, wherein when the type of the document file is the DOCX document format, the method of locating the slot in the document file comprises:
parsing the slot in the document file with a regular expression, and determining the corresponding slot position information from the preceding text and the following text of the slot.
3. The intelligent document filling method according to claim 1 or 2, wherein when the type of the document file is the picture format or the PDF document format, the method of locating the slot in the document file comprises:
obtaining the picture of a document file in the picture format, or a picture obtained by converting a document file in the PDF document format;
converting the picture to grayscale;
performing an opening operation (erosion followed by dilation) on the picture to extract the underlines of the slots;
performing a further dilation operation on the picture, wherein the further dilation expands the contours in the picture;
and detecting and extracting the underlines of the slots in the picture using the Hough transform, and obtaining the coordinates of each slot in the picture as the slot position information.
4. The intelligent document filling method according to claim 3, wherein after the coordinates of each slot in the picture are obtained, the method further comprises:
acquiring the coordinates of each character in the document file using optical character recognition (OCR); and, based on these coordinates, taking the character closest to the lower-left corner of the slot as the preceding text of the slot and the character closest to the upper-right corner of the slot as the following text of the slot, thereby obtaining the context information of the slot.
5. The intelligent document filling method according to claim 1, wherein the method of generating the slot label corresponding to the slot comprises:
generating the slot label from the context information of the slot using a text generation model.
6. The intelligent document filling method according to claim 1 or 5, wherein the method of generating the slot label corresponding to the slot comprises:
generating a corresponding prompt mask at the slot according to the context information of the slot, using a prompt learning method based on a pre-trained language model;
and predicting and generating the slot label with the pre-trained language model according to the prompt mask.
7. The intelligent document filling method according to claim 1, wherein the method of acquiring the filling content corresponding to the slot label comprises:
integrating the slots and their corresponding slot labels into a form, sending the form to a user, and acquiring the information entered by the user as the filling content corresponding to each slot label.
8. The intelligent document filling method according to claim 1, wherein the method of restoring the filling content corresponding to the slot to the document file comprises:
for a document file in the DOCX document format, directly replacing the blank of the corresponding slot with the filling content;
and for document files in the picture format and the PDF document format, covering the blank of the corresponding slot with an image of the filling content as an overlay layer.
9. An intelligent document filling device, comprising:
a memory; and
a processor coupled to the memory, the processor configured to:
acquire a document file to be filled, wherein the type of the document file comprises a DOCX document format, a picture format and/or a PDF document format;
locate a slot in the document file, and acquire context information and slot position information of the slot;
generate a slot label corresponding to the slot according to the context information of the slot;
and acquire filling content corresponding to the slot label based on the slot label corresponding to the slot, and restore the filling content corresponding to the slot to the document file based on the slot position information corresponding to the slot.
10. A computer storage medium having a computer program stored thereon, wherein the computer program, when executed by a machine, implements the steps of the method of any one of claims 1 to 8.
CN202211050882.3A 2022-08-31 2022-08-31 Intelligent document filling method and device and storage medium Active CN115130437B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211050882.3A CN115130437B (en) 2022-08-31 2022-08-31 Intelligent document filling method and device and storage medium

Publications (2)

Publication Number Publication Date
CN115130437A true CN115130437A (en) 2022-09-30
CN115130437B CN115130437B (en) 2022-12-06

Family

ID=83387904

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211050882.3A Active CN115130437B (en) 2022-08-31 2022-08-31 Intelligent document filling method and device and storage medium

Country Status (1)

Country Link
CN (1) CN115130437B (en)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109933768A (en) * 2019-03-11 2019-06-25 徐鹏 A kind of legal documents Intelligent treatment, write method and system
CN112529014A (en) * 2020-12-14 2021-03-19 中国平安人寿保险股份有限公司 Straight line detection method, information extraction method, device, equipment and storage medium
CN113408268A (en) * 2021-06-22 2021-09-17 平安科技(深圳)有限公司 Slot filling method, device, equipment and storage medium
CN113961705A (en) * 2021-10-29 2022-01-21 聚好看科技股份有限公司 Text classification method and server

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117057325A (en) * 2023-10-13 2023-11-14 湖北华中电力科技开发有限责任公司 Form filling method and system applied to power grid field and electronic equipment
CN117057325B (en) * 2023-10-13 2024-01-05 湖北华中电力科技开发有限责任公司 Form filling method and system applied to power grid field and electronic equipment

Also Published As

Publication number Publication date
CN115130437B (en) 2022-12-06


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant