CN114266682A - Guarantee information acquisition method and device, storage medium and electronic equipment - Google Patents

Publication number
CN114266682A
Authority
CN
China
Prior art keywords
text
processed
information
guarantee
entity
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210188632.XA
Other languages
Chinese (zh)
Inventor
冷小萱
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Jindi Technology Co Ltd
Original Assignee
Beijing Jindi Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Jindi Technology Co Ltd
Priority to CN202210188632.XA
Publication of CN114266682A

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The disclosure provides a guarantee information acquisition method, a guarantee information acquisition device, a storage medium and an electronic device. The method comprises: acquiring a referee document to be processed; extracting a text to be processed from the referee document based on one or more of a case-by constraint rule, a keyword constraint rule and a paragraph constraint rule; extracting the entities included in the text to be processed; and extracting guarantee information from the text to be processed based on the entities, wherein the guarantee information at least comprises a guarantor, a guaranteed party and a creditor. By mining guarantee risk information from referee documents, which are a specific text type, the disclosure extracts structured guarantee information according to the relevant text features, and addresses the problems that the guarantee information of small and medium-sized enterprises cannot be obtained and that their risk is difficult to evaluate.

Description

Guarantee information acquisition method and device, storage medium and electronic equipment
Technical Field
The present disclosure relates to the field of data processing technologies, and in particular, to a method and an apparatus for obtaining guarantee information, a storage medium, and an electronic device.
Background
Providing a guarantee exposes an enterprise to risk: once the guaranteed party is unable to repay a debt when due, the guarantor has to assume joint and several liability for that debt. Guarantee risk is therefore one of the important dimensions for assessing the overall risk of an enterprise.
Currently, guarantee information comes mainly from the disclosures of listed companies, and few public channels provide guarantee information on small and medium-sized enterprises.
Disclosure of Invention
In view of the above shortcomings of the prior art, the present disclosure aims to provide a method, an apparatus, a storage medium and an electronic device for obtaining guarantee information, which are used for efficiently and accurately mining guarantee information of different enterprises.
In a first aspect, the present disclosure provides a method for obtaining guarantee information, including:
acquiring a referee document to be processed;
extracting a text to be processed in the referee document based on one or more rules of a case-by constraint rule, a keyword constraint rule and a paragraph constraint rule;
extracting entities included in the text to be processed;
and extracting guarantee information from the text to be processed based on the entity, wherein the guarantee information at least comprises a guaranteeing party, a guaranteed party and a creditor.
Optionally, the extracting the text to be processed in the referee document based on the case-by constraint rule includes:
identifying the case-by field in the referee document to determine the case-by category of the referee document;
and extracting, as the text to be processed, at least the referee documents whose case-by category is the loan contract category.
Optionally, the extracting the text to be processed from the referee document based on the keyword constraint rule includes:
and carrying out full-text retrieval on the referee document, and taking the referee document as a text to be processed if specified keywords related to guarantee information are retrieved.
Optionally, the specified keywords related to the guarantee information include at least one or more of "guarantee", "guarantee liability" and "joint and several liability".
Optionally, the extracting a text to be processed from the referee document based on the paragraph constraint rule includes:
structuring the referee document, wherein the structured referee document comprises at least one or more of a party information text block, a plaintiff's claims text block, a defendant's defense text block, a trial process text block, a court findings text block and a trial result text block;
and screening the structured referee document to retain the trial process text block, the court findings text block and the trial result text block as the text to be processed.
Optionally, the entity includes a business entity and/or a personal entity, and extracting the entity included in the text to be processed includes:
identifying the entity full name mentioned in the text to be processed by adopting an entity identification model;
acquiring a first type regular expression, wherein the first type regular expression is determined according to an expression format of a full entity name and a short entity name in the text to be processed;
and in the text to be processed, performing regular matching based on the first type regular expression to determine a mapping pair of entity full name and entity short name.
Optionally, the guarantee information further includes a guarantee type, and the following steps are performed when the guarantee type is determined:
when the text content of the text to be processed hits the keyword "joint and several liability", determining the guarantee type as a joint and several liability guarantee;
and when the text content of the text to be processed does not hit the keyword "joint and several liability", determining the guarantee type as a general guarantee.
Optionally, the extracting of the guarantee information from the text to be processed based on the entity includes:
acquiring a second type regular expression, wherein the second type regular expression is determined according to the expression formats of a guarantee party, a guaranteed party and a creditor in the text to be processed;
in the text to be processed, performing regular matching based on the second type regular expression and the entities to determine an entity matching the guarantor, an entity matching the guaranteed party and an entity matching the creditor;
and determining the entities obtained in the regular matching as the corresponding guarantor, guaranteed party and creditor in the guarantee information.
Optionally, the extracting of the guarantee information from the text to be processed based on the entity includes:
acquiring a third type regular expression, wherein the third type regular expression is determined according to the expression format of the qualifying amount information in the text to be processed;
performing regular matching in the text to be processed based on the third type regular expression to extract the qualifying amount information;
converting amount information in Chinese-numeral format and/or in mixed Chinese-numeral and Arabic-digit format into amount values in yuan;
and extracting the largest of the amount values, and determining it as the guaranteed debt principal in the guarantee information.
Optionally, the guarantee information includes data in five dimensions in total, namely guarantee type, guarantor, guaranteed party, creditor and guaranteed debt principal, and the method further includes:
filtering the guarantee information, and retaining, as valid guarantee information, the guarantee information whose data in all five dimensions are non-empty.
In a second aspect, based on the method for obtaining guarantee information according to the first aspect of the present disclosure, an embodiment of the present disclosure further provides a guarantee information obtaining apparatus, including:
the document acquisition module is used for acquiring a referee document to be processed;
the text determination module is used for extracting a text to be processed in the referee document based on one or more rules of a case-based constraint rule, a keyword constraint rule and a paragraph constraint rule;
the entity extraction module is used for extracting entities included in the text to be processed;
and the guarantee information determining module is used for extracting guarantee information from the text to be processed based on the entity, and the guarantee information at least comprises a guaranteeing party, a guaranteed party and a creditor.
Optionally, the text determining module, when extracting the text to be processed in the referee document based on the case-by constraint rule, is configured to:
identifying the case-by field in the referee document to determine the case-by category of the referee document;
and extracting, as the text to be processed, at least the referee documents whose case-by category is the loan contract category.
Optionally, the text determining module, when extracting the text to be processed in the referee document based on the keyword constraint rule, is configured to:
and carrying out full-text retrieval on the referee document, and taking the referee document as a text to be processed if specified keywords related to guarantee information are retrieved.
Optionally, the specified keywords related to the guarantee information include at least one or more of "guarantee", "guarantee liability" and "joint and several liability".
Optionally, the text determining module, when the text to be processed in the referee document is extracted based on the paragraph constraint rule, is configured to:
structuring the referee document, wherein the structured referee document comprises at least one or more of a party information text block, a plaintiff's claims text block, a defendant's defense text block, a trial process text block, a court findings text block and a trial result text block;
and screening the structured referee document to retain the trial process text block, the court findings text block and the trial result text block as the text to be processed.
Optionally, the entities include business entities and/or personal entities, and the entity extraction module, when extracting the entities included in the text to be processed, is configured to:
identifying the entity full name mentioned in the text to be processed by adopting an entity identification model;
acquiring a first type regular expression, wherein the first type regular expression is determined according to an expression format of a full entity name and a short entity name in the text to be processed;
and in the text to be processed, performing regular matching based on the first type regular expression to determine a mapping pair of entity full name and entity short name.
Optionally, the guarantee information further includes a guarantee type, and the guarantee information determining module, when determining the guarantee type, is configured to:
when the text content of the text to be processed hits the keyword "joint and several liability", determining the guarantee type as a joint and several liability guarantee;
and when the text content of the text to be processed does not hit the keyword "joint and several liability", determining the guarantee type as a general guarantee.
Optionally, the guarantee information determining module, when the guarantee information is extracted from the text to be processed based on the entity, is configured to:
acquiring a second type regular expression, wherein the second type regular expression is determined according to the expression formats of a guarantee party, a guaranteed party and a creditor in the text to be processed;
in the text to be processed, performing regular matching based on the second type regular expression and the entities to determine an entity matching the guarantor, an entity matching the guaranteed party and an entity matching the creditor;
and determining the entities obtained in the regular matching as the corresponding guarantor, guaranteed party and creditor in the guarantee information.
Optionally, the guarantee information determining module, when the guarantee information is extracted from the text to be processed based on the entity, is configured to:
acquiring a third type regular expression, wherein the third type regular expression is determined according to the expression format of the qualifying amount information in the text to be processed;
performing regular matching in the text to be processed based on the third type regular expression to extract the qualifying amount information;
converting amount information in Chinese-numeral format and/or in mixed Chinese-numeral and Arabic-digit format into amount values in yuan;
and extracting the largest of the amount values, and determining it as the guaranteed debt principal in the guarantee information.
Optionally, the guarantee information includes data in five dimensions in total, namely guarantee type, guarantor, guaranteed party, creditor and guaranteed debt principal, and the apparatus further includes:
a filtering module, used for filtering the guarantee information and retaining, as valid guarantee information, the guarantee information whose data in all five dimensions are non-empty.
In a third aspect, an embodiment of the present disclosure further provides a storage medium, where a computer program is stored on the storage medium, and when the processor executes the computer program stored on the storage medium, the processor implements any one of the methods for obtaining guarantee information according to the first aspect of the present disclosure.
In a fourth aspect, an embodiment of the present disclosure further provides an electronic device, where the electronic device includes a memory and a processor, where the memory is used to store a computer-executable program, and the processor is used to run the computer-executable program to implement any one of the guarantee information acquisition methods described in the first aspect of the present disclosure.
The present disclosure provides a method, an apparatus, a storage medium and an electronic device for obtaining guarantee information. The method comprises: obtaining a referee document to be processed; extracting a text to be processed from the referee document based on one or more of a case-by constraint rule, a keyword constraint rule and a paragraph constraint rule; extracting the entities included in the text to be processed; and extracting guarantee information from the text to be processed based on the entities, the guarantee information at least comprising a guarantor, a guaranteed party and a creditor. Guarantee risk information is thus mined from referee documents, which are a specific text type, and structured guarantee information is extracted according to the relevant text features, which addresses the problems that the guarantee information of small and medium-sized enterprises cannot be obtained and that their risk is difficult to evaluate.
Drawings
In order to more clearly illustrate the embodiments of the present disclosure or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only some embodiments described in the embodiments of the present disclosure, and other drawings can be obtained by those skilled in the art according to the drawings.
Fig. 1 is a flowchart of a guarantee information acquisition method according to an embodiment of the present disclosure;
Fig. 2 is another flowchart of a guarantee information acquisition method according to an embodiment of the present disclosure;
Fig. 3 is another flowchart of a guarantee information acquisition method according to an embodiment of the present disclosure;
Fig. 4 is another flowchart of a guarantee information acquisition method according to an embodiment of the present disclosure;
Fig. 5 is another flowchart of a guarantee information acquisition method according to an embodiment of the present disclosure;
Fig. 6 is another flowchart of a guarantee information acquisition method according to an embodiment of the present disclosure;
Fig. 7 is a schematic structural diagram of a guarantee information acquisition apparatus according to an embodiment of the present disclosure;
Fig. 8 is a schematic diagram of a hardware structure of an electronic device for obtaining guarantee information according to an embodiment of the present disclosure.
Detailed Description
In order to make those skilled in the art better understand the technical solutions in the embodiments of the present disclosure, the technical solutions in the embodiments of the present disclosure will be clearly and completely described below with reference to the drawings in the embodiments of the present disclosure, and it is obvious that the described embodiments are only a part of the embodiments of the present disclosure, but not all the embodiments. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments in the present disclosure should fall within the scope of protection of the embodiments in the present disclosure.
Providing a guarantee exposes an enterprise to risk: once the guaranteed party is unable to repay a debt when due, the guarantor has to assume joint and several liability for that debt. Guarantee risk is therefore one of the important dimensions for assessing the overall risk of an enterprise.
Currently, guarantee information comes mainly from the disclosures of listed companies, and few public channels provide guarantee information on small and medium-sized enterprises.
The present disclosure aims to solve the above problems. Specific implementations of the embodiments of the present disclosure are further described below with reference to the accompanying drawings.
Embodiment 1
An embodiment of the present disclosure provides a guarantee information acquisition method. As shown in Fig. 1, which is a flowchart of the guarantee information acquisition method according to an embodiment of the present disclosure, the method includes steps S101 to S104:
Step S101: acquiring a referee document to be processed.
Specifically, in one embodiment of the present disclosure, the referee document is a document that records the trial process and result of a people's court. It is the carrier of the result of litigation activities and the sole evidence by which the people's court determines and allocates the substantive rights and obligations of the parties. A referee document with a complete structure, complete elements and rigorous logic is evidence of the parties' rights and obligations, and is also an important basis on which a higher people's court supervises the civil trial activities of a lower people's court.
Common referee documents include civil referee documents, criminal referee documents, administrative referee documents, and other general litigation documents. In one implementation of this embodiment, referee documents, as a specific text type, are used as the basis for extracting guarantee information between enterprises, which effectively expands the sources of guarantee information.
For prior art related to legal documents, reference may be made to the following patent publications: CN110781299A, CN111784505A, CN110599289A and CN113011185A.
Step S102: extracting a text to be processed from the referee document based on one or more of a case-by constraint rule, a keyword constraint rule and a paragraph constraint rule.
After the referee document to be processed is acquired, it can be further processed, and the resulting text is used as the text to be processed on which guarantee information mining is actually performed.
Because the volume of litigation data is huge, and sampling statistics show that referee documents containing guarantee information account for only a small share of it, the referee documents are first constrained, and guarantee information is then extracted only from the documents that fall within the constrained range.
Specifically, the text to be processed in the referee document can be extracted based on one or more of a case-by constraint rule, a keyword constraint rule and a paragraph constraint rule. The above rules are set forth below.
Case-by constraint rule:
When the text to be processed in the referee document is extracted based on the case-by constraint rule, referring to Fig. 2, the following steps may be included:
S201, identifying the case-by field in the referee document to determine the case-by category of the referee document;
S202, extracting, as the text to be processed, at least the referee documents whose case-by category is the loan contract category.
The case-by field of a referee document, which records the cause of action, takes values such as private lending dispute, financial loan contract dispute, sales contract dispute, lease contract dispute and motor vehicle traffic accident liability dispute. Sampling statistics show that the referee documents containing guarantee information are concentrated mainly in categories such as loan contracts and financial loan contracts. Therefore, the range of text from which guarantee risk information is extracted in this embodiment is limited to the above case-by categories; referee documents with other case-by values are excluded from extraction for the time being, because they would interfere with the extraction results.
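As a minimal sketch of steps S201 and S202, assuming each referee document has already been parsed into a dictionary with a hypothetical `cause` key holding the case-by field, the screening could look as follows. The category names are English stand-ins for the values listed above, and all field and function names are illustrative rather than taken from the disclosure.

```python
# Hypothetical allow-list: English stand-ins for the loan-contract case-by categories.
LOAN_CATEGORIES = {"private lending dispute", "financial loan contract dispute"}


def passes_case_by_rule(doc: dict) -> bool:
    """S201/S202: read the case-by field and keep only loan-contract categories."""
    return doc.get("cause", "").strip().lower() in LOAN_CATEGORIES


documents = [
    {"id": "d1", "cause": "Private lending dispute"},
    {"id": "d2", "cause": "Motor vehicle traffic accident liability dispute"},
    {"id": "d3", "cause": "Financial loan contract dispute"},
]

# Only the documents whose case-by category is on the allow-list survive.
kept_ids = [d["id"] for d in documents if passes_case_by_rule(d)]
print(kept_ids)
```

An exact-match allow-list keeps the rule cheap to evaluate, which matters because this screening is applied to the full volume of documents.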
Keyword constraint rule:
When the text to be processed in the referee document is extracted based on the keyword constraint rule, a full-text search can be performed on the referee document, and if a specified keyword related to guarantee information is found, the referee document is used as the text to be processed.
In some embodiments, the specified keywords related to the guarantee information include at least one or more of "guarantee", "guarantee liability" and "joint and several liability". The specified keywords may also include other keywords, which is not limited in this embodiment.
In some embodiments, only the body of the referee document is searched during keyword retrieval, to ensure the accuracy of the processing.
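The keyword rule can be sketched as a plain substring search over the body text. This is an illustration under assumptions: the keywords below are English stand-ins for the Chinese keywords (a real system would search for the Chinese terms), and `passes_keyword_rule` is a hypothetical name.

```python
# English stand-ins for the specified keywords; the original keywords are Chinese.
GUARANTEE_KEYWORDS = ("guarantee", "joint and several liability")


def passes_keyword_rule(body_text: str, keywords=GUARANTEE_KEYWORDS) -> bool:
    """Return True if any specified keyword occurs in the body of the document."""
    lowered = body_text.lower()
    return any(keyword in lowered for keyword in keywords)


print(passes_keyword_rule("Company C shall bear joint and several liability for the debt."))
print(passes_keyword_rule("The buyer shall pay the purchase price."))
```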
Paragraph constraint rule:
When the text to be processed in the referee document is extracted based on the paragraph constraint rule, referring to Fig. 3, the following steps may be performed:
S301, structuring the referee document, wherein the structured referee document comprises at least one or more of a party information text block, a plaintiff's claims text block, a defendant's defense text block, a trial process text block, a court findings text block and a trial result text block;
S302, screening the structured referee document to retain the trial process text block, the court findings text block and the trial result text block as the text to be processed.
The paragraph constraint refers to screening the paragraphs of the document after it has been structured. The structured document mainly comprises fields such as party information, plaintiff's claims, defendant's defense, trial process, court findings and trial result. Because the statements in the plaintiff's claims and the defendant's defense are not necessarily true, this embodiment limits the range of the extracted text and extracts the guarantee-related fields only from the text of three parts: the trial process, the court findings and the trial result.
The processing mode reduces the data volume to be processed in the process of mining the guarantee information, saves the processing resources of the system and improves the efficiency of mining the guarantee information.
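A minimal sketch of step S302, assuming the structuring in step S301 has already produced a dictionary of text blocks. The block names are hypothetical keys chosen for this illustration.

```python
from typing import Dict

# Hypothetical block names for the structured document; only these three are kept.
RETAINED_BLOCKS = ("trial_process", "court_findings", "trial_result")


def select_text_to_process(structured_doc: Dict[str, str]) -> str:
    """S302: concatenate the retained blocks, skipping any that are missing or empty."""
    parts = [structured_doc.get(name, "") for name in RETAINED_BLOCKS]
    return "\n".join(p for p in parts if p)


structured = {
    "party_information": "Plaintiff: Company B. Defendants: Company A, Company C.",
    "plaintiff_claims": "Company B claims that Company C guaranteed the loan.",
    "defendant_defense": "Company C denies having signed any guarantee.",
    "court_findings": "The court finds that Company C signed the guarantee contract.",
    "trial_result": "Company C shall bear joint and several liability.",
}
text_to_process = select_text_to_process(structured)
print(text_to_process)
```

The claim and defense blocks are dropped even though they mention the guarantee, since their statements are not necessarily true.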
In an optional embodiment, after the referee document to be processed is obtained, the case-by constraint rule, the keyword constraint rule and the paragraph constraint rule are applied in sequence, and each step operates on the result of the previous step. First, the case-by constraint rule is applied, and the case-by field is used to screen out the large number of referee documents that do not meet the condition. Next, the keyword constraint rule is applied to the documents that passed the case-by screening, to obtain the referee documents that actually involve guarantee information. Finally, the paragraph constraint rule is applied to those documents, and the text containing true and valid guarantee information is extracted as the final text to be processed. Because each step consumes a different amount of processing resources, this order can further save processing resources.
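The sequential application can be sketched as a generic cascade in which each rule only sees the documents that survived the previous one. The two rules and the document fields below are simplified, hypothetical stand-ins used only to show the ordering.

```python
from typing import Callable, Iterable, List

Rule = Callable[[dict], bool]


def cascade(docs: Iterable[dict], rules: List[Rule]) -> List[dict]:
    """Apply the rules in order; each rule filters the survivors of the previous rule."""
    survivors = list(docs)
    for rule in rules:
        survivors = [d for d in survivors if rule(d)]
    return survivors


def is_loan_case(doc: dict) -> bool:
    # Stand-in for the case-by constraint rule (a cheap field check).
    return "loan" in doc["cause"]


def mentions_guarantee(doc: dict) -> bool:
    # Stand-in for the keyword constraint rule (a full-text search).
    return "guarantee" in doc["body"]


docs = [
    {"id": 1, "cause": "private loan dispute", "body": "C provides a guarantee for A"},
    {"id": 2, "cause": "private loan dispute", "body": "A borrowed money from B"},
    {"id": 3, "cause": "traffic accident dispute", "body": "no guarantee involved"},
]
result_ids = [d["id"] for d in cascade(docs, [is_loan_case, mentions_guarantee])]
print(result_ids)
```

Putting the field check first means the full-text search runs only on documents that already passed the cheaper test.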
Step S103: extracting the entities included in the text to be processed, so that at least two party entities are determined from the referee document.
Specifically, in one embodiment of the present disclosure, the party entities include business entities and/or personal entities.
In an implementation of the embodiment of the present disclosure, when the entities included in the text to be processed are extracted, referring to Fig. 4, the following steps may be performed:
S401, recognizing the full entity names mentioned in the text to be processed by using an entity recognition model;
S402, obtaining a first type regular expression, wherein the first type regular expression is determined according to the expression format of full entity names and short entity names in the text to be processed;
S403, performing regular matching in the text to be processed based on the first type regular expression to determine mapping pairs of full entity names and short entity names.
First, an entity recognition model is used to recognize full enterprise names and personal names, yielding all personal-name and full-enterprise-name entities mentioned in the text. In some embodiments, the entity recognition model is an open-source model based on BERT + CRF, and it can be trained in advance on sample data from referee documents.
After the full entity names are obtained, the first type regular expression is used to obtain the mapping pairs of full entity names and short entity names. In particular, referee documents often abbreviate full enterprise names. For example, the text to be processed includes the text: "The retrial applicant, Fuzhou X Company (hereinafter referred to as "Zhong X Company"), in its copyright dispute with the respondent, Jinan Jin X Company, refuses to accept civil judgment (2020) Lu Min Zhong No. X of the Shandong Higher People's Court and applies to this court for retrial."
The first type regular expression can be predefined according to this type of text feature, for example using the forms "(hereinafter abbreviated as (.*))", "(hereinafter (.*))" and "(abbreviated as (.*))", to obtain the mapping pairs of full entity names and short entity names. In the above example, Fuzhou X Company corresponds to the short name "Zhong X Company", and Jinan Jin X Company is paired with its short name in the same way.
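A minimal sketch of steps S402 and S403, assuming the full names have already been returned by the entity recognition model. The marker phrases are English stand-ins for the Chinese short-name markers, and `ALIAS_TEMPLATE` and `map_short_names` are hypothetical names; the pattern is an illustration, not the regular expression of the disclosure.

```python
import re
from typing import Dict, List

# English stand-ins for the short-name markers; quotes around the short name are optional.
ALIAS_TEMPLATE = r'\s*\((?:hereinafter referred to as|hereinafter|abbreviated as)\s*"?(.+?)"?\)'


def map_short_names(text: str, full_names: List[str]) -> Dict[str, str]:
    """For each recognized full name, look for a short-name declaration right after it."""
    mapping = {}
    for full in full_names:
        match = re.search(re.escape(full) + ALIAS_TEMPLATE, text)
        if match:
            mapping[full] = match.group(1)
    return mapping


text = ('The retrial applicant Fuzhou X Company (hereinafter referred to as '
        '"Zhong X Company") is in a copyright dispute with Jinan Jin X Company.')
pairs = map_short_names(text, ["Fuzhou X Company", "Jinan Jin X Company"])
print(pairs)
```

Anchoring the pattern on a known full name keeps the capture from swallowing unrelated text, which a bare `(.*)` group would be prone to do.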
Step S104: extracting guarantee information from the text to be processed based on the entities, wherein the guarantee information at least comprises a guarantor, a guaranteed party and a creditor.
In some embodiments, the guarantee information includes data in five dimensions in total: guarantee type, guarantor, guaranteed party, creditor and guaranteed debt principal. Each is described below.
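The five dimensions can be pictured as one record per extracted guarantee, together with the validity check that keeps only records whose five fields are all non-empty. The class and field names below are hypothetical and chosen for this sketch.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class GuaranteeRecord:
    """One extracted guarantee relationship; every field may be missing after extraction."""
    guarantee_type: Optional[str] = None
    guarantor: Optional[str] = None
    guaranteed_party: Optional[str] = None
    creditor: Optional[str] = None
    principal_yuan: Optional[float] = None

    def is_valid(self) -> bool:
        # A record is valid only when all five dimensions are non-empty.
        return all(getattr(self, f.name) not in (None, "") for f in fields(self))


complete = GuaranteeRecord("general guarantee", "Company C", "Company A", "Company B", 500000.0)
incomplete = GuaranteeRecord(guarantor="Company C", creditor="Company B")
valid_records = [r for r in (complete, incomplete) if r.is_valid()]
print(len(valid_records))
```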
Guarantee type:
Guarantee types are divided into joint and several liability guarantees and general guarantees. According to the relevant legal provisions, where the parties have not agreed on the mode of guarantee in the guarantee contract, or the agreement is unclear, guarantee liability is borne as a general guarantee. This embodiment follows the same principle. For example, when the text content hits the keywords "joint and several liability" and "guarantee" together, the corresponding guarantee type is a joint and several liability guarantee; for the remaining texts, which hit only the keyword "guarantee", the corresponding guarantee type is a general guarantee.
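The keyword-based decision can be sketched as follows, with English stand-ins for the Chinese keywords. The function name and the returned labels are assumptions made for this illustration.

```python
from typing import Optional


def classify_guarantee_type(text: str) -> Optional[str]:
    """Return the guarantee type implied by the keywords, or None if no guarantee is mentioned."""
    lowered = text.lower()
    if "guarantee" not in lowered:
        return None
    if "joint and several liability" in lowered:
        return "joint and several liability guarantee"
    # Default when the mode of guarantee is not stated: general guarantee.
    return "general guarantee"


print(classify_guarantee_type("Company C provides a guarantee and bears joint and several liability."))
print(classify_guarantee_type("Company C provides a guarantee for the loan."))
```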
Guarantor, guaranteed party and creditor:
In one implementation of the embodiment of the present disclosure, when the guarantee information is extracted from the text to be processed based on the entities, referring to Fig. 5, the following steps may be performed:
S501, obtaining a second type regular expression, wherein the second type regular expression is determined according to the expression formats of the guarantor, the guaranteed party and the creditor in the text to be processed;
S502, performing regular matching in the text to be processed based on the second type regular expression and the entities to determine an entity matching the guarantor, an entity matching the guaranteed party and an entity matching the creditor;
S503, determining the entities obtained in the regular matching as the corresponding guarantor, guaranteed party and creditor in the guarantee information.
Specifically, the extraction of the guarantor, the guaranteed party and the creditor is interdependent, and these fields can be extracted by combining the entity recognition results with a second type regular expression. The three fields generally appear in a referee document in two expression formats: one is "A repays the loan principal to B", and the other is "C bears joint repayment responsibility for A".
In the first expression format, A is the debtor, the debtor and the guaranteed party are typically the same entity, and B is the creditor.
In the second expression format, C is the guarantor and A is the guaranteed party.
The second type regular expression may be predefined according to the above two expression formats. In some embodiments, the second type regular expression may include, for example, the following two expressions, in which [entity] marks a sub-expression that the source shows only as an embedded image (DEST_PATH_IMAGE002), presumably the entity placeholder:
(.*[entity])(return|payment|reimbursement|refund|settlement|payment|reimbursement)([entity])(principal|Renminbi|payment of payment|payment of money)(.{0,40} yuan)
(.*[entity]) to [entity] [undertake burden]{0,4}(together.|holds.{0,6} responsibility)|(.*[entity])[undertake burden]{0,4}(with|common|guarantee).{0,6} responsibility
Based on the second type regular expression, data of three dimensions of a guarantee party, a guaranteed party and a creditor in the guarantee information can be determined.
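As a hedged sketch of how the two expression formats could be matched against recognized entities, the following uses simplified English patterns. The function name, the pattern wording and the sample sentence are illustrative assumptions and do not reproduce the expressions of the disclosure.

```python
import re

def extract_roles(text, entities):
    """Assign guarantor / guaranteed party / creditor roles to known entities."""
    roles = {"guarantor": None, "guaranteed": None, "creditor": None}
    if not entities:
        return roles
    # Longest names first so that a longer entity name wins over its prefix.
    ent = "|".join(re.escape(e) for e in sorted(entities, key=len, reverse=True))
    # Format 1: "A repays <something> to B" -> A is the debtor, B the creditor.
    repay = re.compile(
        rf"(?P<debtor>{ent}) (?:repays|returns) .{{0,40}}? to (?P<creditor>{ent})")
    # Format 2: "C bears <kind of> liability for A" -> C guarantor, A guaranteed.
    guarantee = re.compile(
        rf"(?P<guarantor>{ent}) bears .{{0,20}}? (?:liability|responsibility) "
        rf"for (?P<guaranteed>{ent})")
    m = repay.search(text)
    if m:
        roles["guaranteed"] = m.group("debtor")  # debtor is usually the guaranteed party
        roles["creditor"] = m.group("creditor")
    m = guarantee.search(text)
    if m:
        roles["guarantor"] = m.group("guarantor")
        roles["guaranteed"] = m.group("guaranteed")
    return roles
```

Restricting the capture groups to the already recognized entities is what ties the regular matching to the entity extraction step; a free-text capture group would pick up arbitrary noun phrases.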
Guaranteed creditor principal:
in one implementation manner of the embodiment of the present disclosure, when the guarantee information is extracted from the text to be processed based on the entity, referring to fig. 6, the following steps may be performed:
S601, obtaining a third type regular expression, where the third type regular expression is determined according to the expression format of eligible amount information in the text to be processed;
S602, performing regular matching in the text to be processed based on the third type regular expression, to extract the eligible amount information;
S603, converting amount information in Chinese format and/or in mixed Chinese and numeric format into amount values in yuan;
S604, taking the largest of the amount values and determining it as the guaranteed creditor principal in the guarantee information.
According to the way the guaranteed creditor principal is expressed in referee documents, a third type regular expression of the form "borrowing.{0,20}[amount] yuan|principal.{0,20}[amount] .{0,20} loan [amount] yuan" can be used to extract the amount field from the original text, where [amount] marks a sub-expression that the source shows only as an embedded image (DEST_PATH_IMAGE003), presumably the amount pattern. After extraction, amount information written in Chinese, or in a mixture of Chinese and digits, needs to be converted into an amount value in yuan. Because a referee document contains several amounts, such as the loan principal, the amount already repaid and the amount still outstanding, the largest of the amounts matching the extraction expression is taken as the loan principal, that is, the guaranteed creditor principal.
It should be noted that the eligible amount information excludes amounts in sentences containing expressions such as "total amount of interest" or "highest loan guarantee amount", since these amounts are not the loan principal.
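Steps S601 to S604 can be sketched as below. This is a minimal illustration under stated assumptions: the Chinese keywords 借款 ("borrowing"), 本金 ("principal") and 元 ("yuan") are back-translations, `AMOUNT_RE` is a simplified stand-in for the third type regular expression, and `to_yuan`, `guaranteed_principal` and `excluded_phrases` are hypothetical names.

```python
import re

DIGITS = {"零": 0, "一": 1, "二": 2, "两": 2, "三": 3, "四": 4,
          "五": 5, "六": 6, "七": 7, "八": 8, "九": 9}
SMALL_UNITS = {"十": 10, "百": 100, "千": 1000}
# Simplified, assumed pattern: a loan keyword, up to 20 characters, then an amount.
AMOUNT_RE = re.compile(r"(?:借款|本金).{0,20}?([0-9,.零一二两三四五六七八九十百千万亿]+元)")

def _section(s):
    """Convert a section below 10,000 (Arabic digits and/or Chinese numerals)."""
    total, current = 0.0, 0.0
    for tok in re.findall(r"\d+(?:\.\d+)?|[零一二两三四五六七八九十百千]", s):
        if tok in SMALL_UNITS:
            total += (current or 1) * SMALL_UNITS[tok]  # bare "十" means 10
            current = 0.0
        elif tok in DIGITS:
            current = float(DIGITS[tok])
        else:
            current = float(tok)
    return total + current

def to_yuan(amount):
    """Convert e.g. "五十万元", "3.5万元" or "1,000,000元" to a value in yuan."""
    s = amount.replace(",", "").replace("，", "").rstrip("元")
    value = 0.0
    for unit_char, unit in (("亿", 100_000_000), ("万", 10_000)):
        if unit_char in s:
            head, s = s.split(unit_char, 1)
            value += _section(head) * unit
    return value + _section(s)

def guaranteed_principal(text, excluded_phrases=()):
    """Return the largest eligible amount in yuan, or None if nothing matches."""
    values = []
    for sentence in re.split(r"[。；;]", text):
        if any(p in sentence for p in excluded_phrases):
            continue  # e.g. sentences about totals that are not loan principal
        values += [to_yuan(m) for m in AMOUNT_RE.findall(sentence)]
    return max(values, default=None)
```

Skipping whole sentences that contain an excluded phrase is one simple way to honour the exclusion rule; the phrases themselves have to be supplied by the caller.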
The three extraction steps of extracting the guarantee type in the guarantee information, extracting the guaranteeing party, the guaranteed party and the creditor in the guarantee information and extracting the guaranteed creditor principal in the guarantee information are mutually independent and have no sequential dependence. In some embodiments, the above three types of extraction may be performed in parallel by using three independent threads, respectively, to improve the efficiency of obtaining the guarantee information.
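Because the three extractions do not depend on one another, they can be dispatched to separate threads. The sketch below uses Python's standard `concurrent.futures.ThreadPoolExecutor`; the function name and the three stub extractors are hypothetical placeholders for the real extraction steps.

```python
from concurrent.futures import ThreadPoolExecutor

def run_extractors_in_parallel(text, extractors):
    """Run independent extractor callables on the same text, one thread each."""
    if not extractors:
        return {}
    with ThreadPoolExecutor(max_workers=len(extractors)) as pool:
        futures = {name: pool.submit(fn, text) for name, fn in extractors.items()}
        # result() blocks until the corresponding extractor has finished.
        return {name: future.result() for name, future in futures.items()}

# Hypothetical stub extractors standing in for the three extraction steps.
def extract_type(text):
    return "joint" if "joint" in text else "general"

def extract_parties(text):
    return {"guarantor": "C", "guaranteed": "A", "creditor": "B"}

def extract_principal(text):
    return 500000.0

info = run_extractors_in_parallel(
    "C bears joint liability for A",
    {"type": extract_type, "parties": extract_parties, "principal": extract_principal},
)
```

In CPython, threads running pure-Python regular matching share the global interpreter lock, so the speed-up from this arrangement may be limited; a process pool is a possible alternative for CPU-bound extraction.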
In some embodiments, due to the special format of the partial text expression to be processed, the predefined rule keywords or regular expressions are missed, so that the guarantee information data of partial dimensions is empty.
After the guarantee information is acquired, it is filtered, and only guarantee information in which all five dimensions (guarantee type, guarantor, guaranteed party, creditor and guaranteed creditor principal) are non-empty is retained as effective guarantee information.
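The filtering step can be sketched as a simple completeness check. The field names and the record layout (one dictionary per extracted guarantee record) are assumptions made for illustration.

```python
# Assumed field names for the five dimensions of a guarantee record.
REQUIRED_FIELDS = ("guarantee_type", "guarantor", "guaranteed_party",
                   "creditor", "principal")

def filter_valid(records):
    """Keep only records whose five dimensions are all present and non-empty."""
    return [
        r for r in records
        if all(r.get(field) not in (None, "") for field in REQUIRED_FIELDS)
    ]
```

A record that missed one regular expression, and therefore has an empty field, is dropped rather than stored as partial guarantee information.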
The present disclosure provides a guarantee information acquisition method. First, a referee document to be processed is acquired; a text to be processed is extracted from the referee document based on one or more of a case-by constraint rule, a keyword constraint rule and a paragraph constraint rule; the entities included in the text to be processed are extracted; and guarantee information is extracted from the text to be processed based on the entities, the guarantee information including at least a guarantor, a guaranteed party and a creditor. On the one hand, obtaining guarantee information from referee documents expands the acquisition channels, so that the guarantee information obtained is more comprehensive; on the other hand, extracting the required information based on the common expression formats of referee documents keeps the implementation cost low and makes the method more efficient and faster.
Example II,
In a second aspect, based on the guarantee information acquisition method according to the first aspect of the present disclosure, an embodiment of the present disclosure further provides a guarantee information acquiring apparatus. Fig. 7 is a schematic structural diagram of a guarantee information acquiring apparatus 70 according to an embodiment of the present disclosure. As shown in fig. 7, the guarantee information acquiring apparatus 70 includes:
a document acquiring module 701, configured to acquire a referee document to be processed;
a text determining module 702, configured to extract a text to be processed in the referee document based on one or more rules of a case-by constraint rule, a keyword constraint rule, and a paragraph constraint rule;
an entity extraction module 703, configured to extract an entity included in the text to be processed;
and a guarantee information determining module 704, configured to extract guarantee information from the to-be-processed text based on the entity, where the guarantee information includes at least a guaranteeing party, a guaranteed party, and a creditor.
Optionally, the text determining module, when extracting the text to be processed in the referee document based on the case-by constraint rule, is configured to:
identifying the case-by field in the referee document to determine the case-by category of the referee document;
and extracting, as the text to be processed, at least the referee documents whose case-by category is the borrowing contract category.
Optionally, the text determining module, when extracting the text to be processed in the referee document based on the keyword constraint rule, is configured to:
and carrying out full-text retrieval on the referee document, and taking the referee document as a text to be processed if specified keywords related to guarantee information are retrieved.
Optionally, the specified keywords related to the guarantee information at least include one or more of "guarantee", "guarantee of responsibility", "responsibility connection".
Optionally, the text determining module, when the text to be processed in the referee document is extracted based on the paragraph constraint rule, is configured to:
structuring the referee document, wherein the structured referee document at least comprises one or more text blocks of a party information text block, a plaintiff's claim text block, a defendant's defense text block, a trial process text block, a court finding text block and a judgment result text block;
and screening the structured referee document to retain the trial process text block, the court finding text block and the judgment result text block as the text to be processed.
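The paragraph constraint rule can be sketched as a selection over a structured document. The sketch assumes the structuring step has already produced a dictionary of named text blocks; the block keys (for the trial process, court finding and judgment result blocks) and the function name are hypothetical.

```python
# Assumed keys for the three text blocks that are retained for extraction.
KEPT_BLOCKS = ("trial_process", "court_findings", "judgment_result")

def select_text_to_process(blocks):
    """Join the retained text blocks of a structured referee document."""
    return "\n".join(blocks[name] for name in KEPT_BLOCKS if blocks.get(name))
```

Party information and the parties' own statements are left out, so only the blocks written by the court are passed on to entity and guarantee extraction.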
Optionally, the entities include business entities and/or personal entities, and the entity extraction module, when extracting the entities included in the text to be processed, is configured to:
identifying the entity full name mentioned in the text to be processed by adopting an entity identification model;
acquiring a first type regular expression, wherein the first type regular expression is determined according to an expression format of a full entity name and a short entity name in the text to be processed;
and in the text to be processed, performing regular matching based on the first type regular expression to determine a mapping pair of entity full name and entity short name.
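The mapping between full names and short names can be sketched as follows. The English phrase "hereinafter referred to as" is an assumed stand-in for the expression format of the original documents, and the function name and sample company are illustrative only.

```python
import re

def map_short_names(text, full_names):
    """Map each recognized full entity name to the short name declared after it."""
    mapping = {}
    for full in full_names:
        # Assumed format: Full Name (hereinafter referred to as "Short")
        pattern = re.compile(
            re.escape(full)
            + r'\s*\(hereinafter(?: referred to as)? "?([^")]+?)"?\)')
        m = pattern.search(text)
        if m:
            mapping[full] = m.group(1)
    return mapping
```

With such a mapping, later regular matching can treat a short name found in the text as a mention of the corresponding full entity.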
Optionally, the guarantee information further includes a guarantee type, and the guarantee information determining module, when determining the guarantee type, is configured to:
when the text content of the text to be processed hits the keyword 'joint responsibility', determining the guarantee type as a joint responsibility guarantee;
and when the text content of the text to be processed does not hit the keyword 'joint responsibility', determining the guarantee type as a general responsibility guarantee.
Optionally, the guarantee information determining module, when the guarantee information is extracted from the text to be processed based on the entity, is configured to:
acquiring a second type regular expression, wherein the second type regular expression is determined according to the expression formats of a guarantee party, a guaranteed party and a creditor in the text to be processed;
in the text to be processed, performing regular matching based on the second type regular expression and the entity to determine an entity matched with the guarantor, an entity matched with the guaranteed party and an entity matched with the creditor;
and determining the entities obtained in the regular matching as the corresponding guarantor, guaranteed party and creditor in the guarantee information.
Optionally, the guarantee information determining module, when the guarantee information is extracted from the text to be processed based on the entity, is configured to:
acquiring a third type regular expression, wherein the third type regular expression is determined according to the expression format of the eligible money amount information in the text to be processed;
performing regular matching in the text to be processed based on the third type regular expression to extract qualified money information;
converting the amount information in Chinese format and/or the amount information in mixed Chinese and numeric format into an amount value in yuan;
and taking the largest of the amount values and determining it as the guaranteed creditor principal in the guarantee information.
Optionally, the guarantee information collectively includes data of five dimensions of guarantee type, guarantor, insured party, creditor and guaranteed creditor principal, and the apparatus further includes:
and the filtering module is used for filtering the guarantee information and reserving guarantee information with five dimensions of which the data are not empty as effective guarantee information.
Example III,
In a third aspect, an embodiment of the present disclosure further provides a storage medium on which a computer program is stored. When a processor executes the computer program stored on the storage medium, the processor implements any one of the guarantee information acquisition methods according to the first aspect of the present disclosure, which include, but are not limited to:
acquiring a referee document to be processed;
extracting a text to be processed in the referee document based on one or more rules of a case-by constraint rule, a keyword constraint rule and a paragraph constraint rule;
extracting entities included in the text to be processed;
and extracting guarantee information from the text to be processed based on the entity, wherein the guarantee information at least comprises a guaranteeing party, a guaranteed party and a creditor.
Example IV,
Based on the guarantee information acquisition method described in the first embodiment of the present disclosure, an embodiment of the present disclosure further provides an electronic device for guarantee information acquisition. As shown in fig. 8, which is a schematic diagram of the hardware structure of the electronic device, the hardware structure of the electronic device may include: a processor 801, a communication interface 802, a computer-readable medium 803, and a communication bus 804;
the processor 801, the communication interface 802 and the computer-readable medium 803 communicate with each other through the communication bus 804;
optionally, the communication interface 802 may be an interface of a communication module, such as an interface of a GSM module;
the processor 801 may be specifically configured to run the executable program stored in the memory, so as to execute all or part of the method of any one of the above-described method embodiments.
The processor 801 may be a general-purpose processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like; it may also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. It may implement or perform the various methods, steps, and logic blocks disclosed in the embodiments of the present disclosure. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.
The present disclosure has thus described specific embodiments of the present subject matter. Other embodiments are within the scope of the following claims. In some cases, the actions recited in the claims can be performed in a different order and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing may be advantageous.
In the 1990s, an improvement to a technology could be clearly distinguished as either a hardware improvement (e.g., an improvement to a circuit structure such as a diode, transistor, or switch) or a software improvement (an improvement to a method flow). However, as technology has advanced, many of today's method flow improvements can be regarded as direct improvements to hardware circuit structures. Designers almost always obtain the corresponding hardware circuit structure by programming the improved method flow into a hardware circuit. Thus, it cannot be said that an improvement to a method flow cannot be realized with hardware physical modules. For example, a Programmable Logic Device (PLD), such as a Field Programmable Gate Array (FPGA), is an integrated circuit whose logic functions are determined by the user programming the device. A designer "integrates" a digital system onto a single PLD by programming it, without requiring a chip manufacturer to design and manufacture a dedicated integrated circuit chip. Furthermore, instead of manually making integrated circuit chips, such programming is nowadays mostly implemented with "logic compiler" software, which is similar to the software compiler used in program development; the source code to be compiled must also be written in a specific programming language, called a Hardware Description Language (HDL). There is not just one HDL but many, such as ABEL (Advanced Boolean Expression Language), AHDL (Altera Hardware Description Language), Confluence, CUPL (Cornell University Programming Language), HDCal, JHDL (Java Hardware Description Language), Lava, Lola, MyHDL, PALASM, and RHDL (Ruby Hardware Description Language); VHDL (Very-High-Speed Integrated Circuit Hardware Description Language) and Verilog are currently the most commonly used.
It will also be apparent to those skilled in the art that a hardware circuit implementing a logical method flow can be readily obtained merely by lightly logic-programming the method flow in one of the above hardware description languages and programming it into an integrated circuit.
The controller may be implemented in any suitable manner. For example, the controller may take the form of a microprocessor or processor together with a computer-readable medium storing computer-readable program code (e.g., software or firmware) executable by the (micro)processor, logic gates, switches, an Application Specific Integrated Circuit (ASIC), a programmable logic controller, or an embedded microcontroller. Examples of controllers include, but are not limited to, the following microcontrollers: ARC 625D, Atmel AT91SAM, Microchip PIC18F26K20, and Silicon Labs C8051F320. A memory controller may also be implemented as part of the control logic of a memory. Those skilled in the art will also appreciate that, in addition to implementing the controller as pure computer-readable program code, the same functionality can be implemented by logically programming the method steps such that the controller takes the form of logic gates, switches, application specific integrated circuits, programmable logic controllers, embedded microcontrollers and the like. Such a controller may thus be considered a hardware component, and the means included therein for performing the various functions may also be considered structures within the hardware component. Or the means for performing the functions may even be regarded both as software modules for performing the method and as structures within the hardware component.
The systems, devices, modules or units illustrated in the above embodiments may be implemented by a computer chip or an entity, or by a product with certain functions. One typical implementation device is a computer. In particular, the computer may be, for example, a personal computer, a laptop computer, a cellular telephone, a camera phone, a smartphone, a personal digital assistant, a media player, a navigation device, an email device, a game console, a tablet computer, a wearable device, or a combination of any of these devices.
For convenience of description, the above devices are described as being divided into various units by function, and are described separately. Of course, the functionality of the various elements may be implemented in the same one or more software and/or hardware implementations in practicing the disclosure.
As will be appreciated by one skilled in the art, embodiments of the present disclosure may be provided as a method, system, or computer program product. Accordingly, the present disclosure may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present disclosure may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and so forth) having computer-usable program code embodied therein.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
The disclosure may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. The present disclosure may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.
The embodiments in the present specification are described in a progressive manner, and the same and similar parts among the embodiments are referred to each other, and each embodiment focuses on the differences from the other embodiments. In particular, for the system embodiment, since it is substantially similar to the method embodiment, the description is simple, and for the relevant points, reference may be made to the partial description of the method embodiment.
The above description is only an example of the present disclosure and is not intended to limit the present disclosure. Various modifications and variations of this disclosure will be apparent to those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present disclosure should be included in the scope of the claims of the present disclosure.

Claims (13)

1. A guarantee information acquisition method, comprising:
acquiring a referee document to be processed;
extracting a text to be processed in the referee document based on one or more rules of a case-by constraint rule, a keyword constraint rule and a paragraph constraint rule;
extracting entities included in the text to be processed;
and extracting guarantee information from the text to be processed based on the entity, wherein the guarantee information at least comprises a guaranteeing party, a guaranteed party and a creditor.
2. The method of claim 1, wherein the extracting the text to be processed in the referee document based on the case-by constraint rule comprises:
identifying the case-by field in the referee document to determine the case-by category of the referee document;
and extracting, as the text to be processed, at least the referee documents whose case-by category is the borrowing contract category.
3. The method of claim 1, wherein the extracting the text to be processed from the referee document based on the keyword constraint rule comprises:
and carrying out full-text retrieval on the referee document, and taking the referee document as a text to be processed if specified keywords related to guarantee information are retrieved.
4. The guarantee information acquisition method according to claim 3, wherein the specified keywords related to the guarantee information include at least one or more of "guarantee", "guarantee of responsibility", and "responsibility in connection".
5. The method of claim 1, wherein the extracting text to be processed from the referee document based on the paragraph constraint rule comprises:
structuring the referee document, wherein the structured referee document at least comprises one or more text blocks of a party information text block, a plaintiff's claim text block, a defendant's defense text block, a trial process text block, a court finding text block and a judgment result text block;
and screening the structured referee document to retain the trial process text block, the court finding text block and the judgment result text block as the text to be processed.
6. The method according to claim 1, wherein the entity includes a business entity and/or a personal entity, and the extracting the entity included in the text to be processed includes:
identifying the entity full name mentioned in the text to be processed by adopting an entity identification model;
acquiring a first type regular expression, wherein the first type regular expression is determined according to an expression format of a full entity name and a short entity name in the text to be processed;
and in the text to be processed, performing regular matching based on the first type regular expression to determine a mapping pair of entity full name and entity short name.
7. The guarantee information acquisition method according to claim 1, wherein the guarantee information further includes a guarantee type, and determining the guarantee type comprises:
when the text content of the text to be processed hits the keyword 'joint responsibility', determining the guarantee type as a joint responsibility guarantee;
and when the text content of the text to be processed does not hit the keyword 'joint responsibility', determining the guarantee type as a general responsibility guarantee.
8. The method of claim 1, wherein the extracting of the guarantee information from the text to be processed based on the entity comprises:
acquiring a second type regular expression, wherein the second type regular expression is determined according to the expression formats of the guarantor, the guaranteed party and the creditor in the text to be processed;
in the text to be processed, performing regular matching based on the second type regular expression and the entity to determine an entity matched with the guarantor, an entity matched with the guaranteed party and an entity matched with the creditor;
and determining the entities obtained in the regular matching as the corresponding guarantor, guaranteed party and creditor in the guarantee information.
9. The method of claim 1, wherein the extracting of the guarantee information from the text to be processed based on the entity comprises:
acquiring a third type regular expression, wherein the third type regular expression is determined according to the expression format of the eligible amount information in the text to be processed;
performing regular matching in the text to be processed based on the third type regular expression to extract the eligible amount information;
converting the amount information in Chinese format and/or the amount information in mixed Chinese and numeric format into an amount value in yuan;
and taking the largest of the amount values and determining it as the guaranteed creditor principal in the guarantee information.
10. The guarantee information acquisition method according to any one of claims 1 to 9, wherein the guarantee information collectively includes data of five dimensions of a guarantee type, a guarantor, a guaranteed party, a creditor, and a guaranteed creditor principal, the method further comprising:
and filtering the guarantee information, and reserving guarantee information with five dimensions of which the data are not empty as effective guarantee information.
11. A guarantee information acquisition apparatus, characterized by comprising:
the document acquisition module is used for acquiring a referee document to be processed;
the text determination module is used for extracting a text to be processed in the referee document based on one or more rules of a case-based constraint rule, a keyword constraint rule and a paragraph constraint rule;
the entity extraction module is used for extracting entities included in the text to be processed;
and the guarantee information determining module is used for extracting guarantee information from the text to be processed based on the entity, and the guarantee information at least comprises a guaranteeing party, a guaranteed party and a creditor.
12. A storage medium having a computer program stored thereon, wherein a processor executes the computer program stored on the storage medium to implement the guarantee information acquisition method according to any one of claims 1 to 10.
13. An electronic device, comprising a memory for storing a computer-executable program and a processor for executing the computer-executable program to implement the guarantee information acquisition method according to any one of claims 1 to 10.
CN202210188632.XA 2022-03-01 2022-03-01 Guarantee information acquisition method and device, storage medium and electronic equipment Pending CN114266682A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210188632.XA CN114266682A (en) 2022-03-01 2022-03-01 Guarantee information acquisition method and device, storage medium and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210188632.XA CN114266682A (en) 2022-03-01 2022-03-01 Guarantee information acquisition method and device, storage medium and electronic equipment

Publications (1)

Publication Number Publication Date
CN114266682A true CN114266682A (en) 2022-04-01

Family

ID=80833738

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210188632.XA Pending CN114266682A (en) 2022-03-01 2022-03-01 Guarantee information acquisition method and device, storage medium and electronic equipment

Country Status (1)

Country Link
CN (1) CN114266682A (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110599289A (en) * 2019-07-31 2019-12-20 长春市万易科技有限公司 Method for formatting official document
EP3716104A1 (en) * 2019-03-27 2020-09-30 Siemens Aktiengesellschaft Extracting named entities based using document structure
CN111783449A (en) * 2020-06-24 2020-10-16 鼎富智能科技有限公司 Method and device for extracting elements of judgment result in judgment document
CN113010684A (en) * 2020-12-31 2021-06-22 北京法意科技有限公司 Construction method and system of civil complaint and judgment map
CN114092119A (en) * 2021-11-29 2022-02-25 北京金堤科技有限公司 Supply relation obtaining method and device, storage medium and electronic equipment

Legal Events

Code Title
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20220401