CN116306621A - Violation detection method and device for bidding text and electronic equipment - Google Patents

Violation detection method and device for bidding text and electronic equipment

Info

Publication number
CN116306621A
Authority
CN
China
Prior art keywords
violation
text
keyword
matched
word
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202310587277.8A
Other languages
Chinese (zh)
Other versions
CN116306621B (en
Inventor
贾新
李海运
邵强
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Tuopu Fenglian Information Technology Co ltd
Original Assignee
Beijing Tuopu Fenglian Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Tuopu Fenglian Information Technology Co ltd
Priority to CN202310587277.8A
Publication of CN116306621A
Application granted
Publication of CN116306621B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/284 Lexical analysis, e.g. tokenisation or collocates
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/903 Querying
    • G06F16/90335 Query processing
    • G06F16/90344 Query processing by using string matching techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G06F40/216 Parsing using statistical methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The application provides a violation detection method and device for bidding text and electronic equipment, and relates to the technical field of text violation detection. The method comprises the following steps: acquiring a bidding text to be detected; performing sentence decomposition on the bidding text to be detected according to the text format corresponding to it, to obtain a plurality of project sentences and a plurality of phrases; performing violation matching processing on the plurality of project sentences and the plurality of phrases by using a plurality of keyword entries in a preset violation word library, to obtain a violation index corresponding to the bidding text to be detected; and determining a risk assessment result corresponding to the bidding text to be detected by using the violation index. By presetting a violation word library, the application achieves automatic violation detection of bidding text, ensures the precision and accuracy of the violation detection, and improves screening efficiency.

Description

Violation detection method and device for bidding text and electronic equipment
Technical Field
The application relates to the technical field of violation detection for bidding documents, and in particular to a violation detection method and device for bidding text and electronic equipment.
Background
In the prior art, purchasing is mainly carried out through public bidding. When a relevant department purchases through bidding, it generates bidding documents according to its requirements and publishes them to suppliers, and the suppliers bid according to the contents of the bidding documents.
However, the contents of the bidding documents may violate policy documents. In the prior art, dedicated reviewers check the contents of the bidding documents: if a problem is found, the documents are returned to the drafting department for modification; if no problem is found, the documents are passed directly. With this manual review, reviewers who are unfamiliar with the policy documents are likely to miss violating content; in addition, manual review is inefficient and its comprehensiveness is difficult to guarantee.
Disclosure of Invention
In view of this, the purpose of the present application is to provide at least a violation detection method and device for bidding text and electronic equipment, which achieve automatic violation detection of bidding text by presetting a violation word library, thereby ensuring the accuracy and precision of the violation detection and improving screening efficiency.
The application mainly comprises the following aspects:
in a first aspect, an embodiment of the present application provides a method for detecting violations of a bid text, where the method includes: acquiring a bidding text to be detected; performing sentence decomposition on the bid text to be detected according to a text format corresponding to the bid text to be detected to obtain a plurality of project sentences and a plurality of phrases; performing violation matching processing on a plurality of project sentences and a plurality of phrases by utilizing a plurality of keyword entries in a preset violation word library to obtain a violation index corresponding to a to-be-detected bid text; and determining a risk assessment result corresponding to the bid text to be detected by using the violation index.
In one possible implementation, the plurality of project sentences and the plurality of phrases corresponding to the to-be-detected bid text are obtained by: removing the tables in the to-be-detected bid text to obtain the processed to-be-detected bid text; and disassembling the processed to-be-detected bid text according to the text format to obtain the plurality of project sentences and the plurality of phrases.
In one possible implementation manner, the text format includes a document format and a crawler format, wherein the step of disassembling the processed to-be-detected bid text according to the text format to obtain a plurality of project sentences and a plurality of phrases includes: if the text format corresponding to the to-be-detected bid text is the document format, disassembling the processed to-be-detected bid text directly according to periods, semicolons and commas to obtain the plurality of project sentences and the plurality of phrases; and if the text format corresponding to the to-be-detected bid text is the crawler format, disassembling the processed to-be-detected bid text by sequentially using </table>, the tag and the space character to obtain a plurality of paragraphs, and segmenting the paragraphs by periods, semicolons and commas to obtain the plurality of project sentences and the plurality of phrases.
In a possible implementation manner, the keyword entries include a plurality of keywords having a combination logical relationship with each other, wherein the step of performing the rule violation matching processing on the plurality of item sentences and the plurality of phrases by using the plurality of keyword entries in the preset rule violation word library to obtain the rule violation index corresponding to the to-be-detected bid text includes: aiming at each keyword entry, recombining a plurality of keywords according to a combination logic relation among the keywords corresponding to the keyword entry to obtain a plurality of keyword groups corresponding to the keyword entry; constructing a word stock set by a plurality of keywords corresponding to each keyword entry; screening a plurality of key phrases formed by a preset illegal word bank by using the word bank set to obtain a plurality of key phrases to be matched; and carrying out violation matching processing on the plurality of project sentences and/or the plurality of phrases by utilizing the plurality of key phrases to be matched to obtain the violation index corresponding to the to-be-detected bid text.
In one possible implementation manner, the plurality of keyword groups include first keyword groups and second keyword groups, and the combination logical relationship includes a logical OR relationship, a first logical AND relationship and a second logical AND relationship. The logical OR relationship indicates that two keywords are alternatives; the first logical AND relationship indicates that two keywords are in a parallel relationship and their order is variable; the second logical AND relationship indicates that two keywords are in a parallel relationship and their order is fixed. The plurality of keyword groups corresponding to each keyword entry are determined as follows: recombining the plurality of keywords in the keyword entry according to the logical OR relationship, the first logical AND relationship and/or the second logical AND relationship among them, to obtain a plurality of first keyword groups; and performing synonym replacement and addition on the first keyword groups according to a plurality of synonyms recorded in the preset violation word library, to obtain a plurality of second keyword groups.
In one possible implementation manner, the step of screening a plurality of key phrases formed by a preset violation word bank by using the word bank set to obtain a plurality of key phrases to be matched includes: determining a plurality of word lengths related to all keywords in the word stock set; aiming at each word length, segmenting a plurality of item sentences according to the word length to obtain a word set corresponding to the word length; for each word set, determining an intersection between the word set and a word stock set; determining a plurality of target keywords from the word stock set by corresponding intersections of the word sets; and determining the keyword group corresponding to each target keyword as a keyword group to be matched according to each target keyword.
In one possible implementation manner, the step of performing the rule violation matching processing on the plurality of item sentences and/or the plurality of phrases by using the plurality of key phrases to be matched to obtain the rule violation index corresponding to the to-be-detected bid text includes: for each keyword group to be matched, carrying out illegal matching processing on a plurality of project sentences or a plurality of phrases according to a matching mode, forbidden phrases and forbidden word groups corresponding to the keyword group to be matched recorded in a preset illegal word library to obtain a matching result between the keyword group to be matched and each project sentence or each phrase, wherein the forbidden phrases comprise a plurality of forbidden words, and the forbidden word groups comprise a plurality of forbidden words; and determining the violation index corresponding to the bidding text to be detected according to a plurality of matching results corresponding to each keyword group to be matched.
In one possible implementation manner, the matching manner includes item sentence matching and phrase matching, wherein the plurality of matching results corresponding to each keyword group to be matched are determined as follows: if the matching manner of the keyword group to be matched is item sentence matching, then for each item sentence, judging whether the keyword group to be matched exists in the item sentence; if the keyword group to be matched exists in the item sentence and none of the forbidden words and forbidden word groups corresponding to the keyword group to be matched exists in the item sentence, determining that the item sentence and the keyword group to be matched are successfully matched; and if any forbidden word or forbidden word group corresponding to the keyword group to be matched exists in the item sentence, determining that the matching between the item sentence and the keyword group to be matched fails. If the matching manner of the keyword group to be matched is phrase matching, then for each phrase, judging whether the keyword group to be matched exists in the phrase; if the keyword group to be matched exists in the phrase and none of the forbidden words and forbidden word groups corresponding to the keyword group to be matched exists in the phrase, determining that the phrase and the keyword group to be matched are successfully matched; and if any forbidden word or forbidden word group corresponding to the keyword group to be matched exists in the phrase, determining that the matching between the phrase and the keyword group to be matched fails.
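The matching rule in this implementation can be illustrated with a minimal Python sketch. It is not the applicant's implementation: the function names `match_unit` and `match_all` are hypothetical, a keyword group is simplified to a list of keywords that must all occur as substrings (order is ignored here), and forbidden words and forbidden word groups are merged into one flat list.

```python
def match_unit(unit: str, keywords: list[str], forbidden: list[str]) -> bool:
    """Return True when `unit` (an item sentence or a phrase) matches.

    Simplification (assumption): the keyword group "exists" in the unit
    when every keyword occurs as a substring, regardless of order.
    """
    if not all(kw in unit for kw in keywords):
        return False  # keyword group absent: no match
    if any(fw in unit for fw in forbidden):
        return False  # any forbidden word vetoes the match
    return True


def match_all(units: list[str], keywords: list[str], forbidden: list[str]) -> list[bool]:
    """One matching result per unit, in input order."""
    return [match_unit(u, keywords, forbidden) for u in units]


# Illustrative sentences only; they are not taken from the application.
sentences = [
    "bidders shall provide tax payment certificates for the last three months",
    "tax payment certificates for three months are not required",
    "bidders shall provide a business licence",
]
results = match_all(sentences, ["three months", "tax payment"], ["not required"])
```

The same function serves both matching manners: pass item sentences for item sentence matching, or phrases for phrase matching.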
In one possible implementation, the violation index corresponding to the bid text to be detected is determined by: for each matching result, if the matching result indicates that the matching is successful, determining the matching result as a target matching result; aiming at each target matching result, determining the violation probability corresponding to the keyword item of the keyword group to be matched corresponding to the target matching result as the violation probability corresponding to the target matching result; determining a violation index corresponding to the bidding text to be detected according to the violation probability corresponding to each target matching result; wherein the violation index is determined by the following formula:
[Formula image SMS_1 is not reproduced in this text.]
In this formula, the symbol shown in image SMS_2 denotes the violation probability corresponding to the n-th target matching result, and the symbol shown in image SMS_3 denotes the violation index.
In one possible embodiment, the risk assessment result includes a risk level and a plurality of pieces of violation information including a violation category, a violation of a rule, and a violation case, wherein the risk level and the plurality of pieces of violation information are determined by: determining a target index range interval to which the violation index belongs; determining a target risk level corresponding to the bidding text to be detected according to the corresponding relation between the index range intervals and the risk levels; and determining a plurality of target keyword entries related to a plurality of target matching results, and acquiring violation information corresponding to each target keyword entry from a preset violation word library.
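The mapping from violation index to risk level, and the lookup of violation information, might be sketched as follows. The interval boundaries, the level names, the entry identifier and the example lexicon record are all hypothetical; the application does not state concrete values, and the range of the violation index is not recoverable from this text.

```python
from bisect import bisect_right

# Hypothetical interval boundaries and risk levels (not given in the source):
# [.., 0.3) -> low, [0.3, 0.7) -> medium, [0.7, ..) -> high
BOUNDS = [0.3, 0.7]
LEVELS = ["low", "medium", "high"]


def risk_level(violation_index: float) -> str:
    """Find the index range interval containing the index and return its level."""
    return LEVELS[bisect_right(BOUNDS, violation_index)]


def assess(violation_index: float, matched_entry_ids: list[str], lexicon: dict) -> dict:
    """Combine the risk level with the violation info of each matched entry."""
    info = [lexicon[entry_id] for entry_id in matched_entry_ids]
    return {"risk_level": risk_level(violation_index), "violations": info}


# Hypothetical lexicon record keyed by a made-up entry id.
lexicon = {
    "entry-1": {"category": "example category",
                "regulation": "example regulation",
                "case": "example case"},
}
report = assess(0.82, ["entry-1"], lexicon)
```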
In a second aspect, an embodiment of the present application further provides a device for detecting violations of a bid text, where the device includes: the acquisition module is used for acquiring the bidding text to be detected; the disassembly module is used for carrying out sentence disassembly on the to-be-detected bid text according to a text format corresponding to the to-be-detected bid text to obtain a plurality of project sentences and a plurality of phrases; the violation detection module is used for carrying out violation matching processing on a plurality of project sentences and a plurality of phrases by utilizing a plurality of keyword entries in a preset violation word library to obtain a violation index corresponding to the to-be-detected bid text; and the risk assessment module is used for determining a risk assessment result corresponding to the bid text to be detected by using the violation index.
In a third aspect, embodiments of the present application further provide an electronic device, including: a processor, a memory and a bus, the memory storing machine readable instructions executable by the processor, the processor and the memory communicating via the bus when the electronic device is running, and the machine readable instructions, when executed by the processor, performing the steps of the method for detecting violations of the bid text as described in the first aspect or any possible implementation of the first aspect.
The embodiment of the application provides a violation detection method and device for bidding text and electronic equipment, wherein the method comprises the following steps: acquiring a bidding text to be detected; performing sentence decomposition on the bidding text to be detected according to the text format corresponding to it, to obtain a plurality of project sentences and a plurality of phrases; and performing violation matching processing on the plurality of project sentences and the plurality of phrases by using a plurality of keyword entries in a preset violation word library, to obtain a violation detection result corresponding to the bidding text to be detected. By presetting a violation word library, the application achieves automatic violation detection of bidding text, ensures the precision and accuracy of the violation detection, and improves screening efficiency.
In order to make the above objects, features and advantages of the present application more comprehensible, preferred embodiments accompanied with figures are described in detail below.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are needed in the embodiments will be briefly described below, it being understood that the following drawings only illustrate some embodiments of the present application and therefore should not be considered limiting the scope, and that other related drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 shows a flowchart of a method for detecting violations of a bid text provided by an embodiment of the present application;
FIG. 2 shows a first flowchart of a method for determining a violation index provided by an embodiment of the present application;
FIG. 3 shows a second flowchart of a method for determining a violation index provided by an embodiment of the present application;
FIG. 4 shows a schematic structural diagram of a violation detection device for bidding text according to an embodiment of the present application;
FIG. 5 shows a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present application more clear, the technical solutions of the embodiments of the present application will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present application, and it should be understood that the accompanying drawings in the present application are only for the purpose of illustration and description, and are not intended to limit the protection scope of the present application. In addition, it should be understood that the schematic drawings are not drawn to scale. A flowchart, as used in this application, illustrates operations implemented according to some embodiments of the present application. It should be appreciated that the operations of the flow diagrams may be implemented out of order and that steps without logical context may be performed in reverse order or concurrently. Moreover, one or more other operations may be added to the flow diagrams and one or more operations may be removed from the flow diagrams as directed by those skilled in the art.
In addition, the described embodiments are only some, but not all, of the embodiments of the present application. The components of the embodiments of the present application, which are generally described and illustrated in the figures herein, may be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments of the present application, as provided in the accompanying drawings, is not intended to limit the scope of the application, as claimed, but is merely representative of selected embodiments of the application. All other embodiments, which can be made by those skilled in the art based on the embodiments of the present application without making any inventive effort, are intended to be within the scope of the present application.
Governments mainly purchase through public bidding. Public bidding is a purchasing mode in which the purchaser, following the legal procedure, invites bids by issuing a bidding notice, so that all potential, unspecified suppliers may participate; the purchaser then selects the winning supplier from all bidding suppliers according to predetermined criteria and signs a government procurement contract with it. When a government department purchases through bidding, it generates bidding documents according to its requirements and publishes them to suppliers, and the suppliers bid according to the contents of the bidding documents.
The contents of the bidding documents may violate policy documents. In the prior art, reviewers check the contents of the bidding documents: if a violation is found, the documents are returned to the drafting department with a prompt to modify them; if no violation is found, the documents are passed directly.
With this manual review, reviewers who are unfamiliar with the policy documents are likely to miss violating content; in addition, manual review is inefficient and its comprehensiveness is difficult to guarantee.
Based on this, the embodiment of the application provides a method, a device and an electronic device for detecting violations of a bid-tendering text, which realize automatic violation detection of the bid-tendering text by presetting a violation word bank, ensure the precision and accuracy of the violation detection and improve the screening efficiency, and are specifically as follows:
Referring to FIG. 1, FIG. 1 shows a flowchart of a method for detecting violations of a bidding document according to an embodiment of the present application. As shown in FIG. 1, the method provided in the embodiment of the present application may be applied to a bidding document auditing system, and specifically includes the following steps:
S100, acquiring a bidding text to be detected.
Specifically, the user can upload the bidding text to be detected to the bidding text auditing system through a relevant channel; alternatively, the bidding text to be detected can be obtained by data crawling. Because the channel sources differ, the text formats of the bidding texts also differ: the text formats include a document format and a crawler format, and the document format includes PDF and Word.
S110, carrying out sentence decomposition on the to-be-detected bid text according to a text format corresponding to the to-be-detected bid text, and obtaining a plurality of project sentences and a plurality of phrases.
In a preferred embodiment, the plurality of item sentences and the plurality of phrases corresponding to the bid text to be detected are obtained by:
Removing the tables in the bidding text to be detected to obtain the processed bidding text to be detected, and disassembling the processed bidding text to be detected according to the text format to obtain a plurality of project sentences and a plurality of phrases.
Specifically, in the case that the bidding text to be detected is in a document format, table contents in the bidding text to be detected are not used as objects of violation detection processing, and the violation detection in the application only relates to text contents.
In another preferred embodiment, the step of decomposing the bid text to be detected according to the text format to obtain a plurality of project sentences and a plurality of phrases includes:
If the text format corresponding to the to-be-detected bid text is the document format, the processed to-be-detected bid text is directly disassembled according to periods, semicolons and commas to obtain a plurality of project sentences and a plurality of phrases. If the text format corresponding to the to-be-detected bid text is the crawler format, the processed to-be-detected bid text is disassembled by using </table>, the tag and the space character in sequence to obtain a plurality of paragraphs, and the paragraphs are segmented by periods, semicolons and commas to obtain a plurality of project sentences and a plurality of phrases.
Specifically, the text paragraphs of a to-be-detected bid text in the document format are standard and clear, so the text content can be disassembled paragraph by paragraph. For each paragraph, the paragraph is first disassembled into a plurality of natural sentences according to periods and semicolons; each natural sentence is then further disassembled by commas to form a plurality of item sentences. Specifically, if a natural sentence contains two or more commas, the content immediately before and after each comma is determined to be one item sentence. For example, if the paragraph is "a, b, c; d.", splitting yields the natural sentences "a, b, c" and "d", and further splitting yields the three item sentences "a, b", "b, c" and "d"; the paragraph "a, b, c; d." is then split directly according to periods, semicolons and commas to obtain the four phrases "a", "b", "c" and "d".
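The document-format splitting rule can be sketched in Python as follows. This is an illustration under stated assumptions, not the applicant's code: the function names are hypothetical, the punctuation sets cover both ASCII and full-width Chinese marks, and a natural sentence with fewer than two commas is treated as a single item sentence. Splitting on "." would also break decimal numbers, which a real implementation would need to handle.

```python
import re

# Assumed punctuation sets: ASCII marks plus their full-width Chinese forms.
SENTENCE_END = r"[.;。；]"
COMMA = r"[,，]"


def split_natural_sentences(paragraph: str) -> list[str]:
    """Split a paragraph at periods and semicolons."""
    return [p.strip() for p in re.split(SENTENCE_END, paragraph) if p.strip()]


def split_item_sentences(sentence: str) -> list[str]:
    """Pair the segments before and after each comma into item sentences."""
    segments = [s.strip() for s in re.split(COMMA, sentence) if s.strip()]
    if not segments:
        return []
    if len(segments) <= 2:
        # Fewer than two commas: keep the natural sentence as one item sentence.
        return [", ".join(segments)]
    return [f"{left}, {right}" for left, right in zip(segments, segments[1:])]


def split_phrases(paragraph: str) -> list[str]:
    """Split a paragraph at every period, semicolon and comma."""
    parts = re.split(f"{SENTENCE_END}|{COMMA}", paragraph)
    return [p.strip() for p in parts if p.strip()]


def disassemble(paragraph: str) -> tuple[list[str], list[str]]:
    """Return (item sentences, phrases) for one paragraph."""
    items: list[str] = []
    for sentence in split_natural_sentences(paragraph):
        items.extend(split_item_sentences(sentence))
    return items, split_phrases(paragraph)


items, phrases = disassemble("a, b, c; d.")
# items   == ["a, b", "b, c", "d"]
# phrases == ["a", "b", "c", "d"]
```

Overlapping item sentences keep each comma's immediate context together, so a keyword group that straddles a comma can still be found in one unit.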
In another embodiment, for a to-be-detected bid text in the crawler format, the text content is not organised in standard paragraphs, so paragraph division must be completed using the tag, the space and </table> together. Specifically, the text content is first divided by </table>; in each divided part, the last position of "<table" is located and only the content before that "<table" is retained; each retained part is then further divided using the tag (i.e. the separator) and the space to obtain a plurality of natural sentences. The subsequent division of the natural sentences is similar to that for the document format and is not repeated here.
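A possible reading of the crawler-format procedure is sketched below. Two points are assumptions rather than facts from the application: the "tag (i.e. the separator)" is taken to be the tab character, and the closing marker is written as `</table>` without the stray space seen in the translated text. The function names are hypothetical.

```python
import re


def strip_tables(raw: str) -> list[str]:
    """Split at </table> and keep only the text before the last "<table"."""
    kept = []
    for part in raw.split("</table>"):
        pos = part.rfind("<table")
        # If the part holds no table opening, keep it whole.
        kept.append(part if pos == -1 else part[:pos])
    return kept


def crawler_paragraphs(raw: str) -> list[str]:
    """Divide the retained text at tabs and spaces (assumed separators)."""
    paragraphs: list[str] = []
    for part in strip_tables(raw):
        paragraphs.extend(p for p in re.split(r"[\t ]+", part) if p)
    return paragraphs


raw = "intro,one\tsecond;part<table><tr><td>cell</td></tr></table>tail text"
paragraphs = crawler_paragraphs(raw)
# paragraphs == ["intro,one", "second;part", "tail", "text"]
```

Splitting at spaces suits Chinese text, where spaces do not separate words; for English input it would break the text into single words.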
And S120, carrying out violation matching processing on a plurality of project sentences and a plurality of phrases by utilizing a plurality of keyword entries in a preset violation word library to obtain a violation index corresponding to the to-be-detected bid text.
Preferably, the preset violation word library comprises a plurality of keyword entries, each keyword entry comprises a plurality of attribute fields, and the attribute fields comprise a keyword field, a forbidden word field, a matching mode field, an enabling flag field, a violation risk field, a violation probability field, a synonym field, a forbidden word field, a violation category field, a violation regulation field, a violation entry field, a case name field, a case brief field and a case link field, wherein the synonym field and the forbidden word field are common fields of the plurality of keyword entries.
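One way to model such a word library is sketched below with Python dataclasses. All class names, attribute names and types are hypothetical. Reading the second "forbidden word field" as a shared list of forbidden word groups is an assumption based on the forbidden phrases and forbidden word groups mentioned in the summary.

```python
from dataclasses import dataclass, field


@dataclass
class KeywordEntry:
    """One keyword entry; attribute names are illustrative only."""
    keyword_field: str                      # expression with OR / AND operators
    forbidden_words: list[str] = field(default_factory=list)
    match_mode: str = "item_sentence"       # or "phrase"
    enabled: bool = True                    # enabling flag
    violation_risk: str = ""
    violation_probability: float = 0.0
    violation_category: str = ""
    violation_regulation: str = ""
    violation_entry: str = ""
    case_name: str = ""
    case_brief: str = ""
    case_link: str = ""


@dataclass
class ViolationLexicon:
    """The word library: entries plus the fields shared by all entries."""
    entries: list[KeywordEntry]
    synonyms: list[set[str]] = field(default_factory=list)
    forbidden_word_groups: list[list[str]] = field(default_factory=list)

    def active_entries(self) -> list[KeywordEntry]:
        """Only entries whose enabling flag is set take part in matching."""
        return [e for e in self.entries if e.enabled]


lexicon = ViolationLexicon(entries=[
    KeywordEntry(keyword_field="(e|f)×g", violation_probability=0.6),
    KeywordEntry(keyword_field="h", enabled=False),
])
```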
In a preferred embodiment, referring to fig. 2, fig. 2 is a flowchart illustrating a method for determining a violation index according to an embodiment of the present application. As shown in fig. 2, step S120 includes:
S121, for each keyword entry, recombining the keywords according to the combination logic relation among the keywords corresponding to the keyword entry to obtain a plurality of keyword groups corresponding to the keyword entry.
In a specific embodiment, the keyword fields in the keyword entries include a plurality of keywords having a combination logic relationship therebetween, the combination logic relationship includes a logical or relationship, a first logical and relationship and a second logical and relationship, the logical or relationship indicates that the two keywords are or relationship, the first logical and relationship indicates that the two keywords are in parallel relationship and the sequence is variable, the second logical and relationship indicates that the two words are in parallel relationship and the sequence is not variable, and the plurality of keyword groups include a first keyword group and a second keyword group.
For example, suppose the keyword field is: (continuous|near)×(three months|three months)&(tax payment|tax requirements). It includes the keywords continuous, near, three months, tax payment and tax requirements, where "×" represents the second logical AND relationship, "|" represents the logical OR relationship, and "&" represents the first logical AND relationship.
Specifically, the plurality of keyword groups corresponding to each keyword entry are determined by the following methods:
Recombining the plurality of keywords in the keyword entry according to the logical OR relationship, the first logical AND relationship and/or the second logical AND relationship among them, to obtain a plurality of first keyword groups.
For example, (e|f)×g can be split into e×g and f×g. Note that under the second logical AND relationship "×", the order of the two parts before and after it cannot change when splitting; that is, the splitting result cannot be g×e. Specifically, for the keyword field of the preceding example, when the first keyword group "continuous three months & tax payment" is used to match an item sentence, the keywords "continuous", "three months" and "tax payment" must all appear in the sentence, and the order in which "continuous" and "three months" appear cannot be reversed.
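The recombination step can be sketched as a small expression expander. This is a simplified illustration, not the applicant's parser: it assumes a flat expression without nested parentheses, treats "×" as binding more tightly than "&", and represents each first keyword group as a list of ordered chunks (tuples). All function names are hypothetical.

```python
import re
from itertools import product

ORDERED_AND = "×"    # second logical AND: order is fixed
UNORDERED_AND = "&"  # first logical AND: order is free


def parse_field(expr: str):
    """Split an expression into alternative lists and the operators between them."""
    tokens = re.split(f"([{ORDERED_AND}{UNORDERED_AND}])", expr)
    operands = [[alt.strip() for alt in t.strip().strip("()").split("|")]
                for t in tokens[0::2]]
    operators = tokens[1::2]
    return operands, operators


def expand(expr: str) -> list[list[tuple[str, ...]]]:
    """Expand OR alternatives into first keyword groups."""
    operands, operators = parse_field(expr)
    groups = []
    for choice in product(*operands):
        chunks = [[choice[0]]]
        for op, keyword in zip(operators, choice[1:]):
            if op == ORDERED_AND:
                chunks[-1].append(keyword)   # stays in the same ordered chunk
            else:
                chunks.append([keyword])     # "&" starts a new, order-free chunk
        groups.append([tuple(c) for c in chunks])
    return groups


def group_matches(group: list[tuple[str, ...]], sentence: str) -> bool:
    """Every chunk must occur; keywords inside a chunk must occur in order."""
    for chunk in group:
        pos = 0
        for keyword in chunk:
            idx = sentence.find(keyword, pos)
            if idx == -1:
                return False
            pos = idx + len(keyword)
    return True


simple = expand("(e|f)×g")
# simple == [[("e", "g")], [("f", "g")]]
tax = expand("(continuous|near)×(three months)&(tax payment)")
```

With `tax[0]`, the sentence "tax payment records for continuous three months" matches, while "three months of continuous tax payment" does not, because "three months" no longer follows "continuous".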
Performing synonym replacement and addition on the first keyword groups according to the synonyms recorded in the preset violation word library, to obtain a plurality of second keyword groups.
Specifically, the plurality of first keyword groups include "continuous three months & tax payment", "near three months & tax payment" and "near three months & tax payment", and the synonym group in the synonym field includes "continuous|continuous", which indicates that the keyword "continuous" has a synonym. The first keyword groups containing "continuous" therefore undergo synonym replacement, yielding a plurality of second keyword groups: "continuous three months & tax payment" and "continuous three months & tax payment".
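Synonym replacement and addition can be sketched as below. The function name and the tuple representation of a keyword group are hypothetical, and "consecutive" is a made-up synonym used only because the translated example renders both synonyms as "continuous".

```python
from itertools import product


def add_synonym_groups(first_groups, synonym_sets):
    """Return second keyword groups: every synonym variant not already present."""
    # Map each word to the sorted list of its synonyms (itself included).
    lookup = {}
    for synonyms in synonym_sets:
        for word in synonyms:
            lookup[word] = sorted(synonyms)

    existing = {tuple(group) for group in first_groups}
    second = []
    for group in first_groups:
        options = [lookup.get(keyword, [keyword]) for keyword in group]
        for variant in product(*options):
            if variant not in existing and variant not in second:
                second.append(variant)
    return second


first_groups = [
    ("continuous", "three months", "tax payment"),
    ("near", "three months", "tax payment"),
]
# "consecutive" is a hypothetical synonym for illustration.
second_groups = add_synonym_groups(first_groups, [{"continuous", "consecutive"}])
# second_groups == [("consecutive", "three months", "tax payment")]
```

The second keyword groups are added alongside the first keyword groups, so both the original wording and its synonym variants are available for matching.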
S122, constructing a word stock set by a plurality of keywords corresponding to each keyword entry.
S123, screening a plurality of key phrases formed by a preset illegal word bank by using the word bank set to obtain a plurality of key phrases to be matched.
Specifically, after the plurality of keyword groups corresponding to each keyword entry are determined, the keyword groups are first screened using the keywords in the preset violation word library in order to reduce the amount of matching computation. Matching the text to be detected directly against all of the disassembled keyword groups is inefficient: for example, if the text to be detected contains millions of project sentences, matching every one of them against every keyword group requires a very large amount of computation. Therefore, the present application first performs an inclusion judgment with the keywords in the preset violation word library, that is, it screens out the target keywords that actually occur in the text to be detected, and then uses these target keywords to screen the keyword groups, obtaining the corresponding keyword groups to be matched. This greatly reduces the amount of computation and improves matching efficiency.
Specifically, step S123 includes:
determining a plurality of word lengths involved in all keywords in the word stock set; for each word length, segmenting the plurality of item sentences according to the word length to obtain a word set corresponding to the word length; for each word set, determining the intersection between the word set and the word stock set; determining a plurality of target keywords from the word stock set according to the intersection corresponding to each word set; and, for each target keyword, determining the keyword group corresponding to the target keyword as a keyword group to be matched.
Specifically, suppose the word stock set is [home city, tax payment], so that the word lengths are determined to be 2 and 4, and the text to be detected is "home city continuous three months tax payment". For the first word length, the text is cut with a sliding window of that length into the word set "home city, city continuous, continuous three, month tax payment"; for the second word length, it is cut in the same way into the word set "home city continuous, city continuous three, continuous three tax, tax payment". The intersection of each word set with the word stock set is then calculated: the intersection for the first word set includes the target keyword "home city", and the intersection for the second word set includes the target keyword "tax payment". The keyword groups containing the target keywords "home city" and "tax payment" are then determined to be the keyword groups to be matched.
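The screening step can be illustrated with the following sketch, which cuts each sentence into fixed-length fragments and intersects them with the word stock set. It assumes character-level sliding windows and uses toy English strings in place of the original example; the names `ngrams` and `screen_groups` and the mapping from keyword to keyword groups are hypothetical.

```python
def ngrams(text, n):
    """Return every sliding-window fragment of length n found in text."""
    return {text[i:i + n] for i in range(len(text) - n + 1)}

def screen_groups(sentences, groups_by_keyword):
    """Keep only the keyword groups whose keyword occurs in some sentence.

    groups_by_keyword maps each keyword of the word stock set to the keyword
    groups that contain it (a hypothetical index built beforehand).
    """
    lexicon = set(groups_by_keyword)
    lengths = {len(keyword) for keyword in lexicon}
    found = set()
    for n in lengths:
        word_set = set()
        for sentence in sentences:
            word_set |= ngrams(sentence, n)
        # Set intersection replaces a keyword-by-keyword scan of every sentence.
        found |= word_set & lexicon
    to_match = []
    for keyword in sorted(found):
        for group in groups_by_keyword[keyword]:
            if group not in to_match:
                to_match.append(group)
    return found, to_match

found, to_match = screen_groups(
    ["city tax for three months"],
    {"tax": ["tax & city"], "city": ["tax & city"], "fine": ["fine & fee"]},
)
```

In this toy run the group "fine & fee" is discarded before any detailed matching takes place, because the keyword "fine" never appears among the fragments of the sentence.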
S124, performing violation matching processing on the plurality of project sentences and/or the plurality of phrases by using the plurality of keyword groups to be matched, to obtain the violation index corresponding to the bidding text to be detected.
In a preferred embodiment, referring to fig. 3, fig. 3 shows a second flowchart of a method for determining a violation index according to an embodiment of the present application. As shown in fig. 3, step S124 includes:
S1241, for each keyword group to be matched, performing violation matching processing on the plurality of project sentences or the plurality of phrases according to the matching mode, the forbidden word group and the forbidden-extraction word group recorded in the preset violation word library for the keyword group to be matched, to obtain a matching result between the keyword group to be matched and each project sentence or each phrase; wherein the forbidden word group is composed of forbidden words (words that cannot be used together with the keyword group), and the forbidden-extraction word group is composed of forbidden-extraction words (words from which a keyword must not be extracted).
Specifically, the matching mode of the keyword groups to be matched that correspond to each keyword entry can be determined from the matching mode field in the preset violation word library. The matching modes include item sentence matching and phrase matching: item sentence matching checks whether the keyword group to be matched appears in an item sentence, and phrase matching checks whether it appears in a phrase. For example, for each keyword entry, if the corresponding matching mode field is 1, the matching mode of the keyword groups to be matched for that entry is item sentence matching; if the field is 0, the matching mode is phrase matching.
The forbidden word field in the preset violation word library indicates the forbidden word group corresponding to each keyword entry, that is, forbidden word groups and keyword entries are in a corresponding relationship. The preset violation word library contains only one forbidden-extraction word group, which is shared by the plurality of keyword entries. A forbidden word is a word that cannot be used together with the keyword group, and a forbidden-extraction word is a word from which a keyword must not be extracted.
In another preferred embodiment, the multiple matching results corresponding to each keyword group to be matched are determined by:
If the matching mode of the keyword group to be matched is item sentence matching, it is judged, for each item sentence, whether the keyword group to be matched exists in the item sentence. If the keyword group to be matched exists in the item sentence, and none of the forbidden words and forbidden-extraction words (words from which a keyword must not be extracted) corresponding to the keyword group to be matched exists in the item sentence, the item sentence and the keyword group to be matched are determined to be successfully matched. If any forbidden word or forbidden-extraction word corresponding to the keyword group to be matched exists in the item sentence, the match between the item sentence and the keyword group to be matched is determined to have failed.
In a specific embodiment, suppose the keyword group to be matched is "continuous three months of tax payment & home market", the corresponding forbidden-extraction word group (words from which a keyword must not be extracted) includes "home market|capital market", the forbidden word group includes "no continuous three months of tax payment & home market", and the item sentence is "tax payment record continuously participating in home market for three months". When the keyword group to be matched is matched against this item sentence, the keywords "continuous", "three months", "tax payment" and "home market" all exist in the sentence; "continuous" and "three months" appear in the order specified by the keyword group to be matched; the forbidden word does not exist in the sentence; and the "home market" appearing in the sentence is not extracted from "capital market". In summary, the matching result between the keyword group to be matched and the item sentence in this example is a success.
In another case, suppose the item sentence in the above example is "the tax payment record continuously participating in three months in the capital market is ...". Although no forbidden word exists in the item sentence, the "home market" found when matching the keyword "home market" is a fragment extracted from "capital market", which meets the forbidden-extraction condition; moreover, "home market" does not appear anywhere in the item sentence outside "capital market". In this situation the matching result between the keyword group to be matched and the item sentence is a failure. By contrast, if the item sentence is "tax payment record of the home market for three consecutive months", the matching result between the keyword group to be matched and the item sentence is a success.
If the matching mode of the keyword group to be matched is phrase matching, it is judged, for each phrase, whether the keyword group to be matched exists in the phrase. If the keyword group to be matched exists in the phrase, and none of the forbidden words and forbidden-extraction words (words from which a keyword must not be extracted) corresponding to the keyword group to be matched exists in the phrase, the phrase and the keyword group to be matched are determined to be successfully matched. If any forbidden word or forbidden-extraction word corresponding to the keyword group to be matched exists in the phrase, the match between the phrase and the keyword group to be matched is determined to have failed.
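The matching rules above can be sketched as follows. The sketch is one possible reading under stated assumptions rather than the implementation of the present application: a keyword group is modelled as a tuple of ordered chains, a keyword occurrence is ignored when it lies entirely inside a word from which it must not be extracted, and the English strings (with "city" hidden inside "capacity") merely stand in for the original example. All function and parameter names are hypothetical.

```python
def find_free_positions(text, keyword, extraction_words):
    """Start indices of keyword occurrences that do not lie inside an extraction word."""
    blocked = []
    for word in extraction_words:
        start = text.find(word)
        while start != -1:
            blocked.append((start, start + len(word)))
            start = text.find(word, start + 1)
    positions = []
    start = text.find(keyword)
    while start != -1:
        end = start + len(keyword)
        if not any(b <= start and end <= e for b, e in blocked):
            positions.append(start)
        start = text.find(keyword, start + 1)
    return positions

def match(text, group, forbidden_words=(), extraction_words=()):
    """group is a tuple of chains; keywords inside one chain must appear in order."""
    if any(word in text for word in forbidden_words):
        return False  # a co-occurring forbidden word vetoes the match
    for chain in group:
        cursor = 0
        for keyword in chain:
            later = [p for p in find_free_positions(text, keyword, extraction_words)
                     if p >= cursor]
            if not later:
                return False  # keyword missing, out of order, or only "extracted"
            cursor = later[0] + len(keyword)
    return True

def match_all(mode, sentences, phrases, group, **rules):
    """mode 1 selects item sentence matching, mode 0 selects phrase matching."""
    units = sentences if mode == 1 else phrases
    return [match(unit, group, **rules) for unit in units]

GROUP = (("three", "months"), ("tax",), ("city",))
RULES = {"forbidden_words": ("not required",), "extraction_words": ("capacity",)}
ok = match("pay tax in this city for three consecutive months", GROUP, **RULES)
bad = match("pay tax at full capacity for three consecutive months", GROUP, **RULES)
```

Here `ok` is a successful match, while `bad` fails because the only "city" in the text is a fragment of "capacity", which corresponds to the "home market" versus "capital market" case described above.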
S1242, determining the violation index corresponding to the bidding text to be detected according to a plurality of matching results corresponding to each keyword group to be matched.
In a preferred embodiment, the violation index corresponding to the bid text to be detected is determined by:
For each matching result, if the matching result indicates a successful match, the matching result is determined to be a target matching result. For each target matching result, the violation probability corresponding to the keyword entry of the keyword group to be matched for that target matching result is determined to be the violation probability corresponding to the target matching result. The violation index corresponding to the bidding text to be detected is then determined according to the violation probability corresponding to each target matching result.
Specifically, the value recorded in the violation probability field corresponding to each keyword entry in the preset violation word library indicates the violation probability corresponding to that keyword entry.
Wherein the violation index is determined by the following formula:
Figure SMS_4
In this formula,
Figure SMS_5
indicates the violation probability corresponding to the nth target matching result, and
Figure SMS_6
indicates the violation index.
Returning to fig. 1, S130, determining a risk assessment result corresponding to the bidding text to be detected by using the violation index.
The risk assessment result comprises a risk level and a plurality of pieces of violation information, wherein the violation information comprises a violation category, a violation rule and a violation case.
In a preferred embodiment, the risk level and pieces of violation information are determined by:
A target index range interval to which the violation index belongs is determined; a target risk level corresponding to the bidding text to be detected is determined according to the correspondence between index range intervals and risk levels; a plurality of target keyword entries involved in the plurality of target matching results are determined; and the violation information corresponding to each target keyword entry is acquired from the preset violation word library.
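The mapping from violation index to risk assessment result can be sketched as an interval lookup. The source gives no concrete index range intervals or level names, so the thresholds 0.3 and 0.6, the labels, and the layout of the library records below are assumptions chosen only for illustration.

```python
from bisect import bisect_right

# Hypothetical interval edges and labels; the source does not specify them.
BOUNDS = [0.3, 0.6]               # intervals: below 0.3, 0.3 up to 0.6, 0.6 and above
LEVELS = ["low", "medium", "high"]

def risk_level(violation_index):
    """Return the risk level of the interval that contains the violation index."""
    return LEVELS[bisect_right(BOUNDS, violation_index)]

def assess(violation_index, target_entries, library):
    """Combine the risk level with the violation information of each matched entry."""
    return {
        "risk_level": risk_level(violation_index),
        "violations": [library[entry] for entry in target_entries],
    }

# Hypothetical library record holding the three kinds of violation information.
LIBRARY = {
    "entry-1": {"category": "regional restriction",
                "rule": "placeholder rule text",
                "case": "placeholder case text"},
}
result = assess(0.72, ["entry-1"], LIBRARY)
```

Keeping the interval edges in a sorted list means the lookup stays a single binary search however many risk levels are configured.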
Based on the same inventive concept, an embodiment of the present application further provides a violation detection device for bidding text corresponding to the violation detection method provided in the above embodiments. Because the principle by which the device solves the problem is similar to that of the violation detection method in the embodiments of the present application, the implementation of the device may refer to the implementation of the method, and repeated description is omitted.
Referring to fig. 4, fig. 4 is a schematic structural diagram of a violation detection device for a bidding document according to an embodiment of the present application. As shown in fig. 4, the above-mentioned violation detection device includes:
The obtaining module 200 is configured to obtain the bidding text to be detected.
The disassembly module 210 is configured to disassemble sentences of the to-be-detected bid text according to a text format corresponding to the to-be-detected bid text, so as to obtain a plurality of project sentences and a plurality of phrases.
The violation detection module 220 is configured to perform violation matching processing on the multiple project sentences and the multiple phrases by using multiple keyword entries in a preset violation word bank, so as to obtain a violation index corresponding to the to-be-detected bid text.
The risk assessment module 230 is configured to determine a risk assessment result corresponding to the bid text to be detected by using the violation index.
Preferably, the disassembly module 210 is further configured to: removing the form in the bidding text to be detected to obtain the processed bidding text to be detected; and disassembling the processed bidding text to be detected according to the text format to obtain a plurality of project sentences and a plurality of phrases.
Preferably, the text format includes a document format and a crawler format, wherein the disassembly module 210 is further configured to: if the text format corresponding to the to-be-detected bidding text is the document format, the processed to-be-detected bidding text is directly disassembled according to periods, semicolons and commas to obtain a plurality of project sentences and a plurality of phrases; and if the text format corresponding to the to-be-detected bid text is a crawler format, the processed to-be-detected bid text is disassembled by sequentially utilizing </table >, the tag and the space character to obtain a plurality of paragraphs, and the paragraphs are segmented by using periods, semicolons and commas to obtain a plurality of project sentences and a plurality of phrases.
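For the document format, the disassembly can be sketched with regular-expression splitting. The source does not state which punctuation marks delimit project sentences and which delimit phrases, so this sketch assumes that periods and semicolons end a project sentence and that commas further divide a sentence into phrases; the crawler-format paragraph splitting is not shown. The names `disassemble`, `SENTENCE_DELIMS` and `PHRASE_DELIMS` are hypothetical.

```python
import re

# Assumed roles of the delimiters (both full-width and ASCII forms are accepted).
SENTENCE_DELIMS = r"[。；;.]"   # periods and semicolons end a project sentence
PHRASE_DELIMS = r"[，,]"        # commas divide a sentence into phrases

def disassemble(text):
    """Split text into project sentences and, within them, into phrases."""
    sentences = [s.strip() for s in re.split(SENTENCE_DELIMS, text) if s.strip()]
    phrases = [
        p.strip()
        for sentence in sentences
        for p in re.split(PHRASE_DELIMS, sentence)
        if p.strip()
    ]
    return sentences, phrases

sentences, phrases = disassemble(
    "Bidders must be registered locally, with three months of tax records; "
    "joint bids are not accepted."
)
```

A production splitter would need to protect periods inside numbers and abbreviations, which this sketch does not attempt.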
Preferably, the keyword entry includes a plurality of keywords having a combinational logic relationship with each other, wherein the violation detection module 220 is further configured to: aiming at each keyword entry, recombining a plurality of keywords according to a combination logic relation among the keywords corresponding to the keyword entry to obtain a plurality of keyword groups corresponding to the keyword entry; constructing a word stock set by a plurality of keywords corresponding to each keyword entry; screening a plurality of key phrases formed by a preset illegal word bank by using the word bank set to obtain a plurality of key phrases to be matched; and carrying out violation matching processing on the plurality of project sentences and/or the plurality of phrases by utilizing the plurality of key phrases to be matched to obtain the violation index corresponding to the bid text to be detected.
Preferably, the plurality of keyword groups includes a first keyword group and a second keyword group, and the combination logic relationship includes a logical OR relationship, a first logical AND relationship and a second logical AND relationship. The logical OR relationship indicates that two keywords are alternatives; the first logical AND relationship indicates that two keywords are in a parallel relationship and their order is variable; the second logical AND relationship indicates that two keywords are in a parallel relationship and their order is not variable. The violation detection module 220 is further configured to: recombine the plurality of keywords in the keyword entry according to the logical OR relationship, the first logical AND relationship and/or the second logical AND relationship among the keywords in the keyword entry, to obtain a plurality of first keyword groups; and perform synonym replacement and addition on the first keyword groups according to a plurality of synonyms recorded in the preset violation word library, to obtain a plurality of second keyword groups.
Preferably, the violation detection module 220 is further configured to: determining a plurality of word lengths related to all keywords in the word stock set; aiming at each word length, segmenting a plurality of item sentences according to the word length to obtain a word set corresponding to the word length; for each word set, determining an intersection between the word set and a word stock set; determining a plurality of target keywords from the word stock set by corresponding intersections of the word sets; and determining the keyword group corresponding to each target keyword as a keyword group to be matched according to each target keyword.
Preferably, the violation detection module 220 is further configured to: for each keyword group to be matched, perform violation matching processing on the plurality of project sentences or the plurality of phrases according to the matching mode, the forbidden word group and the forbidden-extraction word group recorded in the preset violation word library for the keyword group to be matched, to obtain a matching result between the keyword group to be matched and each project sentence or each phrase, wherein the forbidden word group comprises a plurality of forbidden words, and the forbidden-extraction word group comprises a plurality of forbidden-extraction words (words from which a keyword must not be extracted); and determine the violation index corresponding to the bidding text to be detected according to the plurality of matching results corresponding to each keyword group to be matched.
The matching modes include item sentence matching and phrase matching. Preferably, the violation detection module 220 is further configured to: if the matching mode of the keyword group to be matched is item sentence matching, judge, for each item sentence, whether the keyword group to be matched exists in the item sentence; if the keyword group to be matched exists in the item sentence, and none of the forbidden words and forbidden-extraction words (words from which a keyword must not be extracted) corresponding to the keyword group to be matched exists in the item sentence, determine that the item sentence and the keyword group to be matched are successfully matched; and if any forbidden word or forbidden-extraction word corresponding to the keyword group to be matched exists in the item sentence, determine that the match between the item sentence and the keyword group to be matched has failed. If the matching mode of the keyword group to be matched is phrase matching, the module judges, for each phrase, whether the keyword group to be matched exists in the phrase; if the keyword group to be matched exists in the phrase, and none of the forbidden words and forbidden-extraction words corresponding to the keyword group to be matched exists in the phrase, it determines that the phrase and the keyword group to be matched are successfully matched; and if any forbidden word or forbidden-extraction word corresponding to the keyword group to be matched exists in the phrase, it determines that the match between the phrase and the keyword group to be matched has failed.
Preferably, the violation detection module 220 is further configured to: for each matching result, if the matching result indicates that the matching is successful, determining the matching result as a target matching result; aiming at each target matching result, determining the violation probability corresponding to the keyword item of the keyword group to be matched corresponding to the target matching result as the violation probability corresponding to the target matching result; determining a violation index corresponding to the bidding text to be detected according to the violation probability corresponding to each target matching result;
preferably, the violation index is determined by the following formula:
Figure SMS_7
In this formula,
Figure SMS_8
indicates the violation probability corresponding to the nth target matching result, and
Figure SMS_9
indicates the violation index.
Preferably, the risk assessment result includes a risk level and a plurality of pieces of violation information, the violation information including a violation category, a violation of a rule, and a violation case, and the risk assessment module 230 is further configured to: determining a target index range interval to which the violation index belongs; determining a target risk level corresponding to the bidding text to be detected according to the corresponding relation between the index range intervals and the risk levels; and determining a plurality of target keyword entries related to a plurality of target matching results, and acquiring violation information corresponding to each target keyword entry from a preset violation word library.
Referring to fig. 5, fig. 5 shows a schematic structural diagram of an electronic device according to an embodiment of the present application. The electronic device 300 includes a processor 310, a memory 320 and a bus 330. The memory 320 stores machine-readable instructions executable by the processor 310. When the electronic device 300 is running, the processor 310 and the memory 320 communicate through the bus 330, and the machine-readable instructions are executed by the processor 310 to perform the steps of the method for detecting violations of a bidding document provided in the above embodiments.
Based on the same application concept, the embodiment of the application further provides a computer readable storage medium, and the computer readable storage medium stores a computer program, and the computer program is executed by a processor to execute the steps of the violation detection method of the bidding document provided by the embodiment.
It will be clear to those skilled in the art that, for convenience and brevity of description, specific working procedures of the above-described system and apparatus may refer to corresponding procedures in the foregoing method embodiments, which are not described herein again. In the several embodiments provided in this application, it should be understood that the disclosed systems, devices, and methods may be implemented in other manners. The above-described apparatus embodiments are merely illustrative, for example, the division of the units is merely a logical function division, and there may be other manners of division in actual implementation, and for example, multiple units or components may be combined or integrated into another system, or some features may be omitted, or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be through some communication interface, device or unit indirect coupling or communication connection, which may be in electrical, mechanical or other form.
The units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in each embodiment of the present application may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a non-volatile computer readable storage medium executable by a processor. Based on such understanding, the technical solutions of the present application may be embodied in essence or a part contributing to the prior art or a part of the technical solutions, or in the form of a software product, which is stored in a storage medium and includes several instructions to cause a computer device (which may be a personal computer, a server, or a network device, etc.) to perform all or part of the steps of the methods described in the embodiments of the present application. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a random access Memory (Random Access Memory, RAM), a magnetic disk, or an optical disk, or other various media capable of storing program codes.
The foregoing is merely a specific embodiment of the present application, but the protection scope of the present application is not limited thereto, and any person skilled in the art can easily think about changes or substitutions within the technical scope of the present application, and the changes or substitutions are covered in the protection scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (12)

1. A method for detecting violations of a bidding document, the method comprising:
acquiring a bidding text to be detected;
performing sentence decomposition on the to-be-detected bid text according to a text format corresponding to the to-be-detected bid text to obtain a plurality of project sentences and a plurality of phrases;
performing violation matching processing on the plurality of project sentences and the plurality of phrases by utilizing a plurality of keyword entries in a preset violation word library to obtain a violation index corresponding to the to-be-detected bid text;
and determining a risk assessment result corresponding to the bid text to be detected by utilizing the violation index.
2. The method of claim 1, wherein the plurality of item sentences and the plurality of phrases corresponding to the bid text to be detected are obtained by:
removing the form in the to-be-detected bidding text to obtain a processed to-be-detected bidding text;
and disassembling the processed bidding text to be detected according to the text format to obtain the plurality of project sentences and the plurality of phrases.
3. The method of claim 2, wherein the text format comprises a document format and a crawler format,
the step of disassembling the processed bidding text to be detected according to the text format to obtain a plurality of project sentences and a plurality of phrases comprises the following steps:
if the text format corresponding to the to-be-detected bidding text is a document format, the processed to-be-detected bidding text is directly disassembled according to periods, semicolons and commas to obtain a plurality of project sentences and a plurality of phrases;
and if the text format corresponding to the to-be-detected bid text is a crawler format, the processed to-be-detected bid text is disassembled by sequentially utilizing a table, a tag and a space character to obtain a plurality of paragraphs, and the paragraphs are segmented by using periods, semicolons and commas to obtain a plurality of item sentences and a plurality of phrases.
4. The method of claim 1, wherein the keyword entry comprises a plurality of keywords having a combinational logic relationship with each other,
the step of obtaining the violation index corresponding to the to-be-detected bid text comprises the following steps of:
For each keyword entry, recombining a plurality of keywords according to a combination logic relation among the keywords corresponding to the keyword entry to obtain a plurality of keyword groups corresponding to the keyword entry;
constructing a word stock set by a plurality of keywords corresponding to each keyword entry;
screening the plurality of key phrases formed by the preset violation word bank by utilizing the word bank set to obtain a plurality of key phrases to be matched;
and carrying out illegal matching processing on the plurality of project sentences and/or the plurality of phrases by utilizing the plurality of key word groups to be matched to obtain the illegal index corresponding to the to-be-detected bid text.
5. The method of claim 4, wherein the plurality of keyword groups comprises a first keyword group and a second keyword group, the combined logical relationship comprises a logical OR relationship, a first logical AND relationship and a second logical AND relationship, the logical OR relationship represents a logical OR relationship between two keywords, the first logical AND relationship represents that two keywords are in parallel relationship and in a variable order, the second logical AND relationship represents that two words are in parallel relationship and in an invariable order,
wherein, a plurality of key phrases corresponding to each key word entry is determined by:
According to the or relation, the first logic and relation and/or the second logic and relation among the keywords in the keyword entries, a plurality of keywords in the keyword entries are recombined to obtain a plurality of first keyword groups;
and carrying out synonym replacement and addition on the first keyword groups according to a plurality of synonyms recorded in the preset violation word library so as to obtain a plurality of second keyword groups.
6. The method of claim 4, wherein the step of screening the plurality of keyword groups formed by the preset violation word bank by using the word bank set to obtain a plurality of keyword groups to be matched comprises:
determining a plurality of word lengths involved by all keywords in the word stock set;
aiming at each word length, segmenting the plurality of item sentences according to the word length to obtain a word set corresponding to the word length;
for each word set, determining an intersection between the word set and the word stock set;
determining a plurality of target keywords from the word stock set by the intersection corresponding to each word set;
and determining the keyword group corresponding to each target keyword as a keyword group to be matched according to each target keyword.
7. The method of claim 4, wherein the step of obtaining the violation index corresponding to the bid text to be detected by performing the violation matching processing on the plurality of item sentences and/or the plurality of phrases by using the plurality of key phrases to be matched comprises:
for each keyword group to be matched, performing violation matching processing on the plurality of project sentences or the plurality of phrases according to a matching mode, a forbidden word group and a forbidden-extraction word group recorded in the preset violation word library for the keyword group to be matched, to obtain a matching result between the keyword group to be matched and each project sentence or each phrase, wherein the forbidden word group comprises a plurality of forbidden words, and the forbidden-extraction word group comprises a plurality of forbidden-extraction words, a forbidden-extraction word being a word from which a keyword must not be extracted;
and determining the violation index corresponding to the bidding text to be detected according to a plurality of matching results corresponding to each keyword group to be matched.
8. The method of claim 7, wherein the matching means comprises item sentence matching and phrase matching,
the method comprises the following steps of determining a plurality of matching results corresponding to each keyword group to be matched:
if the matching mode of the keyword group to be matched is item sentence matching, judging, for each item sentence, whether the keyword group to be matched exists in the item sentence; if the keyword group to be matched exists in the item sentence, and none of the forbidden words and forbidden-extraction words (words from which a keyword must not be extracted) corresponding to the keyword group to be matched exists in the item sentence, determining that the item sentence and the keyword group to be matched are successfully matched; and if any forbidden word or forbidden-extraction word corresponding to the keyword group to be matched exists in the item sentence, determining that the match between the item sentence and the keyword group to be matched has failed;
if the matching mode of the keyword group to be matched is phrase matching, judging, for each phrase, whether the keyword group to be matched exists in the phrase; if the keyword group to be matched exists in the phrase, and none of the forbidden words and forbidden-extraction words (words from which a keyword must not be extracted) corresponding to the keyword group to be matched exists in the phrase, determining that the phrase and the keyword group to be matched are successfully matched; and if any forbidden word or forbidden-extraction word corresponding to the keyword group to be matched exists in the phrase, determining that the match between the phrase and the keyword group to be matched has failed.
9. The method of claim 4, wherein the violation index corresponding to the bid text to be detected is determined by:
for each matching result, if the matching result indicates a successful match, determining the matching result as a target matching result;
for each target matching result, determining the violation probability of the keyword entry containing the keyword group to be matched that corresponds to the target matching result as the violation probability corresponding to the target matching result;
determining the violation index corresponding to the bidding text to be detected according to the violation probability corresponding to each target matching result;
wherein the violation index is determined by the following formula:
[Formula: Figure QLYQS_1]
where the symbol in Figure QLYQS_2 denotes the violation probability corresponding to the n-th target matching result, and the symbol in Figure QLYQS_3 denotes the violation index.
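The selection steps of claim 9 can be sketched as follows. The function name target_probabilities and the representation of a matching result as a (matched, violation_probability) pair are assumptions made for illustration. The claimed formula exists only as an image in this text, so the sketch stops at collecting the probabilities that the formula would take as input and does not guess at the aggregation itself.

```python
def target_probabilities(results):
    """Keep the violation probabilities of successful (target) matching results.

    `results` is a hypothetical list of (matched, violation_probability) pairs,
    where the probability is the one recorded for the keyword entry involved.
    """
    return [probability for matched, probability in results if matched]


matching_results = [(True, 0.8), (False, 0.5), (True, 0.3)]
probabilities = target_probabilities(matching_results)
# `probabilities` would then be passed to the claimed formula to obtain the index.
```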
10. The method of claim 9, wherein the risk assessment result includes a risk level and a plurality of pieces of violation information, each piece of violation information including a violation category, a violated regulation, and a violation case,
wherein the risk level and the plurality of pieces of violation information are determined by:
determining the target index range interval to which the violation index belongs;
determining the target risk level corresponding to the bidding text to be detected according to the correspondence between index range intervals and risk levels;
and determining a plurality of target keyword entries involved in the plurality of target matching results, and acquiring the violation information corresponding to each target keyword entry from the preset violation word library.
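The interval lookup and the retrieval of violation information in claim 10 can be sketched as below. The thresholds, level names, library contents, and the function names risk_level and collect_violation_info are all hypothetical; the claim does not disclose concrete intervals or data. Intervals are assumed to be half-open, so an index equal to a threshold falls into the higher level.

```python
from bisect import bisect_right

# Hypothetical interval boundaries and levels: [.., 0.3) low, [0.3, 0.7) medium, [0.7, ..) high.
THRESHOLDS = [0.3, 0.7]
LEVELS = ["low", "medium", "high"]

# Hypothetical violation word library: keyword entry -> violation information.
VIOLATION_LIBRARY = {
    "designated brand": {
        "category": "example category",
        "regulation": "example regulation",
        "case": "example case",
    },
}


def risk_level(violation_index: float) -> str:
    """Map the violation index to the risk level of the interval it falls in."""
    return LEVELS[bisect_right(THRESHOLDS, violation_index)]


def collect_violation_info(target_entries: list) -> list:
    """Look up violation information for each distinct target keyword entry."""
    distinct = dict.fromkeys(target_entries)  # removes repeats, keeps order
    return [VIOLATION_LIBRARY[key] for key in distinct if key in VIOLATION_LIBRARY]


level = risk_level(0.45)
info = collect_violation_info(["designated brand", "designated brand"])
```

A sorted threshold list with bisect keeps the lookup compact; a deployment with closed or overlapping intervals would need a different comparison.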
11. A violation detection device for bidding text, the device comprising:
an acquisition module, configured to acquire the bidding text to be detected;
a disassembly module, configured to perform sentence disassembly on the bidding text to be detected according to the text format corresponding to the bidding text to be detected, to obtain a plurality of project sentences and a plurality of phrases;
a violation detection module, configured to perform violation matching on the plurality of project sentences and the plurality of phrases by using a plurality of keyword entries in a preset violation word library, to obtain the violation index corresponding to the bidding text to be detected;
and a risk assessment module, configured to determine the risk assessment result corresponding to the bidding text to be detected by using the violation index.
12. An electronic device, comprising: a processor, a memory, and a bus, the memory storing machine-readable instructions executable by the processor, wherein, when the electronic device is running, the processor and the memory communicate via the bus, and the machine-readable instructions, when executed by the processor, perform the steps of the violation detection method for bidding text according to any one of claims 1 to 10.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310587277.8A CN116306621B (en) 2023-05-24 2023-05-24 Violation detection method and device for bidding text and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310587277.8A CN116306621B (en) 2023-05-24 2023-05-24 Violation detection method and device for bidding text and electronic equipment

Publications (2)

Publication Number Publication Date
CN116306621A true CN116306621A (en) 2023-06-23
CN116306621B CN116306621B (en) 2023-08-04

Family

ID=86783655

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310587277.8A Active CN116306621B (en) 2023-05-24 2023-05-24 Violation detection method and device for bidding text and electronic equipment

Country Status (1)

Country Link
CN (1) CN116306621B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117787800A (en) * 2023-12-29 2024-03-29 北京中水卓越认证有限公司 Authentication management system based on engineering construction

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105740302A (en) * 2014-12-12 2016-07-06 北京海尔广科数字技术有限公司 Screening method and system for demand information
CN108319582A (en) * 2017-12-29 2018-07-24 北京城市网邻信息技术有限公司 Processing method, device and the server of text message
CN110909118A (en) * 2018-08-28 2020-03-24 中国移动通信集团重庆有限公司 Method, apparatus, device and medium for screening information
CN111738011A (en) * 2020-05-09 2020-10-02 完美世界(北京)软件科技发展有限公司 Illegal text recognition method and device, storage medium and electronic device
CN112699645A (en) * 2021-03-25 2021-04-23 北京健康之家科技有限公司 Corpus labeling method, apparatus and device
US20220318286A1 (en) * 2020-02-24 2022-10-06 Boe Technology Group Co., Ltd. Data updating method and apparatus, electronic device and computer readable storage medium

Also Published As

Publication number Publication date
CN116306621B (en) 2023-08-04

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant