CN113761880B - Data processing method for text verification, electronic equipment and storage medium - Google Patents
Data processing method for text verification, electronic equipment and storage medium
- Publication number
- CN113761880B CN113761880B CN202111310983.5A CN202111310983A CN113761880B CN 113761880 B CN113761880 B CN 113761880B CN 202111310983 A CN202111310983 A CN 202111310983A CN 113761880 B CN113761880 B CN 113761880B
- Authority
- CN
- China
- Prior art keywords
- text
- data
- target
- list
- verification
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Links
Images
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/205—Parsing
- G06F40/226—Validation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/31—Indexing; Data structures therefor; Storage structures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/36—Creation of semantic tools, e.g. ontology or thesauri
- G06F16/367—Ontology
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- General Health & Medical Sciences (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Artificial Intelligence (AREA)
- Life Sciences & Earth Sciences (AREA)
- Animal Behavior & Ethology (AREA)
- Software Systems (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention relates to a data processing method for text verification, an electronic device, and a storage medium. The method comprises the following steps: obtaining a sample text list from a text database; when a keyword consistent with any preset keyword in a preset keyword list exists in a sample text, marking the position of that keyword as a designated start position and the end position of the sample text as a designated end position, taking the speech segment between the designated start position and the designated end position as a target speech segment, and taking the sample texts in which a target speech segment exists as training set data to construct a training set; inputting the training set into a preset language model for training to obtain a trained language model; and acquiring a knowledge graph of a target text through the trained language model so as to compare the knowledge graph with preset verification data. The method and the device can improve the accuracy and efficiency of comparing structured text data with semi-structured text data.
Description
Technical Field
The invention relates to the technical field of data processing, in particular to a data processing method for text verification, electronic equipment and a storage medium.
Background
In the prior art, text data is divided into three types: structured text data, random text data and semi-structured text data. In structured text data, the text data at a specific position has a specific meaning and is easy to convert into a table structure in a relational database, such as text data in csv format, invoice text data after OCR processing, or settlement statement data in a specific field such as the electric power system. In random text data, the text data at each position has a random meaning, for example the text of literary works such as news, novels and prose spread on the Internet. Semi-structured text data is intermediate between structured text data and random text data: text data at a specific position may have a specific meaning but is difficult to convert into a table structure in a relational database, for example settlement terms in a contract in a specific field such as the electric power system.
In some application scenarios, especially in a settlement auditing scenario of an electric power system, structured text data and semi-structured text data need to be compared, that is, whether the structured data in a settlement document meets the requirements of the semi-structured settlement terms in a contract or not is judged, but because the semi-structured text data is difficult to be converted into a table structure of a relational database, the efficiency and accuracy of data comparison are low due to the fact that the semi-structured text data is compared in a manual mode in the prior art, and the data verification process is affected.
Disclosure of Invention
In order to solve the above technical problems, the present application adopts the following technical solution: a data processing method for text verification, an electronic device, and a storage medium, wherein the method comprises the steps of:
S100, acquiring m first texts from a first text set of a text database as sample texts, and constructing a sample text list A = (A1, A2, A3, ……, Am), where Ai is the i-th sample text, i = 1 …… m; when a keyword consistent with any preset keyword in a preset keyword list exists in Ai, marking the position of that keyword in Ai as a designated start position and the end position of Ai as a designated end position, taking the speech segment between the designated start position and the designated end position as the target speech segment of Ai, and taking each Ai in which a target speech segment exists as training set data to construct a training set;
s200, inputting the training set into a preset language model for training to obtain a trained language model;
S300, obtaining a target text, inputting the target text into the trained language model, and obtaining a target data list B = (B1, B2, B3, ……, Bn) corresponding to the target text, where Bj is the j-th target data, j = 1 …… n, and n is the number of target data; inserting each Bj in B into a number of preset triple frameworks to acquire a target knowledge graph corresponding to the target text;
s400, acquiring a text ID of a target text, acquiring all verification data corresponding to the text ID of the target text from a verification data list according to the text ID of the target text, and constructing a first intermediate data list by taking each verification data as first intermediate data;
S500, traversing the target knowledge graph, and when any target data in the target knowledge graph is inconsistent with the corresponding first intermediate data in the first intermediate data list, replacing the first intermediate data with the corresponding target data.
The present invention also provides a non-transitory computer-readable storage medium that can be configured in an electronic device to store at least one instruction or at least one program for implementing a method of the method embodiments, where the at least one instruction or the at least one program is loaded by a processor and executed to implement the method provided by the above embodiments.
The invention also provides an electronic device comprising a processor and the aforementioned non-transitory computer-readable storage medium.
Compared with the prior art, the invention has obvious advantages and beneficial effects. Through the above technical solution, the data processing method for text verification achieves considerable technical progress and practicability, has wide industrial utilization value, and at least has the following advantages:
the method comprises the steps of obtaining a sample text list, marking a keyword position of the sample text as a designated initial position and marking an end position of the sample text as a designated end position when a keyword consistent with any preset keyword in the preset keyword list exists in the sample text, taking a speech segment between the designated initial position and the designated end position as a target speech segment of the sample text, and taking the sample text based on the target speech segment as training set data to construct a training set; inputting the training set into a preset language model for training to obtain a trained language model;
the language model is optimized, the target language segment capable of extracting the data with the specific meaning can be accurately and efficiently determined, the extraction of the full text data and the interference of other data are reduced, and the comparison of the data in the text is facilitated;
meanwhile, inputting a target text into a trained language model, acquiring a characteristic value list corresponding to the target text, and acquiring a target knowledge graph corresponding to the target text by constructing each characteristic value by a plurality of preset triples; the data in the semi-structured text can be stored in a knowledge map form, the storage mode is optimized, the comparison of the data in the text is facilitated, and the efficiency and the accuracy of the verification of the structured text data and the semi-structured text data are improved.
The foregoing description is only an overview of the technical solutions of the present invention, and in order to make the technical means of the present invention more clearly understood, the present invention may be implemented in accordance with the content of the description, and in order to make the above and other objects, features, and advantages of the present invention more clearly understood, the following preferred embodiments are described in detail with reference to the accompanying drawings.
Drawings
Fig. 1 is a flowchart of a data processing method for text verification according to an embodiment of the present invention.
Detailed Description
To further illustrate the technical means and effects of the present invention adopted to achieve the predetermined objects, the following detailed description will be given with reference to the accompanying drawings and preferred embodiments of a data processing method, an electronic device and a storage medium for text verification according to the present invention.
The embodiment of the invention provides a data processing method for text verification, which comprises the following steps, as shown in Fig. 1:
S100, acquiring m first texts from a first text set of a text database as sample texts, and constructing a sample text list A = (A1, A2, A3, ……, Am), where Ai is the i-th sample text, i = 1 …… m; when a keyword consistent with any preset keyword in a preset keyword list exists in Ai, marking the position of that keyword in Ai as a designated start position and the end position of Ai as a designated end position, taking the speech segment between the designated start position and the designated end position as the target speech segment of Ai, and taking each Ai in which a target speech segment exists as training set data to construct a training set.
Specifically, the method further includes the following steps before the step S100:
the text types of all the first texts are obtained, and the first texts of the same type are classified according to preset text division rules to construct a plurality of first text sets.
Preferably, the text division rule refers to a preset rule for dividing the text by the text type of the first text, where the text type of the first text is, for example, a purchase text, a statistical text, or an order text.
Specifically, the first text is a text storing semi-structured text data, and all sample texts in A, constructed based on the first text set, are texts of the same type, which facilitates training the preset language model, improves the accuracy of model training, and thereby improves the accuracy and efficiency of comparison between structured text data and semi-structured text data.
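As a non-authoritative illustration, the per-type grouping described above (one first text set per text type) could be sketched as follows; the field name `text_type` and the dict representation are assumptions for illustration, not taken from the patent:

```python
from collections import defaultdict

def build_first_text_sets(first_texts: list[dict]) -> dict[str, list[dict]]:
    """Apply a preset text division rule: collect first texts of the same
    text type (e.g. purchase, statistical, order) into one first text set."""
    sets: dict[str, list[dict]] = defaultdict(list)
    for text in first_texts:
        sets[text["text_type"]].append(text)
    return dict(sets)
```

Each resulting set can then be used independently to construct a sample text list A of a single text type.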
Specifically, in step S100, the keywords in Ai are determined by a natural language processing method; extracting keywords from the sample text makes it possible to determine the speech segments from which key data can be obtained, improving the accuracy and efficiency of comparing structured text data with semi-structured text data.
Preferably, the preset keyword list is a preconfigured keyword list whose fields include keywords corresponding to the text type of each first text, which can be understood as follows: in step S100, Ai is traversed, and according to the text type of Ai, all preset keywords corresponding to that text type are obtained from the preset keyword list as target keywords; comparing the keywords in Ai with all target keywords makes it easier to determine the speech segment from which key data can be obtained, improving the accuracy and efficiency of comparing structured text data with semi-structured text data.
Specifically, the key data refers to data with a local special meaning in the sample text, and the special meaning needs to be determined according to the text type, which is not described herein again.
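The marking logic of step S100 can be sketched as below. This is a minimal illustration under assumptions: plain substring matching stands in for the natural language processing method mentioned above, and all names are invented for the example.

```python
def mark_target_segment(sample_text: str, preset_keywords: list[str]):
    """Step S100 per sample: find the first occurrence of any preset keyword;
    that keyword position is the designated start position, and the end of
    the sample text is the designated end position."""
    hits = [p for p in (sample_text.find(kw) for kw in preset_keywords) if p != -1]
    if not hits:
        return None  # no target speech segment: sample yields no training data
    return sample_text[min(hits):]  # segment from start position to text end

def build_training_set(samples: list[str], preset_keywords: list[str]) -> list[str]:
    """Keep only the samples in which a target speech segment exists."""
    return [seg for seg in (mark_target_segment(s, preset_keywords) for s in samples)
            if seg is not None]
```

For example, with keyword "price", the sample "Settlement price: 0.45 yuan/kWh" contributes the segment "price: 0.45 yuan/kWh", while a sample without any preset keyword is dropped from the training set.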
S200, inputting the training set into a preset language model for training to obtain a trained language model.
Specifically, the step S200 further includes the steps of:
S201, inputting each Ai in the training set into a preset language model to obtain the key data corresponding to Ai, and constructing a key data list Si from the key data; in this embodiment, any method in the art by which a language model obtains feature values may be adopted, which is not described herein again;
s203, obtaining AiCorresponding text ID, and according to AiCorresponding text ID, obtaining A from the verification data listiConstructing a second intermediate data list by using all the verification data of the corresponding text ID as second intermediate data;
S205, traversing the key data list corresponding to Ai, and determining the probability value F of A according to the key data list corresponding to Ai and the second intermediate data list corresponding to Ai, wherein F meets the following condition:
F = (1/m) × ∑(i=1 to m) (Si − Si′)/Si,
wherein Si is the number of key data in the key data list corresponding to Ai, and Si′ is the number of data in the key data list corresponding to Ai that are inconsistent with the corresponding second intermediate data in the second intermediate data list;
s207, traversing A, and obtaining a trained language model when F is larger than or equal to a preset probability threshold;
S209, when F is smaller than the preset probability threshold, re-acquiring a sample text list A′ and iterating according to A′ until F is greater than or equal to the preset probability threshold, to obtain a trained language model, wherein each iteration executes the processing of step S100 on A′ and then re-acquires the corresponding probability, which is not described herein again.
Further, the text ID refers to a unique identification for identifying the text.
Preferably, the language model is a Bert model.
Preferably, in the step S209, A′ may contain the same sample texts as A, which can be further understood as: the sample text list A′ re-acquired when retraining the language model must be of the same text type as A, and A′ includes the sample texts whose corresponding probability Fi is smaller than the preset probability threshold and does not include the sample texts whose corresponding probability Fi is greater than or equal to the preset probability threshold, wherein Fi meets the following condition: Fi = (Si − Si′)/Si.
further, the probability threshold range is 90-98%, preferably, the probability threshold is 90%.
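Steps S201 through S209 can be sketched as the following hedged example. The per-sample probability Fi = (Si − Si′)/Si and its mean F follow the definitions above; the dict-based data shapes and function names are assumptions for illustration only.

```python
def sample_probability(key_data: dict[str, str], verification: dict[str, str]) -> float:
    """Fi for one sample: fraction of extracted key data that is consistent
    with the corresponding second intermediate data (same field names)."""
    total = len(key_data)  # Si: number of key data in the key data list
    if total == 0:
        return 0.0
    mismatched = sum(1 for field, value in key_data.items()
                     if verification.get(field) != value)  # Si'
    return (total - mismatched) / total

def model_probability(per_sample: list[float]) -> float:
    """F: mean of the per-sample probabilities over the m sample texts."""
    return sum(per_sample) / len(per_sample)

def needs_retraining(f_value: float, threshold: float = 0.90) -> bool:
    """Step S209: while F is below the preset probability threshold
    (90% in the preferred embodiment), re-acquire A' and iterate."""
    return f_value < threshold
```

A sample whose two key data fields match one of two verification fields yields Fi = 0.5; a training set with per-sample probabilities [1.0, 0.5] yields F = 0.75, which is below the 0.90 threshold, so retraining would be triggered.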
In another specific embodiment, the method comprises the following steps:
obtaining the same sample text list A without marking target speech segments, inputting each Ai in the training set into a preset language model to obtain the key data corresponding to Ai, and constructing a key data list;
obtaining the text ID corresponding to Ai, and according to that text ID, obtaining all the verification data of the corresponding text ID from the verification data list as second intermediate data to construct a second intermediate data list;
traversing the key data list corresponding to Ai, and determining the probability value F′ of A according to the key data list corresponding to Ai and the second intermediate data list corresponding to Ai.
A large amount of experimental data obtained by the method of the above embodiment shows that, with the same sample text list, F′ is at least 10% lower than F; that is, the probability F′ corresponding to sample texts whose target speech segments are not marked is at least 10% lower than the probability F corresponding to sample texts whose target speech segments are marked. This further shows that determining the target speech segments in this implementation reduces extraction of full-text data and interference from other data, facilitating comparison of the data in the text.
S300, obtaining the eyesMarking a text and inputting the target text into a trained language model, and acquiring a target data list B = (B) corresponding to the target text1,B2,B3,……,Bn),BjJ =2 … … n, n is the target data number, and each B in B isjAnd acquiring a target knowledge graph corresponding to the target text by using a plurality of preset triple frameworks.
Specifically, the step S300 further includes the steps of:
all B arejInserting the target texts into each preset triple framework as entities to construct a plurality of knowledge graphs of the target texts, and inserting the maximum quantity B into the knowledge graphs of the target textsjThe target knowledge graph is understood as follows: each text type of the first text corresponds to a plurality of preset triad frameworks, and B is used for determining the type of the first textjThe constructed knowledge graph is used as a target knowledge graph, so that a suitable knowledge graph can be quickly constructed to store data, and meanwhile, comparison between the knowledge graph and verification data is facilitated, namely comparison between semi-structured text data and structured text data; the target data refers to data with special meaning in the target text, and the special meaning needs to be determined according to the text type, which is not described herein again.
Specifically, the target text refers to any first text in the text database other than the sample texts, and the target text is of the same text type as the sample texts in the training set used for training the language model, which can be understood as: the target text is consistent with the text types of all sample texts in A; meanwhile, the target text does not need to have a speech segment start position marked.
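The "insert target data into each framework, keep the fullest graph" rule of step S300 can be sketched as below. The triple frameworks, field names, and tuple representation are invented for illustration; the patent does not specify a concrete framework format.

```python
def fill_framework(framework: list[tuple[str, str, str]],
                   target_data: dict[str, str]) -> tuple[int, list[tuple[str, str, str]]]:
    """Insert target data Bj as entities into every slot of the triple
    framework whose field name matches; return the number of inserted
    entities together with the filled triples."""
    filled, inserted = [], 0
    for subj, rel, obj in framework:
        s = target_data.get(subj, subj)
        o = target_data.get(obj, obj)
        inserted += int(s != subj) + int(o != obj)
        filled.append((s, rel, o))
    return inserted, filled

def target_knowledge_graph(frameworks, target_data):
    """Step S300: the knowledge graph into which the largest number of
    Bj were inserted becomes the target knowledge graph."""
    return max((fill_framework(fw, target_data) for fw in frameworks),
               key=lambda pair: pair[0])[1]
```

With two candidate frameworks, one exposing a `price` slot and one exposing both `price` and `qty` slots, target data containing both fields fills the second framework more fully, so it is selected.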
S400, acquiring the text ID of the target text, acquiring all verification data corresponding to the text ID of the target text from a verification data list according to the text ID of the target text, and constructing a first intermediate data list by taking each verification data as first intermediate data.
Specifically, the step S400 further includes the steps of:
according to the text ID of the first text, a plurality of second texts corresponding to the text ID of the first text are obtained from a text database, all the second texts are preprocessed, designated data are obtained from the second texts and serve as verification data of the first text, a verification data list is constructed according to the verification data of all the first texts and the text ID of the first text, the second text is a text which records data corresponding to the data used for verifying the first text, and the second text is a structured text.
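A possible shape of the verification-data list described in step S400 is sketched below: designated data from the structured second texts are indexed by the text ID of the first text they verify. The field name `verifies_text_id` and the dict layout are assumptions for illustration.

```python
from collections import defaultdict

def build_verification_list(second_texts: list[dict]) -> dict[str, list[dict]]:
    """Group the designated data obtained from the (structured) second texts
    by the text ID of the (semi-structured) first text they verify."""
    index: dict[str, list[dict]] = defaultdict(list)
    for doc in second_texts:
        designated = {k: v for k, v in doc.items() if k != "verifies_text_id"}
        index[doc["verifies_text_id"]].append(designated)
    return dict(index)

def first_intermediate_list(verification_list, target_text_id):
    """Step S400: all verification data for the target text's ID become
    the first intermediate data list."""
    return verification_list.get(target_text_id, [])
```

Looking up the target text's ID then yields exactly the first intermediate data used for comparison in step S500.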
S500, traversing the target knowledge graph, and when any target data in the target knowledge graph is inconsistent with the corresponding first intermediate data in the first intermediate data list, replacing the first intermediate data with the corresponding target data.
Specifically, the step S500 further includes the steps of:
s501, traversing the target knowledge graph and acquiring target data corresponding to each entity in a target triple framework from the target knowledge graph, wherein the target triple framework in the step S501 refers to the triple framework corresponding to the target knowledge graph;
S502, according to the entity of the target triple structure, obtaining the first intermediate data corresponding to the entity from the first intermediate data list, which can be understood as: the entity in the target triple structure is a field name in the verification data list;
s503, comparing the target data with the corresponding first intermediate data;
and S505, when the target data is inconsistent with the corresponding first intermediate data, replacing the first intermediate data with the corresponding target data.
In the embodiment, the comparison of the structured data to the semi-structured data can be realized, and the efficiency and the accuracy of the verification of the structured data to the semi-structured data are improved.
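Steps S501 through S505 can be sketched as the following hedged example, where each entity in a triple names a field in the verification data list (as step S502 describes) and a mismatch replaces the first intermediate data with the target data (step S505). The triple and dict shapes are assumptions for illustration.

```python
def verify_and_replace(graph: list[tuple[str, str, str]],
                       first_intermediate: dict[str, str]) -> dict[str, str]:
    """Traverse the target knowledge graph; for each triple, the entity
    names a field in the first intermediate data list. When the target data
    (the triple's value) is inconsistent with the first intermediate data
    for that field, replace the first intermediate data with the target data."""
    updated = dict(first_intermediate)
    for entity, _relation, value in graph:
        if entity in updated and updated[entity] != value:
            updated[entity] = value  # step S505: replace on mismatch
    return updated
```

For instance, a graph asserting `price = 0.45` against verification data recording `price = 0.50` results in the verification entry being overwritten with 0.45, while consistent fields are left untouched.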
In this method, a sample text list is obtained; when a keyword consistent with any preset keyword in the preset keyword list exists in a sample text, the position of the keyword in the sample text is marked as a designated start position and the end position of the sample text is marked as a designated end position, the speech segment between the designated start position and the designated end position is taken as the target speech segment of the sample text, and the sample texts in which a target speech segment exists are taken as training set data to construct a training set; the training set is input into a preset language model for training to obtain a trained language model, so that the language model is optimized, target speech segments from which data with specific meanings can be extracted are determined accurately and efficiently, extraction of full-text data and interference from other data are reduced, and comparison of the data in the text is facilitated.
Meanwhile, the target text is input into the trained language model, the characteristic value list corresponding to the target text is obtained, each characteristic value is constructed by a plurality of preset triples, the target knowledge graph corresponding to the target text is obtained, data in the semi-structured text can be stored in the form of the knowledge graph, the storage mode is optimized, comparison of the data in the text is facilitated, and the efficiency and accuracy of data verification are improved.
Embodiments of the present application also provide a non-transitory computer-readable storage medium that can be disposed in an electronic device to store at least one instruction or at least one program for implementing a method of the method embodiments, where the at least one instruction or the at least one program is loaded into and executed by a processor to implement the method provided by the above embodiments.
Embodiments of the present application also provide an electronic device comprising a processor and the aforementioned non-transitory computer-readable storage medium.
Although the present invention has been described with reference to a preferred embodiment, it should be understood that various changes, substitutions and alterations can be made herein without departing from the spirit and scope of the invention as defined by the appended claims.
Claims (8)
1. A data processing method for text verification, the method comprising the steps of:
S100, acquiring m first texts from a first text set of a text database as sample texts, and constructing a sample text list A = (A1, A2, A3, ……, Am), where Ai is the i-th sample text, i = 1 …… m; when a keyword consistent with any preset keyword in a preset keyword list exists in Ai, marking the position of that keyword in Ai as a designated start position and the end position of Ai as a designated end position, taking the speech segment between the designated start position and the designated end position as the target speech segment of Ai, and taking each Ai in which a target speech segment exists as training set data to construct a training set, wherein the first text refers to a text storing semi-structured data;
s200, inputting the training set into a preset language model for training to obtain a trained language model, wherein the step S200 further comprises the following steps:
S201, inputting each Ai in the training set into a preset language model to obtain the key data corresponding to Ai, and constructing a key data list Si from the key data;
S203, obtaining AiCorresponding text ID, and according to AiCorresponding text ID, obtaining A from the verification data listiConstructing a second intermediate data list by using all the verification data of the corresponding text ID as second intermediate data;
S205, traversing the key data list corresponding to Ai, and determining the probability value F of A according to the key data list corresponding to Ai and the second intermediate data list corresponding to Ai, wherein F meets the following condition:
F = (1/m) × ∑(i=1 to m) (Si − Si′)/Si,
wherein Si is the number of key data in the key data list corresponding to Ai, and Si′ is the number of data in the key data list corresponding to Ai that are inconsistent with the corresponding second intermediate data in the second intermediate data list;
s207, traversing A, and obtaining a trained language model when F is larger than or equal to a preset probability threshold;
S209, when F is smaller than the preset probability threshold, re-acquiring a sample text list A′ and iterating according to A′ until F is greater than or equal to the preset probability threshold, to obtain a trained language model;
the step S209 includes:may have the same sample text as A, and need to be retrieved when the language model is retrainedIs the same text type as A, andincluding AiCorresponding probability FiSamples of < Preset probability thresholdText and not including AiCorresponding probability FiSample text that is greater than or equal to a preset probability threshold, wherein,
S300, obtaining a target text, inputting the target text into the trained language model, and obtaining a target data list B = (B1, B2, B3, ……, Bn) corresponding to the target text, where Bj is the j-th target data, j = 1 …… n, and n is the number of target data; inserting each Bj in B into a number of preset triple frameworks to acquire a target knowledge graph corresponding to the target text;
s400, acquiring a text ID of a target text, acquiring all verification data corresponding to the text ID of the target text from a verification data list according to the text ID of the target text, and constructing a first intermediate data list by taking each verification data as first intermediate data, wherein the target text refers to any first text except a sample text in a text database;
wherein, the step of S400 further comprises the following steps: according to the text ID of the first text, acquiring a plurality of second texts corresponding to the text ID of the first text from a text database, preprocessing all the second texts, acquiring designated data from the second texts to serve as verification data of the first text, and constructing a verification data list according to the verification data of all the first texts and the text ID of the first text, wherein the second text is a text recorded with data corresponding to the data for verifying the first text, and the second text is a structured text;
S500, traversing the target knowledge graph, and when any target data in the target knowledge graph is inconsistent with the corresponding first intermediate data in the first intermediate data list, replacing the first intermediate data with the corresponding target data.
2. The data processing method for text verification according to claim 1, wherein in step S100, the keywords in Ai are determined by a natural language processing method.
3. The data processing method for text verification according to claim 1, further comprising the following steps in the step S300:
all B arejInserting the target texts into each preset triple framework as entities to construct a plurality of knowledge graphs of the target texts, and inserting the maximum quantity B into the knowledge graphs of the target textsjThe target knowledge graph is used as the knowledge graph of (1).
4. The data processing method for text verification according to claim 1, wherein the target text refers to any first text in the text database except the sample text.
5. The data processing method for text verification according to claim 1, further comprising the following steps in the step S400:
according to the text ID of the first text, a plurality of second texts corresponding to the text ID of the first text are obtained from a text database, all the second texts are preprocessed, key data are extracted to serve as verification data of the first text, and a verification data list is constructed according to the verification data of all the first texts and the text ID of the first text.
6. The data processing method for text verification according to claim 5, wherein the second text is a text corresponding to the data recorded for verifying the first text.
7. A non-transitory computer readable storage medium having stored therein at least one instruction or at least one program, the at least one instruction or the at least one program being loaded and executed by a processor to implement the method of any of claims 1-6.
8. An electronic device comprising a processor and the non-transitory computer readable storage medium of claim 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111310983.5A CN113761880B (en) | 2021-11-08 | 2021-11-08 | Data processing method for text verification, electronic equipment and storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111310983.5A CN113761880B (en) | 2021-11-08 | 2021-11-08 | Data processing method for text verification, electronic equipment and storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113761880A CN113761880A (en) | 2021-12-07 |
CN113761880B true CN113761880B (en) | 2022-03-04 |
Family
ID=78784725
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202111310983.5A Active CN113761880B (en) | 2021-11-08 | 2021-11-08 | Data processing method for text verification, electronic equipment and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113761880B (en) |
Families Citing this family (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114168608B (en) * | 2021-12-16 | 2022-07-15 | 中科雨辰科技有限公司 | Data processing system for updating knowledge graph |
CN114297653B (en) * | 2021-12-31 | 2024-09-13 | 安天科技集团股份有限公司 | De-duplication method for derivative data |
CN114021200B (en) * | 2022-01-07 | 2022-04-15 | 每日互动股份有限公司 | Data processing system for pkg fuzzification |
CN115858208B (en) * | 2022-09-29 | 2024-05-14 | 杭州中电安科现代科技有限公司 | Method for acquiring target data and extracting text list |
CN115544974A (en) * | 2022-11-28 | 2022-12-30 | 药融云数字科技(成都)有限公司 | Text data extraction method, system, storage medium and terminal |
Family Cites Families (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20200364233A1 (en) * | 2019-05-15 | 2020-11-19 | WeR.AI, Inc. | Systems and methods for a context sensitive search engine using search criteria and implicit user feedback |
CN111753086A (en) * | 2020-06-11 | 2020-10-09 | 北京天空卫士网络安全技术有限公司 | Junk mail identification method and device |
CN112860872B (en) * | 2021-03-17 | 2024-06-28 | 广东电网有限责任公司 | Power distribution network operation ticket semantic compliance verification method and system based on self-learning |
CN113239208A (en) * | 2021-05-06 | 2021-08-10 | 广东博维创远科技有限公司 | Mark training model based on knowledge graph |
CN113254667A (en) * | 2021-06-07 | 2021-08-13 | 成都工物科云科技有限公司 | Scientific and technological figure knowledge graph construction method and device based on deep learning model and terminal |
- 2021-11-08 CN CN202111310983.5A patent/CN113761880B/en active Active
Also Published As
Publication number | Publication date |
---|---|
CN113761880A (en) | 2021-12-07 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN113761880B (en) | Data processing method for text verification, electronic equipment and storage medium | |
WO2019174132A1 (en) | Data processing method, server and computer storage medium | |
WO2019091026A1 (en) | Knowledge base document rapid search method, application server, and computer readable storage medium | |
JP5785617B2 (en) | Method and arrangement for handling data sets, data processing program and computer program product | |
EP3819785A1 (en) | Feature word determining method, apparatus, and server | |
US10163063B2 (en) | Automatically mining patterns for rule based data standardization systems | |
US9852122B2 (en) | Method of automated analysis of text documents | |
CN107102993B (en) | User appeal analysis method and device | |
US20200364216A1 (en) | Method, apparatus and storage medium for updating model parameter | |
CN108153728B (en) | Keyword determination method and device | |
CN112183102A (en) | Named entity identification method based on attention mechanism and graph attention network | |
CN114780746A (en) | Knowledge graph-based document retrieval method and related equipment thereof | |
CN111209373A (en) | Sensitive text recognition method and device based on natural semantics | |
CN107958068B (en) | Language model smoothing method based on entity knowledge base | |
CN114266256A (en) | Method and system for extracting new words in field | |
CN109344233B (en) | Chinese name recognition method | |
CN103440292A (en) | Method and system for retrieving multimedia information based on bit vector | |
CN110909532B (en) | User name matching method and device, computer equipment and storage medium | |
CN110888977B (en) | Text classification method, apparatus, computer device and storage medium | |
CN113420564B (en) | Hybrid matching-based electric power nameplate semantic structuring method and system | |
CN113343051B (en) | Abnormal SQL detection model construction method and detection method | |
CN112989040B (en) | Dialogue text labeling method and device, electronic equipment and storage medium | |
CN111341404B (en) | Electronic medical record data set analysis method and system based on ernie model | |
WO2021056740A1 (en) | Language model construction method and system, computer device and readable storage medium | |
CN105824871A (en) | Picture detecting method and equipment |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |