CN111881288A - Method and device for judging authenticity of record information, storage medium and electronic equipment - Google Patents
- Publication number
- CN111881288A (application CN202010426275.7A)
- Authority
- CN
- China
- Prior art keywords
- entity
- information
- consistency detection
- record information
- event information
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/35—Clustering; Classification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/36—Creation of semantic tools, e.g. ontology or thesauri
- G06F16/367—Ontology
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Abstract
The application provides a method and a device for judging the authenticity of record information, a storage medium and electronic equipment. First, entities, entity relationships and event information are extracted from the record information; a knowledge graph is constructed from the entities, the entity relationships and the event information; relationship consistency detection and behavior consistency detection are performed on the knowledge graph based on a historical database; a confidence score corresponding to the record information is adjusted based on the results of the two detections; and when the confidence score is smaller than a preset score threshold, the record information is determined to be in doubt. Relationship consistency detection measures how well the record information agrees with the content recorded in the historical database, and behavior consistency detection measures how far the record information contradicts itself; the confidence score is adjusted accordingly to judge comprehensively whether the record information is true and credible. The judgment result is more accurate, which helps the personnel concerned handle the case.
Description
Technical Field
The application relates to the field of computers, and in particular to a method and a device for judging the authenticity of record information, a storage medium and electronic equipment.
Background
When some cases are handled, record information is written down from the statements of the persons involved. The record information is an important clue and basis for handling the case. Incorrect record information may mislead the case handlers' judgment and thereby affect the whole case-handling process.
The authenticity of the record information therefore often needs to be verified. The existing approach is for police officers to judge its truth from experience, which places high demands on the person judging and takes that person a great deal of time.
Disclosure of Invention
The present application aims to provide a method and a device for judging the authenticity of record information, a storage medium and an electronic device, so as to solve the above problems.
In order to achieve the above purpose, the embodiments of the present application employ the following technical solutions:
In a first aspect, an embodiment of the present application provides a method for judging the authenticity of record information, where the method includes:
extracting entities, entity relations and event information in the record information, wherein the entities comprise one or more of people, time, places, articles and case types, the entity relations represent relation information among the entities, and the event information represents events corresponding to the entities;
constructing a knowledge graph according to the entity, the entity relationship and the event information, wherein the knowledge graph comprises corresponding relationships between the entity and the entity relationship and between the entity and the event information;
performing relation consistency detection and behavior consistency detection on the knowledge graph based on a historical database;
adjusting a confidence score corresponding to the record information based on the results of the relationship consistency detection and the behavior consistency detection;
when the confidence score is smaller than a preset score threshold value, determining that the record information is in doubt;
the historical database comprises record information which is verified to be true, the behavior consistency detection is used for detecting whether contradiction exists between event information corresponding to the entity, and the relationship consistency detection is used for detecting whether expression of entity relationships corresponding to the entity in the historical database and the knowledge graph is consistent.
In a second aspect, an embodiment of the present application provides a device for judging the authenticity of record information, where the device includes:
the processing unit is used for extracting entities, entity relations and event information in the record information, wherein the entities comprise one or more of people, time, places, articles and case types, the entity relations represent relation information among the entities, and the event information represents events corresponding to the entities; the processing unit is further used for constructing a knowledge graph according to the entity, the entity relationship and the event information, wherein the knowledge graph comprises corresponding relationships between the entity and the entity relationship and between the entity and the event information; the processing unit is also used for carrying out relation consistency detection and behavior consistency detection on the knowledge graph based on a historical database; the processing unit is further used for adjusting the confidence score corresponding to the record information based on the results of the relationship consistency detection and the behavior consistency detection;
the judging unit is used for determining that the record information is in doubt when the confidence score is smaller than a preset score threshold;
the historical database comprises record information which is verified to be true, the behavior consistency detection is used for detecting whether contradiction exists between event information corresponding to the entity, and the relationship consistency detection is used for detecting whether expression of entity relationships corresponding to the entity in the historical database and the knowledge graph is consistent.
In a third aspect, the present application provides a storage medium, on which a computer program is stored, and the computer program, when executed by a processor, implements the method described above.
In a fourth aspect, an embodiment of the present application provides an electronic device, including: a processor and memory for storing one or more programs; the one or more programs, when executed by the processor, implement the methods described above.
Compared with the prior art, the method, the device, the storage medium and the electronic equipment for judging the authenticity of record information provided by the embodiments of the present application have the following advantages: first, entities, entity relationships and event information are extracted from the record information; a knowledge graph is constructed from the entities, the entity relationships and the event information; relationship consistency detection and behavior consistency detection are performed on the knowledge graph based on a historical database; a confidence score corresponding to the record information is adjusted based on the results of the two detections; and when the confidence score is smaller than a preset score threshold, the record information is determined to be in doubt. Relationship consistency detection measures how well the record information agrees with the content recorded in the historical database, and behavior consistency detection measures how far the record information contradicts itself; the confidence score is adjusted accordingly to judge comprehensively whether the record information is true and credible. The judgment result is more accurate, which helps the personnel concerned handle the case.
In order to make the aforementioned objects, features and advantages of the present application more comprehensible, preferred embodiments accompanied with figures are described in detail below.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are required to be used in the embodiments will be briefly described below, it should be understood that the following drawings only illustrate some embodiments of the present application and therefore should not be considered as limiting the scope, and it will be apparent to those skilled in the art that other related drawings can be obtained from the drawings without inventive effort.
Fig. 1 is a schematic structural diagram of an electronic device according to an embodiment of the present application;
Fig. 2 is a schematic flowchart of a method for judging the authenticity of record information according to an embodiment of the present application;
Fig. 3 is a schematic diagram of the substeps of S102 according to an embodiment of the present application;
Fig. 4 is a schematic diagram of the substeps of S102-2 according to an embodiment of the present application;
Fig. 5 is a schematic diagram of the substeps of S101 according to an embodiment of the present application;
Fig. 6 is a schematic diagram of the substeps of S101-6 according to an embodiment of the present application;
Fig. 7 is a schematic diagram of the substeps of S104 according to an embodiment of the present application;
Fig. 8 is a schematic unit diagram of a device for judging the authenticity of record information according to an embodiment of the present application.
In the figure: 10-a processor; 11-a memory; 12-a bus; 13-a communication interface; 201-a processing unit; 202-judging unit.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some embodiments of the present application, but not all embodiments. The components of the embodiments of the present application, generally described and illustrated in the figures herein, can be arranged and designed in a wide variety of different configurations.
Thus, the following detailed description of the embodiments of the present application, presented in the accompanying drawings, is not intended to limit the scope of the claimed application, but is merely representative of selected embodiments of the application. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures. Meanwhile, in the description of the present application, the terms "first", "second", and the like are used only for distinguishing the description, and are not to be construed as indicating or implying relative importance.
It is noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
In the description of the present application, it should be noted that the terms "upper", "lower", "inner", "outer", and the like indicate orientations or positional relationships based on orientations or positional relationships shown in the drawings or orientations or positional relationships conventionally found in use of products of the application, and are used only for convenience in describing the present application and for simplification of description, but do not indicate or imply that the referred devices or elements must have a specific orientation, be constructed in a specific orientation, and be operated, and thus should not be construed as limiting the present application.
In the description of the present application, it is also to be noted that, unless otherwise explicitly specified or limited, the terms "disposed" and "connected" are to be interpreted broadly, e.g., as being either fixedly connected, detachably connected, or integrally connected; can be mechanically or electrically connected; they may be connected directly or indirectly through intervening media, or they may be interconnected between two elements. The specific meaning of the above terms in the present application can be understood in a specific case by those of ordinary skill in the art.
Some embodiments of the present application will be described in detail below with reference to the accompanying drawings. The embodiments described below and the features of the embodiments can be combined with each other without conflict.
The embodiment of the application provides an electronic device which can be a computer device or other intelligent devices. Please refer to fig. 1, a schematic structural diagram of an electronic device. The electronic device comprises a processor 10, a memory 11, a bus 12. The processor 10 and the memory 11 are connected by a bus 12, and the processor 10 is configured to execute an executable module, such as a computer program, stored in the memory 11.
The processor 10 may be an integrated circuit chip having signal processing capabilities. In the implementation process, the steps of the method for judging the authenticity of record information may be completed by an integrated logic circuit of hardware in the processor 10 or by instructions in the form of software. The processor 10 may be a general-purpose processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like; it may also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
The Memory 11 may comprise a high-speed Random Access Memory (RAM) and may further comprise a non-volatile Memory (non-volatile Memory), such as at least one disk Memory.
The bus 12 may be an ISA (Industry Standard Architecture) bus, a PCI (Peripheral Component Interconnect) bus, an EISA (Extended Industry Standard Architecture) bus, or the like. Only one bi-directional arrow is shown in fig. 1, but this does not indicate only one bus 12 or one type of bus 12.
The memory 11 is used for storing programs, such as programs corresponding to the device for judging the authenticity of the record information. The device for judging the authenticity of the record information comprises at least one software functional module which can be stored in the memory 11 in the form of software or firmware (firmware) or solidified in an Operating System (OS) of the electronic device. After receiving the execution instruction, the processor 10 executes the program to implement the method for determining the authenticity of the record information.
Optionally, the electronic device provided by the embodiment of the present application further includes a communication interface 13. The communication interface 13 is connected to the processor 10 via the bus 12. The electronic device may receive information, such as the record information, transmitted by an external device through the communication interface 13.
It should be understood that the structure shown in fig. 1 is merely a structural schematic diagram of a portion of an electronic device, which may also include more or fewer components than shown in fig. 1, or have a different configuration than shown in fig. 1. The components shown in fig. 1 may be implemented in hardware, software, or a combination thereof.
The method for judging the authenticity of record information provided by the embodiments of the present application can be applied to, but is not limited to, the electronic device shown in fig. 1. For the specific process, please refer to fig. 2:
S101, extracting entities, entity relationships and event information in the record information.
Record information usually consists of people, places, times, events and relationships, and these components are related to one another. To verify the authenticity of the record information, it is necessary to determine whether the components and the associations between them are correct, so the content of the record information needs to be extracted.
The entities comprise one or more of people, time, places, articles and case types, the entity relationship represents relationship information among the entities, and the event information represents events corresponding to the entities. For example, in a sentence describing an action between Zhang San and Li Si, Zhang San and Li Si are entities and the action is the event information; in "Xiaoming is a student of Teacher Wang", Xiaoming and Teacher Wang are entities, and "student" is the entity relationship between the two; in "Xiaojun plays basketball on the playground at eight in the morning", Xiaojun, eight in the morning and the playground are entities.
S102, constructing a knowledge graph according to the entities, the entity relationships and the event information.
The knowledge graph comprises the correspondences between entities and entity relationships and between entities and event information. For example, in "Xiaoming and Xiaohong are a couple", the entity relationship "couple" corresponds to the entities "Xiaoming" and "Xiaohong"; the event information "fighting" corresponds to the entities "Dugdng" and "Xiaoli".
S103, performing relationship consistency detection and behavior consistency detection on the knowledge graph based on the historical database.
The historical database contains record information that has been verified to be true. Behavior consistency detection is used to detect whether the event information corresponding to an entity is contradictory, and relationship consistency detection is used to detect whether the entity relationships corresponding to an entity are expressed consistently in the historical database and in the knowledge graph. For example, if Zhang Xiaohong and Wang Xiaoming are recorded as a couple in the historical database, while Zhang Xiaohong and Li Xiaojun are recorded as a couple in the knowledge graph, then the entity relationship corresponding to Zhang Xiaohong is expressed inconsistently in the historical database and the knowledge graph. As another example, the event information recorded in the knowledge graph includes: Xiaoming eats breakfast in Rome at 8 a.m. on day XX of month XX of year XX, and Xiaoming takes part in an auction in Hangzhou at 9 a.m. on the same day. Being in Rome at 8 a.m. and in Hangzhou at 9 a.m. on the same day is clearly impossible, so the two pieces of event information conflict.
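The two checks described for S103 can be sketched as follows. This is only an illustrative sketch under stated assumptions, not the patented implementation: relationships are assumed to be stored as `(relation, entity_a, entity_b)` tuples, events as `(person, place, time)` tuples, "couple" is treated as an exclusive relationship, and two events are assumed to conflict when the same person is in two different places within less than a hypothetical `min_travel` interval. The function names and the concrete date are likewise hypothetical.

```python
from datetime import datetime, timedelta


def inconsistent_relations(graph_relations, history_relations, exclusive=("couple",)):
    """Return relations from the record that clash with the verified history.

    Assumption: an exclusive relation (e.g. "couple") may link a person to
    only one partner, so a different partner in the record is a clash.
    """
    partner = {}
    for rel, a, b in history_relations:
        if rel in exclusive:
            partner[(rel, a)] = b
            partner[(rel, b)] = a
    clashes = []
    for rel, a, b in graph_relations:
        if rel not in exclusive:
            continue
        # A missing history entry defaults to the recorded partner (no clash).
        if partner.get((rel, a), b) != b or partner.get((rel, b), a) != a:
            clashes.append((rel, a, b))
    return clashes


def conflicting_events(events, min_travel=timedelta(hours=6)):
    """Return pairs of events that place one person in two places too close in time."""
    conflicts = []
    for i in range(len(events)):
        for j in range(i + 1, len(events)):
            person1, place1, time1 = events[i]
            person2, place2, time2 = events[j]
            if person1 == person2 and place1 != place2 and abs(time1 - time2) < min_travel:
                conflicts.append((events[i], events[j]))
    return conflicts


history = [("couple", "Zhang Xiaohong", "Wang Xiaoming")]
record = [("couple", "Zhang Xiaohong", "Li Xiaojun")]
events = [
    ("Xiaoming", "Rome", datetime(2020, 1, 1, 8, 0)),      # hypothetical date
    ("Xiaoming", "Hangzhou", datetime(2020, 1, 1, 9, 0)),  # hypothetical date
]
print(inconsistent_relations(record, history))
print(len(conflicting_events(events)))
```

A real system would need a richer notion of which relationships are exclusive and of how far apart two places are; the fixed time window here is only a stand-in for that reasoning.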
S104, adjusting the confidence score corresponding to the record information based on the results of the relationship consistency detection and the behavior consistency detection.
Specifically, relationship consistency detection and behavior consistency detection are performed on different entities respectively, and the confidence score corresponding to the record information is adjusted based on the detection results of the different entities.
S105, judging whether the confidence score is smaller than a preset score threshold. If yes, executing S106; if not, executing S107.
Specifically, when the confidence score is smaller than the preset score threshold, the record information either disagrees in many places with the content recorded in the historical database or contains many self-contradictions, and S106 is executed. Otherwise, the record information contains few self-contradictions and is substantially consistent with the content recorded in the historical database, and S107 is executed.
S106, determining that the record information is in doubt.
S107, determining that the record information is true.
To sum up, in the method for judging the authenticity of record information provided in the embodiment of the present application, first, entities, entity relationships and event information are extracted from the record information; a knowledge graph is constructed from the entities, the entity relationships and the event information; relationship consistency detection and behavior consistency detection are performed on the knowledge graph based on a historical database; a confidence score corresponding to the record information is adjusted based on the results of the two detections; and when the confidence score is smaller than a preset score threshold, the record information is determined to be in doubt. Relationship consistency detection measures how well the record information agrees with the content recorded in the historical database, and behavior consistency detection measures how far the record information contradicts itself; the confidence score is adjusted accordingly to judge comprehensively whether the record information is true and credible. The judgment result is more accurate, which helps the personnel concerned handle the case.
On the basis of fig. 2, for the content in S104, the embodiment of the present application provides a possible implementation manner in which the confidence score is obtained by an equation over the following quantities:
where $S_i$ denotes the confidence score of the $i$-th piece of record information; $R_F$ denotes the number of entity relationships that are inconsistent; $R_{all}$ denotes the total number of entity relationships; $A_F$ denotes the number of pieces of event information that conflict; $A_{all}$ denotes the total number of pieces of event information; $\omega_1$ denotes the relationship consistency score weight coefficient; and $\omega_2$ denotes the behavior consistency score weight coefficient.
It should be noted that the inconsistent entity relationships include both those that are inconsistent between the record information and the historical database and those that contradict one another within the record information itself. Likewise, the conflicting event information includes both self-contradictory event information within the record information and event information that conflicts between the record information and the historical database.
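For illustration only, one plausible form that is consistent with the quantities defined above is $S_i = \omega_1 (1 - R_F / R_{all}) + \omega_2 (1 - A_F / A_{all})$. This form is an assumption, as are the default weights, the threshold value and the function names below; the sketch shows how such a score could be combined with the threshold test of S105.

```python
def confidence_score(r_f, r_all, a_f, a_all, w1=0.5, w2=0.5):
    """Assumed form: S = w1 * (1 - R_F / R_all) + w2 * (1 - A_F / A_all).

    A part with no items at all (total of 0) is treated as fully consistent.
    """
    relation_part = (1.0 - r_f / r_all) if r_all else 1.0
    behavior_part = (1.0 - a_f / a_all) if a_all else 1.0
    return w1 * relation_part + w2 * behavior_part


def is_in_doubt(score, threshold=0.8):
    """S105: the record is in doubt when the score is below the preset threshold."""
    return score < threshold


# 1 of 4 relationships inconsistent, 1 of 2 events conflicting.
score = confidence_score(r_f=1, r_all=4, a_f=1, a_all=2)
print(score, is_in_doubt(score))
```

With equal weights, the score falls linearly as the share of inconsistent relationships or conflicting events grows, which matches the described behavior that more inconsistencies lead to a lower confidence score.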
On the basis of fig. 2, for the content in S102, the embodiment of the present application further provides a possible implementation manner, please refer to fig. 3, where S102 includes:
S102-1, establishing a knowledge graph by taking the entities, the entity relationships and the event information as different nodes.
Specifically, different entities correspond to different nodes, different entity relationships correspond to different nodes, and different event information corresponds to different nodes.
S102-2, eliminating repeated nodes in the knowledge graph.
Repeated nodes are any two nodes whose corresponding entities are the same, or whose corresponding entity relationships are the same, or whose corresponding event information is the same. Specifically, because individuals remember and describe things differently, the same case location, case tool and so on may be expressed in different ways, so the same entity, entity relationship or event information may appear under different expressions. Different expressions produce different nodes in the knowledge graph, so repeated nodes need to be eliminated, that is, entity disambiguation is required.
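Step S102-1 can be sketched with a plain adjacency map. This is a minimal sketch under stated assumptions: the tuple-based node identifiers, the input shapes and the function name `build_graph` are hypothetical, and the sample participants of the "fighting" event are chosen only for illustration.

```python
from collections import defaultdict


def build_graph(entities, relations, events):
    """Build a graph whose nodes are entities, entity relationships and events.

    relations: list of (relation_name, entity_a, entity_b)
    events:    list of (event_name, [participating entities])
    Returns (nodes, edges): node id -> node type, and an undirected adjacency map.
    """
    nodes = {}
    edges = defaultdict(set)

    for name in entities:
        nodes[("entity", name)] = "entity"

    for rel, a, b in relations:
        rel_node = ("relation", rel, a, b)
        nodes[rel_node] = "relation"
        for ent in (a, b):
            edges[rel_node].add(("entity", ent))
            edges[("entity", ent)].add(rel_node)

    for idx, (evt, participants) in enumerate(events):
        evt_node = ("event", evt, idx)  # idx keeps separate occurrences apart
        nodes[evt_node] = "event"
        for ent in participants:
            edges[evt_node].add(("entity", ent))
            edges[("entity", ent)].add(evt_node)

    return nodes, edges


nodes, edges = build_graph(
    entities=["Xiaoming", "Xiaohong", "Xiaoli"],
    relations=[("couple", "Xiaoming", "Xiaohong")],
    events=[("fighting", ["Xiaoming", "Xiaoli"])],  # illustrative participants
)
print(len(nodes), sorted(edges[("entity", "Xiaoming")]))
```

Keeping relationships and events as nodes of their own, rather than as edge labels, mirrors the description above and lets the same duplicate-elimination step apply to all three node types.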
On the basis of fig. 3, for the content in S102-2, the embodiment of the present application further provides a possible implementation manner, please refer to fig. 4, where S102-2 includes:
S102-2-1, calculating the similarity of any two nodes.
Optionally, the similarity is calculated only between two nodes of the same type. The types include, for example, nodes corresponding to entities, nodes corresponding to entity relationships, and nodes corresponding to event information.
S102-2-2, judging whether the similarity is greater than a preset threshold. If yes, executing S102-2-4; if not, executing S102-2-3.
When the similarity between two nodes is greater than the preset threshold, the two nodes may in essence be the same. For example, "dun" and "san go" may denote the same person expressed in different ways. In this case, S102-2-4 is executed. Otherwise, S102-2-3 is executed.
S102-2-3, no elimination processing is performed.
S102-2-4, the two nodes are repeated nodes, and one of them is eliminated.
Optionally, either one of the two repeated nodes may be retained.
The embodiment of the present application provides a possible implementation manner in which the similarity between any two nodes is calculated by an equation over the following quantities:
where $sim(A, B)$ denotes the similarity of node A and node B; $A_s$ denotes the similarity feature of node A; $A_0$ denotes the quantization feature of node A; $m$ denotes the number of first-order neighbor nodes of node A in the knowledge graph; $P_i$ denotes the quantization feature of the $i$-th neighbor node of node A, with $i \le m$; $B_s$ denotes the similarity feature of node B; $B_0$ denotes the quantization feature of node B; $n$ denotes the number of first-order neighbor nodes of node B in the knowledge graph; and $P_k$ denotes the quantization feature of the $k$-th neighbor node of node B, with $k \le n$.
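The following sketch shows one way such a neighbor-aware similarity could be computed. It rests on assumptions: the similarity feature is taken to be the node's own quantization feature plus the mean of its first-order neighbors' features, and $sim(A, B)$ is taken to be the cosine similarity of the two similarity features. The function names and the threshold value are hypothetical.

```python
import math


def similarity_feature(own, neighbours):
    """Assumed form: A_s = A_0 + (1/m) * sum(P_i) over the m first-order neighbours."""
    if not neighbours:
        return list(own)
    m = len(neighbours)
    mean = [sum(vec[d] for vec in neighbours) / m for d in range(len(own))]
    return [o + x for o, x in zip(own, mean)]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def sim(a0, a_neighbours, b0, b_neighbours):
    """Similarity of node A and node B from their own and neighbour features."""
    return cosine(similarity_feature(a0, a_neighbours),
                  similarity_feature(b0, b_neighbours))


def is_duplicate(similarity, threshold=0.9):
    """S102-2-2: two nodes are treated as repeated when similarity exceeds the threshold."""
    return similarity > threshold


# Two nodes with different own features but the same neighbourhood.
s = sim([1.0, 0.0], [[0.0, 1.0]], [0.9, 0.1], [[0.0, 1.0]])
print(round(s, 3), is_duplicate(s))
```

Including neighbor features means that two differently written names that share the same surrounding entities and events score as more similar than their own features alone would suggest.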
On the basis of fig. 2, for the content in S101, the embodiment of the present application further provides a possible implementation manner, please refer to fig. 5, where S101 includes:
S101-1, performing quantization coding on the characters in the record information.
Specifically, the quantization code may be a code consisting of 0 and 1, that is, a one-hot code of the Chinese characters.
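A one-hot code of this kind can be sketched as follows. The vocabulary here is built from the sample text itself purely for illustration (a real system would presumably use a fixed character vocabulary), and the sample sentence and function names are assumptions.

```python
def build_vocab(text):
    """Assign each distinct character an index in order of first appearance."""
    vocab = {}
    for ch in text:
        if ch not in vocab:
            vocab[ch] = len(vocab)
    return vocab


def one_hot(text, vocab):
    """Encode each character as a 0/1 vector with a single 1 at its vocabulary index."""
    vectors = []
    for ch in text:
        vec = [0] * len(vocab)
        vec[vocab[ch]] = 1
        vectors.append(vec)
    return vectors


sample = "张小明吃早餐"  # "Zhang Xiaoming eats breakfast" (illustrative input)
vocab = build_vocab(sample)
codes = one_hot(sample, vocab)
print(codes[0])
```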
S101-2, using the quantization codes as the input of the entity extraction model to obtain the label sequence scores of the corresponding characters.
Specifically, the entity extraction model may be a BiLSTM model. The label sequence score indicates, for each entity type, how strongly the character is scored for each position within an entity of that type. For example, for the character "Zhang" in Zhang Xiaoming, a possible set of label sequence scores is:
Person, first position: 7 points; person, second position: 0.5 points; person, third position: 0.5 points …
Time, first position: 0 points; time, second position: 0 points; time, third position: 0 points …
Place, first position: 2 points; place, second position: 0.5 points; place, third position: 0.5 points …
The above label sequence scores are given only for ease of understanding; their expression and form are not limited herein.
S101-3, outputting the entity type of each character according to the annotation sequence score and the type transition probability between the character and the characters before and after it.
For example, in "Zhang Xiaoming eats breakfast", "eats" is clearly a verb, which is not the same type as the preceding "Ming" or the following "breakfast". When the type does not transition between a character and its preceding and following characters, the character usually forms a combination with them. From the type transition probability and the annotation sequence score, it can be obtained that "Zhang" is the first position of a person name, "Xiao" is the second position, and "Ming" is the third position.
Optionally, the entity type of each Chinese character can be output through the BiLSTM model and a softmax layer.
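One common way to combine per-character scores with type transition scores is Viterbi decoding, as used in BiLSTM-CRF taggers. The sketch below is written under that assumption and is not stated by the source to be the decoding actually used. The tag names (`PER-1`, `PER-2`, `PER-3`, `O`), the transition values, and every score other than the 7 / 0.5 / 0.5 example above are hypothetical.

```python
def viterbi(emissions, transitions, tags):
    """Return the tag sequence with the highest total score.

    emissions: one dict per character mapping tag -> annotation score.
    transitions: dict mapping (previous_tag, current_tag) -> transition score.
    Missing entries are treated as 0.0.
    """
    best = [{t: emissions[0].get(t, 0.0) for t in tags}]
    back = []
    for em in emissions[1:]:
        scores, pointers = {}, {}
        for cur in tags:
            prev = max(tags, key=lambda p: best[-1][p] + transitions.get((p, cur), 0.0))
            scores[cur] = best[-1][prev] + transitions.get((prev, cur), 0.0) + em.get(cur, 0.0)
            pointers[cur] = prev
        best.append(scores)
        back.append(pointers)
    path = [max(tags, key=lambda t: best[-1][t])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]

tags = ["PER-1", "PER-2", "PER-3", "O"]
# Hypothetical scores for the four characters "Zhang", "Xiao", "Ming", "eats".
emissions = [
    {"PER-1": 7.0, "PER-2": 0.5, "PER-3": 0.5, "O": 1.0},
    {"PER-1": 1.0, "PER-2": 3.0, "PER-3": 1.0, "O": 2.5},
    {"PER-1": 1.0, "PER-2": 1.0, "PER-3": 3.0, "O": 2.0},
    {"O": 5.0},
]
transitions = {
    ("PER-1", "PER-2"): 2.0, ("PER-2", "PER-3"): 2.0, ("PER-3", "O"): 1.0,
    ("PER-1", "PER-3"): -5.0, ("O", "PER-2"): -5.0, ("O", "PER-3"): -5.0,
}
decoded = viterbi(emissions, transitions, tags)
```

With these hypothetical scores, the decoded sequence marks the first three characters as positions one to three of a person name and the verb as a non-entity character.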
S101-4, combining the characters whose entity types are associated to obtain the entity.
For example, the three characters "Zhang", "Xiao" and "Ming" mentioned above are combined to obtain "Zhang Xiaoming".
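The merging step can be sketched as follows, assuming each character carries a tag of the hypothetical form `TYPE-position` (for example `PER-1`) or `O` for characters outside any entity. The tag format and the function name `merge_entities` are illustrative assumptions, not part of the source.

```python
def merge_entities(chars, tags):
    """Group consecutive characters of the same entity type into entities.

    A new entity starts at a position-1 tag or when the entity type changes;
    an "O" tag closes any open entity.
    """
    entities, current, current_type = [], "", None
    for ch, tag in zip(chars, tags):
        if tag == "O":
            etype, pos = None, None
        else:
            etype, pos = tag.rsplit("-", 1)
        starts_new = etype is None or pos == "1" or etype != current_type
        if starts_new and current:
            entities.append((current_type, current))
            current, current_type = "", None
        if etype is not None:
            current += ch
            current_type = etype
    if current:
        entities.append((current_type, current))
    return entities

entities = merge_entities(list("张小明吃早餐"),
                          ["PER-1", "PER-2", "PER-3", "O", "O", "O"])
```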
S101-5, acquiring entity relationships based on the entities.
For example, when the obtained entities include Xiao Hong and Xiao Wang, the entity relationship between Xiao Hong and Xiao Wang is obtained from the record information in combination with semantic analysis.
S101-6, acquiring event information based on syntactic relations.
The syntactic relations are, for example, the subject-predicate relation, the verb-object relation, and the like. Because the composition of an event usually contains a predicate or verb, event information can be acquired by using the predicate or verb as a trigger word in combination with the syntactic relations.
On the basis of FIG. 5, the embodiment of the present application further provides a possible implementation of S106-6. Referring to FIG. 6, S106-6 includes:
S106-6-1, using the record information as the input of a word segmentation model to obtain a word segmentation result.
The word segmentation result includes the words in the record information, such as verbs, nouns, adjectives, and the like.
S106-6-2, constructing the word segmentation result into a syntactic network diagram based on the syntactic relations.
The syntactic network diagram comprises the words in the word segmentation result, the syntactic relations among the words, and an adjacency matrix, where the adjacency matrix represents whether a syntactic relation exists between any two words.
Optionally, the syntactic network diagram is $G = (V, E, A)$, where $V$ denotes the nodes of $G$ (each word is a node), $E$ denotes the relations between the nodes of $G$, and $A$ denotes the adjacency matrix of $G$. The entries of $A$ are 0 or 1: 1 indicates that a syntactic relation exists between two nodes, and 0 indicates that none exists.
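A minimal sketch of building such a structure is shown below. It assumes that the words are distinct, that syntactic relations are given as (head, dependent, label) triples from some parser, and that the adjacency matrix is symmetric. These are illustrative assumptions; the source does not say how relations are produced or whether the graph is directed.

```python
def build_syntactic_graph(words, relations):
    """Return (V, E, A) for a list of distinct words and (head, dependent, label) triples."""
    index = {w: i for i, w in enumerate(words)}
    n = len(words)
    adjacency = [[0] * n for _ in range(n)]
    for head, dependent, _label in relations:
        i, j = index[head], index[dependent]
        adjacency[i][j] = 1  # a syntactic relation exists between the two words
        adjacency[j][i] = 1  # assumed symmetric
    return words, relations, adjacency

words = ["Zhang Xiaoming", "eats", "breakfast"]
relations = [
    ("eats", "Zhang Xiaoming", "subject-predicate"),
    ("eats", "breakfast", "verb-object"),
]
V, E, A = build_syntactic_graph(words, relations)
```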
S106-6-3, embedding the syntactic network diagram based on the quantized representation of each word and the adjacency matrix to obtain a vocabulary feature table.
Specifically, the vocabulary feature table may be obtained according to the following equations,
where $A$ is the adjacency matrix; a corrected matrix is also used; $I$ is the identity matrix; $D$ is the corresponding degree matrix; $X$ is the quantization matrix; $W_1$, $W_2$, $b_1$ and $b_2$ are trainable parameters; and the output is the vocabulary feature table. The quantization matrix consists of the quantized encodings of the words.
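The equations are not reproduced in this text. The listed symbols are consistent with a two-layer graph convolution, so the sketch below assumes the common form Z = Â · ReLU(Â X W1 + b1) · W2 + b2 with Â = D^(-1/2) (A + I) D^(-1/2). This form, the ReLU activation, and all numeric values are assumptions, not the patent's stated formula.

```python
import math

def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def normalized_adjacency(adj):
    """Assumed correction: add self-loops (A + I), then scale by D^(-1/2) on both sides."""
    n = len(adj)
    a_tilde = [[adj[i][j] + (1 if i == j else 0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_tilde]
    return [[a_tilde[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]

def add_bias(m, b):
    return [[v + bj for v, bj in zip(row, b)] for row in m]

def relu(m):
    return [[max(0.0, v) for v in row] for row in m]

def gcn_embed(adj, x, w1, b1, w2, b2):
    """Two graph-convolution layers; each output row is one word's feature vector."""
    a_hat = normalized_adjacency(adj)
    hidden = relu(add_bias(matmul(matmul(a_hat, x), w1), b1))
    return add_bias(matmul(matmul(a_hat, hidden), w2), b2)

adj = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
x = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]           # one-hot quantization matrix
w1 = [[0.5, -0.2], [0.1, 0.3], [-0.4, 0.2]]     # hypothetical trained weights
b1 = [0.0, 0.0]
w2 = [[1.0, 0.0], [0.0, 1.0]]
b2 = [0.0, 0.0]
features = gcn_embed(adj, x, w1, b1, w2, b2)
```

In practice the parameters would be learned during training rather than fixed by hand as they are here.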
S106-6-4, calculating the Euclidean distances between the verb and the other words based on the vocabulary feature table.
The verb and the other words are all words contained in the syntactic network diagram.
S106-6-5, judging whether the Euclidean distance between another word and the verb is smaller than a preset distance threshold. If yes, S106-6-7 is executed; if not, S106-6-6 is executed.
Specifically, a Euclidean distance smaller than the preset distance threshold indicates that the word is associated with the verb, and S106-6-7 is executed. Otherwise, S106-6-6 is executed.
S106-6-6, determining that the other word is not associated with the verb.
S106-6-7, taking the other word as a component of the event corresponding to the verb to obtain a candidate event.
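Steps S106-6-4 to S106-6-7 can be sketched as a distance filter around the verb. The feature vectors, the threshold value, and the function name `candidate_event` below are hypothetical and chosen only to make the example easy to check.

```python
import math

def candidate_event(verb, features, threshold):
    """Collect every other word whose feature vector lies within `threshold` of the verb."""
    components = [
        word for word, vec in features.items()
        if word != verb and math.dist(vec, features[verb]) < threshold
    ]
    return {"trigger": verb, "components": components}

# Hypothetical rows of a vocabulary feature table.
features_table = {
    "eats": [0.0, 0.0],
    "Zhang Xiaoming": [0.3, 0.4],   # distance 0.5 from the verb
    "breakfast": [0.6, 0.0],        # distance 0.6 from the verb
    "yesterday": [3.0, 4.0],        # distance 5.0 from the verb
}
event = candidate_event("eats", features_table, threshold=1.0)
```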
S106-6-8, taking the candidate event as the input of a classifier model to output the final event information.
Specifically, the event type of the candidate event may be obtained according to the following equation, and the classifier model may be trained so that it outputs the final event information,
where $C_i$ represents a subsequent time; $N$ represents the number of sentences in the sample; $y_i$ represents the real event category, which is compared with the predicted event category; and $n_p$ represents the number of event categories.
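The equation is not reproduced in this text. Given that the symbols cover a number of samples, real and predicted categories, and a number of categories, one plausible training objective is the mean cross-entropy loss, sketched below as an assumption. With one-hot real categories, the inner sum over the event categories reduces to the negative log-probability assigned to the real category.

```python
import math

def cross_entropy(true_labels, predicted_probs):
    """Mean cross-entropy over N samples.

    true_labels: index of the real event category for each sample.
    predicted_probs: for each sample, a probability per event category.
    """
    total = 0.0
    for y, probs in zip(true_labels, predicted_probs):
        total += -math.log(probs[y])
    return total / len(true_labels)

loss = cross_entropy([0, 1], [[0.5, 0.5], [0.5, 0.5]])
```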
On the basis of FIG. 2, the embodiment of the present application further provides a possible implementation of S104. Referring to FIG. 7, S104 includes:
S104-1, when the result of the behavior consistency detection indicates that the event information corresponding to the entity is contradictory, reducing the confidence score.
S104-2, when the result of the relationship consistency detection indicates that the expression of the entity relationship corresponding to the entity in the historical database is inconsistent with that in the knowledge graph, reducing the confidence score.
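The score adjustment and the subsequent threshold decision can be sketched as follows. The starting score, the fixed penalty, and the threshold are hypothetical values; the source does not specify how much the confidence score is reduced by each failed check.

```python
def adjust_confidence(score, events_contradict, relations_inconsistent, penalty=10.0):
    """Lower the confidence score once for each consistency check that fails."""
    if events_contradict:
        score -= penalty
    if relations_inconsistent:
        score -= penalty
    return score

def is_in_doubt(score, threshold):
    """The record information is in doubt when the score falls below the threshold."""
    return score < threshold

adjusted = adjust_confidence(100.0, events_contradict=True, relations_inconsistent=True)
doubtful = is_in_doubt(adjusted, threshold=85.0)
```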
Referring to FIG. 8, FIG. 8 is a schematic diagram of an apparatus for judging the authenticity of record information according to an embodiment of the present application. Optionally, the apparatus is applied to the electronic device described above.
The apparatus for judging the authenticity of record information comprises a processing unit 201 and a judging unit 202.
The processing unit 201 is configured to extract entities, entity relationships, and event information from the record information, where the entities include one or more of people, time, places, articles, and case types, the entity relationships represent relationship information between the entities, and the event information represents events corresponding to the entities. The processing unit 201 is further configured to construct a knowledge graph according to the entities, the entity relationships, and the event information, where the knowledge graph comprises the correspondence between the entities and the entity relationships and the correspondence between the entities and the event information; to perform relationship consistency detection and behavior consistency detection on the knowledge graph based on the historical database; and to adjust the confidence score corresponding to the record information based on the results of the relationship consistency detection and the behavior consistency detection. Specifically, the processing unit 201 may execute S101 to S104 described above.
The judging unit 202 is configured to determine that the record information is in doubt when the confidence score is smaller than a preset score threshold. Specifically, the judging unit 202 may execute S105 to S107 described above.
The historical database comprises record information that has been verified to be true. The behavior consistency detection is used for detecting whether the event information corresponding to an entity is contradictory, and the relationship consistency detection is used for detecting whether the expression of the entity relationships corresponding to an entity in the historical database is consistent with their expression in the knowledge graph.
It should be noted that the apparatus for judging the authenticity of record information provided in this embodiment may execute the method flows shown in the above method embodiments to achieve the corresponding technical effects. For brevity, where something is not mentioned in this embodiment, reference may be made to the corresponding contents in the above embodiments.
The embodiment of the present invention also provides a storage medium storing computer instructions and a program which, when read and run, execute the method for judging the authenticity of record information of the above embodiment. The storage medium may include memory, flash memory, registers, or a combination thereof.
An electronic device is provided below, which may be a computing device or another intelligent terminal device. As shown in FIG. 1, the electronic device can implement the above method for judging the authenticity of record information. Specifically, the electronic device includes a processor 10, a memory 11, and a bus 12. The processor 10 may be a CPU. The memory 11 is used for storing one or more programs, and when the one or more programs are executed by the processor 10, the method for judging the authenticity of record information of the above embodiment is performed.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. The apparatus embodiments described above are merely illustrative, and for example, the flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of apparatus, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
In addition, functional modules in the embodiments of the present application may be integrated together to form an independent part, or each module may exist separately, or two or more modules may be integrated to form an independent part.
The functions, if implemented in the form of software functional modules and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present application or portions thereof that substantially contribute to the prior art may be embodied in the form of a software product stored in a storage medium and including instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present application. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.
The above description is only a preferred embodiment of the present application and is not intended to limit the present application, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, improvement and the like made within the spirit and principle of the present application shall be included in the protection scope of the present application.
It will be evident to those skilled in the art that the present application is not limited to the details of the foregoing illustrative embodiments, and that the present application may be embodied in other specific forms without departing from the spirit or essential attributes thereof. The present embodiments are therefore to be considered in all respects as illustrative and not restrictive, the scope of the application being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein. Any reference sign in a claim should not be construed as limiting the claim concerned.
Claims (10)
1. A method for judging the authenticity of record information, characterized by comprising the following steps:
extracting entities, entity relations and event information in the record information, wherein the entities comprise one or more of people, time, places, articles and case types, the entity relations represent relation information among the entities, and the event information represents events corresponding to the entities;
constructing a knowledge graph according to the entity, the entity relationship and the event information, wherein the knowledge graph comprises corresponding relationships between the entity and the entity relationship and between the entity and the event information;
performing relation consistency detection and behavior consistency detection on the knowledge graph based on a historical database;
adjusting a confidence score corresponding to the record information based on the results of the relationship consistency detection and the behavior consistency detection;
when the confidence score is smaller than a preset score threshold value, determining that the record information is in doubt;
the historical database comprises record information which is verified to be true, the behavior consistency detection is used for detecting whether contradiction exists between event information corresponding to the entity, and the relationship consistency detection is used for detecting whether expression of entity relationships corresponding to the entity in the historical database and the knowledge graph is consistent.
2. The method for judging the authenticity of record information according to claim 1, wherein said step of constructing a knowledge graph based on said entities, said entity relationships, and said event information comprises:
constructing the knowledge graph by taking the entities, the entity relations and the event information as different nodes;
and eliminating repeated nodes in the knowledge graph, wherein the repeated nodes are any two nodes with the same corresponding entities, or any two nodes with the same corresponding entity relationship, or any two nodes with the same corresponding event information.
3. The method for judging the authenticity of record information according to claim 2, wherein said step of eliminating duplicate nodes in said knowledge graph comprises:
calculating the similarity of any two nodes;
judging whether the similarity is larger than a preset threshold value or not;
if yes, the two nodes are repeated nodes, and one of the nodes is eliminated.
4. The method for judging the authenticity of record information according to claim 3, wherein the similarity between any two nodes is calculated by the following formula:
wherein $\mathrm{sim}(A, B)$ characterizes the similarity of node A and node B; $A_s$ characterizes the similarity feature of node A; $A_0$ characterizes the quantized feature of node A; $m$ characterizes the number of first-order neighbor nodes of node A in the knowledge graph; $P_i$ characterizes the quantized feature of the i-th neighbor node of node A, with $i \le m$; $B_s$ characterizes the similarity feature of node B; $B_0$ characterizes the quantized feature of node B; $n$ characterizes the number of first-order neighbor nodes of node B in the knowledge graph; and $P_k$ characterizes the quantized feature of the k-th neighbor node of node B, with $k \le n$.
5. The method for judging the authenticity of record information according to claim 1, wherein the step of extracting the entity, the entity relationship and the event information in the record information comprises:
performing quantization encoding on the characters in the record information;
taking the quantization code as the input of an entity extraction model to obtain the annotation sequence score of the corresponding character;
outputting the entity type of the character according to the annotation sequence score and the type transition probability between the character and the characters before and after it;
combining the words associated with the entity types together to obtain the entity;
obtaining the entity relationship based on the entity;
and acquiring the event information based on the syntactic relation.
6. The method for judging the authenticity of record information according to claim 5, wherein said step of obtaining said event information based on syntactic relations comprises:
taking the record information as the input of a word segmentation model to obtain a word segmentation result, wherein the word segmentation result comprises the words in the record information;
constructing the word segmentation result into a syntactic network diagram based on the syntactic relations, wherein the syntactic network diagram comprises the words in the word segmentation result, the syntactic relations among the words, and an adjacency matrix, and the adjacency matrix represents whether a syntactic relation exists between any two words;
embedding the syntactic network diagram based on the quantized representation of each word and the adjacency matrix to obtain a vocabulary feature table;
calculating Euclidean distances between the verbs and other vocabularies based on the vocabulary feature table, wherein the verbs and the other vocabularies are all the vocabularies contained in the syntactic network diagram;
if the Euclidean distance between other vocabularies and the verb is smaller than a preset distance threshold value, taking the other vocabularies as the components of the event corresponding to the verb to obtain a candidate event;
and taking the candidate event as an input of a classifier model to output final event information.
7. The method for judging the authenticity of record information according to claim 1, wherein the step of adjusting the confidence score corresponding to the record information based on the results of the relationship consistency detection and the behavior consistency detection comprises:
when the result of the behavior consistency detection indicates that the event information corresponding to the entity is contradictory, reducing the confidence score;
and when the result of the relationship consistency detection indicates that the expression of the entity relationship corresponding to the entity in the historical database is inconsistent with that in the knowledge graph, reducing the confidence score.
8. A device for judging the authenticity of record information, characterized by comprising:
the processing unit is used for extracting entities, entity relations and event information in the record information, wherein the entities comprise one or more of people, time, places, articles and case types, the entity relations represent relation information among the entities, and the event information represents events corresponding to the entities; the processing unit is further used for constructing a knowledge graph according to the entity, the entity relationship and the event information, wherein the knowledge graph comprises corresponding relationships between the entity and the entity relationship and between the entity and the event information; the processing unit is further used for carrying out relation consistency detection and behavior consistency detection on the knowledge graph based on a historical database; and the processing unit is further used for adjusting the confidence score corresponding to the record information based on the results of the relationship consistency detection and the behavior consistency detection;
the judging unit is used for determining that the record information is in doubt when the confidence score is smaller than a preset score threshold;
the historical database comprises record information which is verified to be true, the behavior consistency detection is used for detecting whether contradiction exists between event information corresponding to the entity, and the relationship consistency detection is used for detecting whether expression of entity relationships corresponding to the entity in the historical database and the knowledge graph is consistent.
9. A storage medium on which a computer program is stored which, when being executed by a processor, carries out the method according to any one of claims 1-7.
10. An electronic device, comprising: a processor and memory for storing one or more programs; the one or more programs, when executed by the processor, implement the method of any of claims 1-7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010426275.7A CN111881288B (en) | 2020-05-19 | 2020-05-19 | Method and device for judging true and false of stroke information, storage medium and electronic equipment |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111881288A (en) | 2020-11-03 |
CN111881288B (en) | 2024-04-09 |
Family
ID=73154345
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010426275.7A Active CN111881288B (en) | 2020-05-19 | 2020-05-19 | Method and device for judging true and false of stroke information, storage medium and electronic equipment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111881288B (en) |
Also Published As
Publication number | Publication date |
---|---|
CN111881288B (en) | 2024-04-09 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||