CN111881288A - Method and device for judging authenticity of record information, storage medium and electronic equipment - Google Patents

Method and device for judging authenticity of record information, storage medium and electronic equipment

Info

Publication number
CN111881288A
Authority
CN
China
Prior art keywords
entity
information
consistency detection
record information
event information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010426275.7A
Other languages
Chinese (zh)
Other versions
CN111881288B (en)
Inventor
陆韵
李冰
倪骏
黄刚
盛丽兰
陆克贤
俞山青
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Chinaoly Technology Co ltd
Original Assignee
Hangzhou Chinaoly Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Chinaoly Technology Co ltd
Priority to CN202010426275.7A
Publication of CN111881288A
Application granted
Publication of CN111881288B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36 Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367 Ontology
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Animal Behavior & Ethology (AREA)
  • Computational Linguistics (AREA)
  • Machine Translation (AREA)

Abstract

The application provides a method and a device for judging the authenticity of record information, a storage medium and electronic equipment. The method comprises: extracting entities, entity relationships and event information from the record information; constructing a knowledge graph from the entities, the entity relationships and the event information; performing relationship consistency detection and behavior consistency detection on the knowledge graph based on a historical database; adjusting a confidence score corresponding to the record information based on the results of the relationship consistency detection and the behavior consistency detection; and, when the confidence score is smaller than a preset score threshold, determining that the record information is in doubt. Relationship consistency detection yields the degree to which the record information agrees with the content recorded in the historical database, and behavior consistency detection yields the degree to which the record information contradicts itself; the confidence score is adjusted accordingly to judge comprehensively whether the record information is true and credible. The judgment result is more accurate and better supports case handling by the relevant personnel.

Description

Method and device for judging authenticity of record information, storage medium and electronic equipment
Technical Field
The application relates to the field of computers, and in particular to a method and a device for judging the authenticity of record information, a storage medium and electronic equipment.
Background
When some cases are handled, record information is written down from the statements of the persons involved. The record information is an important clue and basis for handling the case. Wrong record information may mislead the relevant personnel's judgment of the case and may thereby affect the whole case-handling process.
The authenticity of the record information therefore often needs to be verified. The existing approach relies on the experience of police officers to judge whether the record information is true, which places high demands on the person making the judgment and takes that person considerable time.
Disclosure of Invention
The present application aims to provide a method and a device for judging the authenticity of record information, a storage medium and electronic equipment, so as to solve the above problems.
In order to achieve the above purpose, the embodiments of the present application employ the following technical solutions:
in a first aspect, an embodiment of the present application provides a method for judging the authenticity of record information, where the method includes:
extracting entities, entity relations and event information in the record information, wherein the entities comprise one or more of people, time, places, articles and case types, the entity relations represent relation information among the entities, and the event information represents events corresponding to the entities;
constructing a knowledge graph according to the entity, the entity relationship and the event information, wherein the knowledge graph comprises corresponding relationships between the entity and the entity relationship and between the entity and the event information;
performing relation consistency detection and behavior consistency detection on the knowledge graph based on a historical database;
adjusting a confidence score corresponding to the record information based on the results of the relationship consistency detection and the behavior consistency detection;
when the confidence score is smaller than a preset score threshold value, determining that the record information is in doubt;
the historical database comprises record information which is verified to be true, the behavior consistency detection is used for detecting whether contradiction exists between event information corresponding to the entity, and the relationship consistency detection is used for detecting whether expression of entity relationships corresponding to the entity in the historical database and the knowledge graph is consistent.
In a second aspect, an embodiment of the present application provides a device for judging the authenticity of record information, where the device includes:
the processing unit is used for extracting entities, entity relations and event information in the record information, wherein the entities comprise one or more of people, time, places, articles and case types, the entity relations represent relation information among the entities, and the event information represents events corresponding to the entities; the processing unit is further used for constructing a knowledge graph according to the entity, the entity relationship and the event information, wherein the knowledge graph comprises corresponding relationships between the entity and the entity relationship and between the entity and the event information; the processing unit is further used for carrying out relation consistency detection and behavior consistency detection on the knowledge graph based on a historical database; the processing unit is further used for adjusting the confidence score corresponding to the record information based on the results of the relationship consistency detection and the behavior consistency detection;
the judging unit is used for determining that the record information is in doubt when the confidence score is smaller than a preset score threshold;
the historical database comprises record information which is verified to be true, the behavior consistency detection is used for detecting whether contradiction exists between event information corresponding to the entity, and the relationship consistency detection is used for detecting whether expression of entity relationships corresponding to the entity in the historical database and the knowledge graph is consistent.
In a third aspect, the present application provides a storage medium, on which a computer program is stored, and the computer program, when executed by a processor, implements the method described above.
In a fourth aspect, an embodiment of the present application provides an electronic device, including: a processor and memory for storing one or more programs; the one or more programs, when executed by the processor, implement the methods described above.
Compared with the prior art, the method, the device, the storage medium and the electronic equipment for judging the authenticity of record information provided by the embodiments of the present application have the following advantages: entities, entity relationships and event information are first extracted from the record information; a knowledge graph is constructed from the entities, the entity relationships and the event information; relationship consistency detection and behavior consistency detection are performed on the knowledge graph based on a historical database; a confidence score corresponding to the record information is adjusted based on the results of the relationship consistency detection and the behavior consistency detection; and, when the confidence score is smaller than a preset score threshold, the record information is determined to be in doubt. Relationship consistency detection yields the degree to which the record information agrees with the content recorded in the historical database, and behavior consistency detection yields the degree to which the record information contradicts itself; the confidence score is adjusted accordingly to judge comprehensively whether the record information is true and credible. The judgment result is more accurate and better supports case handling by the relevant personnel.
In order to make the aforementioned objects, features and advantages of the present application more comprehensible, preferred embodiments accompanied with figures are described in detail below.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are required to be used in the embodiments will be briefly described below, it should be understood that the following drawings only illustrate some embodiments of the present application and therefore should not be considered as limiting the scope, and it will be apparent to those skilled in the art that other related drawings can be obtained from the drawings without inventive effort.
Fig. 1 is a schematic structural diagram of an electronic device according to an embodiment of the present application;
Fig. 2 is a schematic flowchart of a method for judging the authenticity of record information according to an embodiment of the present application;
Fig. 3 is a schematic diagram of the substeps of S102 according to an embodiment of the present application;
Fig. 4 is a schematic diagram of the substeps of S102-2 according to an embodiment of the present application;
Fig. 5 is a schematic diagram of the substeps of S101 according to an embodiment of the present application;
Fig. 6 is a schematic diagram of the substeps of S101-6 according to an embodiment of the present application;
Fig. 7 is a schematic diagram of the substeps of S104 according to an embodiment of the present application;
Fig. 8 is a schematic unit diagram of a device for judging the authenticity of record information according to an embodiment of the present application.
In the figure: 10-a processor; 11-a memory; 12-a bus; 13-a communication interface; 201-a processing unit; 202-judging unit.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some embodiments of the present application, but not all embodiments. The components of the embodiments of the present application, generally described and illustrated in the figures herein, can be arranged and designed in a wide variety of different configurations.
Thus, the following detailed description of the embodiments of the present application, presented in the accompanying drawings, is not intended to limit the scope of the claimed application, but is merely representative of selected embodiments of the application. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures. Meanwhile, in the description of the present application, the terms "first", "second", and the like are used only for distinguishing the description, and are not to be construed as indicating or implying relative importance.
It is noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
In the description of the present application, it should be noted that the terms "upper", "lower", "inner", "outer", and the like indicate orientations or positional relationships based on orientations or positional relationships shown in the drawings or orientations or positional relationships conventionally found in use of products of the application, and are used only for convenience in describing the present application and for simplification of description, but do not indicate or imply that the referred devices or elements must have a specific orientation, be constructed in a specific orientation, and be operated, and thus should not be construed as limiting the present application.
In the description of the present application, it is also to be noted that, unless otherwise explicitly specified or limited, the terms "disposed" and "connected" are to be interpreted broadly, e.g., as being either fixedly connected, detachably connected, or integrally connected; can be mechanically or electrically connected; they may be connected directly or indirectly through intervening media, or they may be interconnected between two elements. The specific meaning of the above terms in the present application can be understood in a specific case by those of ordinary skill in the art.
Some embodiments of the present application will be described in detail below with reference to the accompanying drawings. The embodiments described below and the features of the embodiments can be combined with each other without conflict.
The embodiment of the application provides an electronic device which can be a computer device or other intelligent devices. Please refer to fig. 1, a schematic structural diagram of an electronic device. The electronic device comprises a processor 10, a memory 11, a bus 12. The processor 10 and the memory 11 are connected by a bus 12, and the processor 10 is configured to execute an executable module, such as a computer program, stored in the memory 11.
The processor 10 may be an integrated circuit chip having signal processing capabilities. In the implementation process, the steps of the method for judging the authenticity of record information may be carried out by an integrated logic circuit of hardware in the processor 10 or by instructions in the form of software. The processor 10 may be a general-purpose processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like; it may also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
The Memory 11 may comprise a high-speed Random Access Memory (RAM) and may further comprise a non-volatile Memory (non-volatile Memory), such as at least one disk Memory.
The bus 12 may be an ISA (Industry Standard Architecture) bus, a PCI (Peripheral Component Interconnect) bus, an EISA (Extended Industry Standard Architecture) bus, or the like. Only one bi-directional arrow is shown in Fig. 1, but this does not mean that there is only one bus 12 or one type of bus 12.
The memory 11 is used for storing programs, such as programs corresponding to the device for judging the authenticity of the record information. The device for judging the authenticity of the record information comprises at least one software functional module which can be stored in the memory 11 in the form of software or firmware (firmware) or solidified in an Operating System (OS) of the electronic device. After receiving the execution instruction, the processor 10 executes the program to implement the method for determining the authenticity of the record information.
Optionally, the electronic device provided by the embodiment of the present application further includes a communication interface 13. The communication interface 13 is connected to the processor 10 via a bus. The electronic device may receive information, such as the record information, transmitted by an external device through the communication interface 13.
It should be understood that the structure shown in fig. 1 is merely a structural schematic diagram of a portion of an electronic device, which may also include more or fewer components than shown in fig. 1, or have a different configuration than shown in fig. 1. The components shown in fig. 1 may be implemented in hardware, software, or a combination thereof.
The method for judging the authenticity of record information provided by the embodiment of the present application can be applied to, but is not limited to, the electronic device shown in Fig. 1. For the specific process, please refer to Fig. 2:
S101, extracting entities, entity relationships and event information in the record information.
Record information usually consists of people, places, times, events and relationships, and these components are related to one another. To verify the authenticity of the record information, it is necessary to determine whether the components and the associations between them are correct, so the content of the record information needs to be extracted.
The entities comprise one or more of people, time, places, articles and case types, the entity relationship represents relationship information among the entities, and the event information represents events corresponding to the entities. For example, in a statement describing an event that involves Zhang San and Li Si, Zhang San and Li Si are entities and the described event is the event information; in "Xiaoming is a student of Teacher Wang", Xiaoming and Teacher Wang are entities and "student" is the entity relationship between them; in "Xiaojun plays basketball on the playground at eight in the morning", Xiaojun, eight in the morning and the playground are entities.
S102, constructing a knowledge graph according to the entities, the entity relationships and the event information.
The knowledge graph comprises the correspondences between the entities and the entity relationships and between the entities and the event information. For example, for "Xiaoming and Xiaohong are a couple", the entity relationship "couple" corresponds to the entities "Xiaoming" and "Xiaohong"; the event information "fighting" corresponds to the entities "Dugdng" and "Xiaoli".
S103, carrying out relationship consistency detection and behavior consistency detection on the knowledge graph based on the historical database.
The historical database contains record information that has been verified to be true. The behavior consistency detection is used for detecting whether the event information corresponding to an entity is contradictory, and the relationship consistency detection is used for detecting whether the entity relationships corresponding to an entity are expressed consistently in the historical database and in the knowledge graph. For example, if Zhang Xiaohong and Wang Xiaoming are a couple in the historical database while Zhang Xiaohong and Li Xiaojun are a couple in the knowledge graph, the entity relationship corresponding to Zhang Xiaohong is expressed inconsistently in the historical database and the knowledge graph. As another example, the event information recorded in the knowledge graph includes: Xiaoming ate breakfast in Rome at 8 a.m. on XX (month) XX (day) of year XX, and Xiaoming took part in an auction in Hangzhou at 9 a.m. on XX (month) XX (day) of year XX. Being in Rome at 8 a.m. and in Hangzhou at 9 a.m. on the same day is obviously impossible, so the two pieces of event information conflict.
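The two detections described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the tuple layouts, the function names, the `exclusive` relation list and the flat `min_gap` travel-time bound are all hypothetical, and the calendar date is a placeholder because the source leaves it as XX.

```python
from datetime import datetime, timedelta


def inconsistent_relations(graph_rels, history_rels, exclusive=("couple",)):
    """Relationship consistency: graph relations that clash with the history.

    Assumption (not from the source): a relation named in `exclusive` links a
    person to one partner only, so a different partner in the verified
    history counts as an inconsistency.
    """
    clashes = []
    for a, rel, b in graph_rels:
        if rel not in exclusive:
            continue
        for ha, hrel, hb in history_rels:
            same_pair = {a, b} == {ha, hb}
            shares_person = bool({a, b} & {ha, hb})
            if hrel == rel and shares_person and not same_pair:
                clashes.append((a, rel, b))
                break
    return clashes


def conflicting_events(events, min_gap=timedelta(hours=3)):
    """Behavior consistency: same person, different places, too close in time.

    `min_gap` is a hypothetical flat travel-time bound used for illustration.
    """
    conflicts = []
    for i, (person1, place1, time1) in enumerate(events):
        for person2, place2, time2 in events[i + 1:]:
            if person1 == person2 and place1 != place2 and abs(time1 - time2) < min_gap:
                conflicts.append(((person1, place1, time1), (person2, place2, time2)))
    return conflicts


history = [("Zhang Xiaohong", "couple", "Wang Xiaoming")]
graph = [("Zhang Xiaohong", "couple", "Li Xiaojun")]
print(inconsistent_relations(graph, history))

day = datetime(2020, 1, 1)  # placeholder date; the source leaves the date as XX
events = [
    ("Xiaoming", "Rome", day.replace(hour=8)),
    ("Xiaoming", "Hangzhou", day.replace(hour=9)),
]
print(len(conflicting_events(events)))
```

A real system would presumably replace the flat time bound with a distance-aware check; the sketch only shows where the two counts of inconsistencies come from.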
S104, adjusting the confidence score corresponding to the record information based on the results of the relationship consistency detection and the behavior consistency detection.
Specifically, relationship consistency detection and behavior consistency detection are performed on different entities respectively, and the confidence score corresponding to the record information is adjusted based on the detection results of the different entities.
S105, judging whether the confidence score is smaller than a preset score threshold. If yes, executing S106; if not, executing S107.
Specifically, when the confidence score is smaller than the preset score threshold, the record information disagrees with the content recorded in the historical database in many places, or contains many self-contradictions, and S106 is executed. Otherwise, the record information contains few self-contradictions and is substantially consistent with the content recorded in the historical database, and S107 is executed.
S106, determining that the record information is in doubt.
S107, determining that the record information is true.
To sum up, in the method for judging the authenticity of record information provided in the embodiment of the present application, entities, entity relationships and event information are first extracted from the record information; a knowledge graph is constructed from the entities, the entity relationships and the event information; relationship consistency detection and behavior consistency detection are performed on the knowledge graph based on a historical database; a confidence score corresponding to the record information is adjusted based on the results of the relationship consistency detection and the behavior consistency detection; and, when the confidence score is smaller than a preset score threshold, the record information is determined to be in doubt. Relationship consistency detection yields the degree to which the record information agrees with the content recorded in the historical database, and behavior consistency detection yields the degree to which the record information contradicts itself; the confidence score is adjusted accordingly to judge comprehensively whether the record information is true and credible. The judgment result is more accurate and better supports case handling by the relevant personnel.
On the basis of Fig. 2, for the content in S104, the embodiment of the present application provides a possible implementation in which the confidence score is obtained by the following equation:
Figure BDA0002498781200000091
where S_i characterizes the confidence score of the i-th record information; R_F characterizes the number of entity relationships for which there are inconsistencies; R_all characterizes the total number of entity relationships; A_F characterizes the number of pieces of event information for which there is a conflict; A_all characterizes the total number of pieces of event information; ω_1 represents the relationship consistency score weight coefficient; and ω_2 represents the behavior consistency score weight coefficient.
It should be noted that the inconsistent entity relationships include both the entity relationships that are inconsistent between the record information and the historical database and the entity relationships that are contradictory within the record information itself. Likewise, the conflicting event information includes both the event information that is self-contradictory within the record information and the event information that conflicts between the record information and the historical database.
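The equation itself is an image that is not reproduced in this text, so its exact form cannot be confirmed here. The sketch below assumes one plausible form that is consistent with the symbol definitions, S_i = ω_1 (1 − R_F / R_all) + ω_2 (1 − A_F / A_all), and combines it with the threshold decision of S105 to S107. The assumed formula, the default weights and the function names are illustrative only.

```python
def confidence_score(r_f, r_all, a_f, a_all, w1=0.5, w2=0.5):
    """Assumed form: S_i = w1 * (1 - R_F / R_all) + w2 * (1 - A_F / A_all).

    The published equation is not available in the text; this is a guess
    that matches the symbol definitions. Empty totals count as consistent.
    """
    relation_part = 1.0 - r_f / r_all if r_all else 1.0
    behavior_part = 1.0 - a_f / a_all if a_all else 1.0
    return w1 * relation_part + w2 * behavior_part


def judge(score, threshold):
    """S105 to S107: a score below the threshold marks the record as in doubt."""
    return "in doubt" if score < threshold else "true"


# One inconsistent relationship out of four, no conflicting events out of two.
score = confidence_score(r_f=1, r_all=4, a_f=0, a_all=2)
print(score, judge(score, threshold=0.9))
```

Under this assumed form the score falls as the share of inconsistent relationships or conflicting events grows, which matches the rule that a low score means the record is in doubt.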
On the basis of fig. 2, for the content in S102, the embodiment of the present application further provides a possible implementation manner, please refer to fig. 3, where S102 includes:
S102-1, establishing a knowledge graph by taking the entities, the entity relationships and the event information as different nodes.
Specifically, different entities correspond to different nodes, different entity relationships correspond to different nodes, and different event information corresponds to different nodes.
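As a rough illustration of S102-1, the sketch below stores entities, entity relationships and event information as three kinds of nodes in an undirected adjacency structure. The class, the `(kind, label)` node key and the input layouts are assumptions made for this example; the source does not specify a data structure.

```python
from collections import defaultdict


class KnowledgeGraph:
    """Undirected graph whose nodes are (kind, label) pairs."""

    def __init__(self):
        self.neighbors = defaultdict(set)

    def add_node(self, kind, label):
        node = (kind, label)
        self.neighbors[node]  # accessing the defaultdict registers the node
        return node

    def link(self, a, b):
        self.neighbors[a].add(b)
        self.neighbors[b].add(a)


def build_graph(relations, events):
    """relations: (entity, relation, entity); events: (participants, event)."""
    graph = KnowledgeGraph()
    for a, rel, b in relations:
        rel_node = graph.add_node("relation", rel)
        for name in (a, b):
            graph.link(graph.add_node("entity", name), rel_node)
    for participants, event in events:
        event_node = graph.add_node("event", event)
        for name in participants:
            graph.link(graph.add_node("entity", name), event_node)
    return graph


g = build_graph(
    relations=[("Xiaoming", "couple", "Xiaohong")],
    events=[(("Xiaoli",), "fighting")],
)
print(sorted(g.neighbors[("relation", "couple")]))
```

In this simplified keying, two different couples would share one "couple" node; a fuller version would give each relationship instance its own node identifier.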
S102-2, eliminating repeated nodes in the knowledge graph.
The repeated nodes are any two nodes whose corresponding entities are the same, or whose corresponding entity relationships are the same, or whose corresponding event information is the same. Specifically, because people remember and express things differently, the case location, the tool used in the case and the like may be described in different ways, so the same entity, entity relationship or event information may have different expressions. Different expressions produce different nodes in the knowledge graph. Repeated nodes therefore need to be eliminated, i.e., entity disambiguation is performed.
On the basis of fig. 3, for the content in S102-2, the embodiment of the present application further provides a possible implementation manner, please refer to fig. 4, where S102-2 includes:
S102-2-1, calculating the similarity of any two nodes.
Optionally, the similarity is calculated between any two nodes of the same type. The types are, for example, nodes corresponding to entities, nodes corresponding to entity relationships, and nodes corresponding to event information.
S102-2-2, judging whether the similarity is greater than a preset threshold. If yes, executing S102-2-4; if not, executing S102-2-3.
When the similarity between two nodes is greater than the preset threshold, the two nodes may be the same in nature. For example, "dun" and "san go" may denote the same person expressed in different ways. In this case, S102-2-4 is executed. Otherwise, S102-2-3 is executed.
S102-2-3, no elimination processing is performed.
S102-2-4, the two nodes are repeated nodes, and one of them is eliminated.
Optionally, either one of the repeated nodes is retained.
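The threshold test of S102-2-2 to S102-2-4 can be sketched as below. The string similarity from Python's `difflib` is only a stand-in for the similarity measure of the application, and the function names, node layout and threshold value are hypothetical.

```python
from difflib import SequenceMatcher


def string_similarity(a, b):
    """Stand-in similarity in [0, 1]; not the measure used in the application."""
    return SequenceMatcher(None, a, b).ratio()


def eliminate_duplicates(nodes, similarity=string_similarity, threshold=0.8):
    """Keep the first node of every group of same-type nodes that exceed the threshold."""
    kept = []
    for kind, label in nodes:
        is_duplicate = any(
            kept_kind == kind and similarity(label, kept_label) > threshold
            for kept_kind, kept_label in kept
        )
        if not is_duplicate:
            kept.append((kind, label))
    return kept


nodes = [
    ("entity", "Zhang Xiaoming"),
    ("entity", "Zhang Xiao Ming"),  # same person, written differently
    ("event", "fighting"),
]
print(eliminate_duplicates(nodes))
```

Only nodes of the same type are compared, in line with the optional restriction in S102-2-1; which of two duplicates survives is arbitrary here (the first one seen).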
The embodiment of the present application provides a possible implementation manner, and the similarity between any two nodes is calculated by the following equation:
Figure BDA0002498781200000111
Figure BDA0002498781200000112
Figure BDA0002498781200000113
where sim(A, B) characterizes the similarity of node A and node B; A_s characterizes the similarity feature of node A; A_0 characterizes the quantization feature of node A; m characterizes the number of first-order neighbor nodes of node A in the knowledge graph; P_i characterizes the quantization feature of the i-th neighbor node of node A, with i ≤ m; B_s characterizes the similarity feature of node B; B_0 characterizes the quantization feature of node B; n characterizes the number of first-order neighbor nodes of node B in the knowledge graph; and P_k characterizes the quantization feature of the k-th neighbor node of node B, with k ≤ n.
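The three equations are images that are not reproduced in this text. Purely as an illustration of how the defined symbols could fit together, the sketch below assumes that the similarity feature of a node is its own quantization feature plus the mean of its first-order neighbors' quantization features, and that sim(A, B) is the cosine similarity of the two similarity features. Both choices are assumptions, not the published formulas.

```python
import math


def similarity_feature(own, neighbor_features):
    """Assumed form: A_s = A_0 + (1 / m) * sum of P_i over the m neighbors."""
    if not neighbor_features:
        return list(own)
    m = len(neighbor_features)
    return [
        value + sum(p[d] for p in neighbor_features) / m
        for d, value in enumerate(own)
    ]


def cosine_similarity(u, v):
    """Assumed form of sim(A, B): cosine of the two similarity features."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Two nodes with the same own feature and the same kind of neighbors.
a_s = similarity_feature([1.0, 0.0], [[0.0, 1.0], [0.0, 1.0]])
b_s = similarity_feature([1.0, 0.0], [[0.0, 1.0]])
print(cosine_similarity(a_s, b_s))
```

The point of including neighbor features is that two differently written nodes that connect to the same people, places and events end up with similar features and can then be merged.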
On the basis of fig. 2, for the content in S101, the embodiment of the present application further provides a possible implementation manner, please refer to fig. 5, where S101 includes:
S101-1, carrying out quantization coding on the characters in the record information.
Specifically, the quantization code may be a code consisting of 0 and 1, i.e., a one-hot code of the Chinese characters.
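A one-hot code of this kind can be sketched as follows. The vocabulary here is built from the input text itself, which is an assumption made to keep the example self-contained; a real system would presumably use a fixed character vocabulary.

```python
def one_hot_encode(text):
    """Return the vocabulary and one 0/1 vector per character of `text`."""
    vocabulary = sorted(set(text))
    index = {char: i for i, char in enumerate(vocabulary)}
    vectors = []
    for char in text:
        vector = [0] * len(vocabulary)
        vector[index[char]] = 1  # exactly one position is set per character
        vectors.append(vector)
    return vocabulary, vectors


vocabulary, vectors = one_hot_encode("张小明小")
for char, vector in zip("张小明小", vectors):
    print(char, vector)
```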
S101-2, using the quantization code as the input of the entity extraction model to obtain the annotation sequence score of the corresponding character.
Specifically, the entity extraction model may be a BiLSTM model. The annotation sequence score indicates the position that the character occupies within the entity to which it belongs. For example, for one character in "Zhangxian", the possible annotation sequence scores are:
first position of a person: 7 points; second position of a person: 0.5 points; third position of a person: 0.5 points …
first position of a time: 0 points; second position of a time: 0 points; third position of a time: 0 points …
first position of a place: 2 points; second position of a place: 0.5 points; third position of a place: 0.5 points …
The above annotation sequence scores are given only for ease of understanding; the expression and form of the annotation sequence score are not limited herein.
S101-3: Output the entity type of each character according to the type transition probabilities between the character and the characters before and after it, together with the annotation sequence scores.
For example, in "Zhang Xiaoming eats breakfast", "eats" is clearly a verb, which is a different type from the preceding "Ming" and the following "breakfast". When the type does not change between a character and its preceding and following characters, the character usually forms a unit together with them. From the type transition probabilities and the annotation sequence scores it can be determined that "Zhang" is the first position of a person name, "Xiao" is the second position, and "Ming" is the third position.
Optionally, the entity type of each Chinese character may be output through the BiLSTM model and a softmax layer.
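One common way to combine per-character scores with transition scores is Viterbi decoding, as used in BiLSTM-CRF taggers. The text does not name a decoding algorithm, so the Python sketch below is an assumption for illustration only. The tag names (`PER-1`, `PER-2`, `PER-3`, `O`) and all numeric scores are hypothetical; only the scores 7, 0.5 and 0.5 for the first character follow the example given for the annotation sequence scores.

```python
def viterbi(emissions, transitions, tags):
    """emissions: one {tag: score} dict per character; transitions: {(prev, cur): score}."""
    best = {t: emissions[0][t] for t in tags}
    back = []
    for scores in emissions[1:]:
        prev_best, best, pointers = best, {}, {}
        for cur in tags:
            # Best previous tag for reaching `cur` at this character.
            prev = max(tags, key=lambda p: prev_best[p] + transitions[(p, cur)])
            best[cur] = prev_best[prev] + transitions[(prev, cur)] + scores[cur]
            pointers[cur] = prev
        back.append(pointers)
    last = max(tags, key=lambda t: best[t])
    path = [last]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]

tags = ["PER-1", "PER-2", "PER-3", "O"]
# Hypothetical transition scores: unlikely by default, likely along a person name.
transitions = {(p, c): -1.0 for p in tags for c in tags}
transitions.update({("PER-1", "PER-2"): 1.0, ("PER-2", "PER-3"): 1.0,
                    ("PER-3", "O"): 0.5, ("O", "O"): 0.5, ("O", "PER-1"): 0.0})
emissions = [
    {"PER-1": 7.0, "PER-2": 0.5, "PER-3": 0.5, "O": 0.0},  # 张 (Zhang)
    {"PER-1": 1.0, "PER-2": 1.5, "PER-3": 1.0, "O": 1.2},  # 小 (Xiao)
    {"PER-1": 0.5, "PER-2": 1.0, "PER-3": 1.5, "O": 1.4},  # 明 (Ming)
    {"PER-1": 0.0, "PER-2": 0.0, "PER-3": 0.0, "O": 3.0},  # 吃 (eats)
]
path = viterbi(emissions, transitions, tags)
```

With these illustrative scores, the decoded tags mark the first three characters as the first, second and third positions of a person name and the fourth character as not belonging to an entity.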
S101-4: Combine the characters associated with the same entity type to obtain the entity.
For example, the three characters "Zhang", "Xiao" and "Ming" mentioned above are combined to obtain "Zhang Xiaoming".
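The combination step can be sketched as follows. The tag format (an entity type, a hyphen, and the position within the entity, with `O` for characters outside any entity) is a hypothetical convention chosen for this illustration, as is the function name.

```python
def merge_entities(chars, tags):
    """Group consecutive characters that share an entity type; 'O' closes a group."""
    entities, current, current_type = [], "", None
    for ch, tag in zip(chars, tags):
        etype = None if tag == "O" else tag.split("-")[0]
        starts_new = tag.endswith("-1")  # position 1 always opens a new entity
        if etype is None or etype != current_type or starts_new:
            if current:
                entities.append((current_type, current))
            current, current_type = ("", None) if etype is None else (ch, etype)
        else:
            current += ch
    if current:
        entities.append((current_type, current))
    return entities

# "张小明吃早饭" is "Zhang Xiaoming eats breakfast".
entities = merge_entities("张小明吃早饭",
                          ["PER-1", "PER-2", "PER-3", "O", "O", "O"])
```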
S101-5: Acquire the entity relationships based on the entities.
For example, when the obtained entities include Xiao Hong and Xiao Wang, the entity relationship between Xiao Hong and Xiao Wang is obtained from the record information in combination with semantic analysis.
S101-6: Acquire the event information based on syntactic relations.
The syntactic relations are, for example, the subject-predicate relation, the verb-object relation, and the like. Because an event usually contains a predicate or verb, event information can be acquired by using the predicate or verb as a trigger word in combination with the syntactic relations.
On the basis of fig. 5, the embodiment of the present application further provides a possible implementation of S106-6. Referring to fig. 6, S106-6 includes:
S106-6-1: Use the record information as the input of the word segmentation model to obtain a word segmentation result.
The word segmentation result includes the words in the record information, such as verbs, nouns, adjectives, and the like.
S106-6-2: Construct a syntactic network graph from the word segmentation result based on the syntactic relations.
The syntactic network graph includes the words in the word segmentation result, the syntactic relations among the words, and an adjacency matrix, where the adjacency matrix indicates whether a syntactic relation exists between any two words.
Optionally, the syntactic network graph is G = (V, E, A), where V denotes the nodes of the syntactic network graph G (each word is a node), E denotes the relations between nodes in G, and A denotes the adjacency matrix of G. The entries of A may be 0 or 1, where 1 indicates that a syntactic relation exists between two nodes and 0 indicates that none exists.
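A minimal sketch of building such a graph in Python is given below. The input format (a list of words and a list of index pairs with a relation label) and the choice to make the adjacency matrix symmetric are assumptions for illustration; the function name and relation labels are hypothetical.

```python
def build_syntactic_graph(words, relations):
    """words: list of tokens; relations: list of (head_index, dependent_index, label)."""
    n = len(words)
    adjacency = [[0] * n for _ in range(n)]
    edges = {}
    for head, dep, label in relations:
        # The relation is treated as undirected here, so both entries are set to 1.
        adjacency[head][dep] = adjacency[dep][head] = 1
        edges[(head, dep)] = label
    return words, edges, adjacency

words = ["Zhang Xiaoming", "eats", "breakfast"]
relations = [(1, 0, "subject-predicate"), (1, 2, "verb-object")]
V, E, A = build_syntactic_graph(words, relations)
```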
S106-6-3: Embed the syntactic network graph based on the quantized representation of each word and the adjacency matrix to obtain a word feature table.
Specifically, the word feature table may be obtained according to the following equations.
Figure BDA0002498781200000131
Figure BDA0002498781200000132
where A is the adjacency matrix; [Figure BDA0002498781200000133] is the repair matrix; I is the identity matrix; D is the degree matrix of [Figure BDA0002498781200000134]; X is the quantization matrix; W1, W2, b1 and b2 are trainable parameters; and [Figure BDA0002498781200000135] is the word feature table. The quantization matrix consists of the quantized encodings of the words.
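The symbols listed above (an adjacency matrix, an identity matrix, a degree matrix, and two sets of trainable weights and biases) resemble those of a standard two-layer graph convolutional network. Since the equation images are not reproduced in this text, the Python sketch below assumes that standard form: the repair matrix is A + I, it is symmetrically normalized by its degree matrix, and the output is computed through two propagation layers with a ReLU in between. This is an illustration under that assumption, not the formula of the present application; all function names are hypothetical.

```python
import math

def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def add_bias(m, bias):
    """Add a bias vector to every row of a matrix."""
    return [[v + b for v, b in zip(row, bias)] for row in m]

def normalize_adjacency(adj):
    """Return D^(-1/2) (A + I) D^(-1/2), with D the degree matrix of A + I."""
    n = len(adj)
    a_tilde = [[adj[i][j] + (1 if i == j else 0) for j in range(n)] for i in range(n)]
    inv_sqrt_deg = [1.0 / math.sqrt(sum(row)) for row in a_tilde]
    return [[inv_sqrt_deg[i] * a_tilde[i][j] * inv_sqrt_deg[j] for j in range(n)]
            for i in range(n)]

def gcn_embed(adj, x, w1, b1, w2, b2):
    """Assumed two-layer form: A_hat * ReLU(A_hat * X * W1 + b1) * W2 + b2."""
    a_hat = normalize_adjacency(adj)
    hidden = add_bias(matmul(matmul(a_hat, x), w1), b1)
    hidden = [[max(0.0, v) for v in row] for row in hidden]  # ReLU
    return add_bias(matmul(matmul(a_hat, hidden), w2), b2)
```

Each row of the returned matrix would serve as the feature vector of one word. Adding the identity matrix gives every node a self-loop, so the degree of each node is at least 1 and the normalization never divides by zero.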
S106-6-4: Calculate the Euclidean distances between the verb and the other words based on the word feature table.
The verb and the other words are all words contained in the syntactic network graph.
S106-6-5: Determine whether the Euclidean distance between another word and the verb is smaller than a preset distance threshold. If yes, execute S106-6-7; if not, execute S106-6-6.
Specifically, if the Euclidean distance between a word and the verb is smaller than the preset distance threshold, an association exists between that word and the verb, and S106-6-7 is executed. Otherwise, S106-6-6 is executed.
S106-6-6: The other word is not associated with the verb.
S106-6-7: Take the other words as components of the event corresponding to the verb to obtain a candidate event.
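The distance test in S106-6-4 to S106-6-7 can be sketched in Python as follows. The feature vectors, the threshold value and the shape of the returned candidate event (a dictionary with a trigger and its components) are hypothetical and chosen only for illustration.

```python
import math

def candidate_event(features, words, verb_index, threshold):
    """Collect the words whose feature vector lies within `threshold` of the verb's."""
    verb_vec = features[verb_index]
    components = []
    for i, vec in enumerate(features):
        if i == verb_index:
            continue
        # Euclidean distance between the verb and the other word.
        if math.dist(verb_vec, vec) < threshold:
            components.append(words[i])
    return {"trigger": words[verb_index], "components": components}

words = ["eats", "breakfast", "yesterday"]
features = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0]]
event = candidate_event(features, words, verb_index=0, threshold=1.0)
```

In this example only "breakfast" lies within the threshold of the verb, so it alone becomes a component of the candidate event.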
S106-6-8: Use the candidate event as the input of the classifier model to output the final event information.
Specifically, the event type of the candidate event may be obtained according to the following equations, and the classifier model may be trained so that it outputs the final event information.
Figure BDA0002498781200000141
Figure BDA0002498781200000142
Figure BDA0002498781200000143
where C_i denotes a candidate event; N denotes the number of sentences in the sample; y_i denotes the true event category; [Figure BDA0002498781200000144] denotes the predicted event category; and n_p denotes the number of event categories.
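The symbols above (a true category, a predicted category, a number of samples and a number of categories) are typical of a softmax classifier trained with a cross-entropy loss. The equation images are not reproduced in this text, so the Python sketch below assumes that common setup for illustration only; it is not confirmed to be the formula of the present application, and the function names are hypothetical.

```python
import math

def softmax(logits):
    """Turn raw category scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def predict(logits):
    """Predicted event category: the index with the highest probability."""
    probs = softmax(logits)
    return max(range(len(probs)), key=lambda k: probs[k])

def cross_entropy(batch_logits, true_labels):
    """Mean negative log-probability of the true category over the N samples."""
    n = len(batch_logits)
    loss = 0.0
    for logits, label in zip(batch_logits, true_labels):
        loss -= math.log(softmax(logits)[label])
    return loss / n
```

Training would then adjust the classifier parameters to reduce this loss over the labeled samples.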
On the basis of fig. 2, the embodiment of the present application further provides a possible implementation of S104. Referring to fig. 7, S104 includes:
S104-1: When the result of the relationship consistency detection indicates that the event information corresponding to the entity is contradictory, reduce the confidence score.
S104-2: When the result of the behavior consistency detection indicates that the entity relationship corresponding to the entity is expressed inconsistently in the historical database and in the knowledge graph, reduce the confidence score.
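The score adjustment, together with the rule that record information is in doubt when its confidence score falls below a preset score threshold, can be sketched as follows. The initial score, the penalty amounts and the threshold value are hypothetical; the text only states that the score is reduced and then compared with a preset threshold.

```python
def adjust_confidence(score, events_contradict, relations_inconsistent,
                      event_penalty=0.2, relation_penalty=0.2):
    """Reduce the score once for each kind of inconsistency that was detected."""
    if events_contradict:
        score -= event_penalty
    if relations_inconsistent:
        score -= relation_penalty
    return score

def is_in_doubt(score, threshold=0.7):
    """The record information is in doubt when the score is below the threshold."""
    return score < threshold

score = adjust_confidence(1.0, events_contradict=True, relations_inconsistent=True)
in_doubt = is_in_doubt(score)
```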
Referring to fig. 8, fig. 8 is a schematic diagram of a device for judging the authenticity of record information according to an embodiment of the present application. Optionally, the device is applied to the electronic device described above.
The device for judging the authenticity of record information includes a processing unit 201 and a judging unit 202.
The processing unit 201 is configured to extract entities, entity relationships, and event information from the record information, where the entities include one or more of people, time, places, articles, and case types, the entity relationships represent relationship information between the entities, and the event information represents events corresponding to the entities. It is further configured to construct a knowledge graph according to the entities, the entity relationships, and the event information, where the knowledge graph includes the correspondence between the entities and the entity relationships and the correspondence between the entities and the event information. It is further configured to perform relationship consistency detection and behavior consistency detection on the knowledge graph based on the historical database, and to adjust the confidence score corresponding to the record information based on the results of the relationship consistency detection and the behavior consistency detection. Specifically, the processing unit 201 may execute S101 to S104 described above.
The judging unit 202 is configured to determine that the record information is in doubt when the confidence score is smaller than a preset score threshold. Specifically, the judging unit 202 may execute S105 to S107 described above.
The historical database includes record information that has been verified to be true. The behavior consistency detection is used to detect whether the event information corresponding to an entity is contradictory, and the relationship consistency detection is used to detect whether the entity relationship corresponding to an entity is expressed consistently in the historical database and in the knowledge graph.
It should be noted that the device for judging the authenticity of record information provided in this embodiment may execute the method flows shown in the above method embodiments to achieve the corresponding technical effects. For brevity, where this embodiment does not mention a detail, refer to the corresponding content in the above embodiments.
The embodiment of the present invention further provides a storage medium storing computer instructions and programs which, when read and run, perform the method for judging the authenticity of record information of the above embodiments. The storage medium may include memory, flash memory, registers, or a combination thereof.
The following provides an electronic device, which may be a computing device or another intelligent terminal device. As shown in fig. 1, the electronic device can implement the above method for judging the authenticity of record information. Specifically, the electronic device includes a processor 10, a memory 11, and a bus 12. The processor 10 may be a CPU. The memory 11 is used for storing one or more programs, and when the one or more programs are executed by the processor 10, the method for judging the authenticity of record information of the above embodiments is performed.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. The apparatus embodiments described above are merely illustrative, and for example, the flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of apparatus, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
In addition, functional modules in the embodiments of the present application may be integrated together to form an independent part, or each module may exist separately, or two or more modules may be integrated to form an independent part.
The functions, if implemented in the form of software functional modules and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present application or portions thereof that substantially contribute to the prior art may be embodied in the form of a software product stored in a storage medium and including instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present application. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.
The above description is only a preferred embodiment of the present application and is not intended to limit the present application, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, improvement and the like made within the spirit and principle of the present application shall be included in the protection scope of the present application.
It will be evident to those skilled in the art that the present application is not limited to the details of the foregoing illustrative embodiments, and that the present application may be embodied in other specific forms without departing from the spirit or essential attributes thereof. The present embodiments are therefore to be considered in all respects as illustrative and not restrictive, the scope of the application being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein. Any reference sign in a claim should not be construed as limiting the claim concerned.

Claims (10)

1. A method for judging the authenticity of record information, characterized by comprising the following steps:
extracting entities, entity relations and event information in the record information, wherein the entities comprise one or more of people, time, places, articles and case types, the entity relations represent relation information among the entities, and the event information represents events corresponding to the entities;
constructing a knowledge graph according to the entity, the entity relationship and the event information, wherein the knowledge graph comprises corresponding relationships between the entity and the entity relationship and between the entity and the event information;
performing relation consistency detection and behavior consistency detection on the knowledge graph based on a historical database;
adjusting a confidence score corresponding to the record information based on the results of the relationship consistency detection and the behavior consistency detection;
when the confidence score is smaller than a preset score threshold value, determining that the record information is in doubt;
the historical database comprises record information which is verified to be true, the behavior consistency detection is used for detecting whether contradiction exists between event information corresponding to the entity, and the relationship consistency detection is used for detecting whether expression of entity relationships corresponding to the entity in the historical database and the knowledge graph is consistent.
2. The method for judging the authenticity of record information according to claim 1, wherein said step of constructing a knowledge graph based on said entities, said entity relationships, and said event information comprises:
constructing the knowledge graph by taking the entities, the entity relations and the event information as different nodes;
and eliminating repeated nodes in the knowledge graph, wherein the repeated nodes are any two nodes with the same corresponding entities, or any two nodes with the same corresponding entity relationship, or any two nodes with the same corresponding event information.
3. The method for judging the authenticity of record information according to claim 2, wherein said step of eliminating duplicate nodes in said knowledge graph comprises:
calculating the similarity of any two nodes;
judging whether the similarity is larger than a preset threshold value or not;
if yes, the two nodes are repeated nodes, and one of the nodes is eliminated.
4. The method for judging the authenticity of record information according to claim 3, wherein the similarity between any two nodes is calculated by the following formula:
Figure FDA0002498781190000021
Figure FDA0002498781190000022
Figure FDA0002498781190000023
wherein sim(A, B) denotes the similarity between node A and node B; A_s denotes the similarity feature of node A; A_0 denotes the quantized feature of node A; m denotes the number of first-order neighbor nodes of node A in the knowledge graph; P_i denotes the quantized feature of the i-th neighbor node of node A, with i ≤ m; B_s denotes the similarity feature of node B; B_0 denotes the quantized feature of node B; n denotes the number of first-order neighbor nodes of node B in the knowledge graph; and P_k denotes the quantized feature of the k-th neighbor node of node B, with k ≤ n.
5. The method for judging the authenticity of record information according to claim 1, wherein the step of extracting the entities, the entity relationships and the event information in the record information comprises:
performing quantization encoding on the characters in the record information;
taking the quantization codes as the input of an entity extraction model to obtain annotation sequence scores of the corresponding characters;
outputting the entity type of each character according to the type transition probabilities between the character and the characters before and after it and the annotation sequence scores;
combining the words associated with the entity types together to obtain the entity;
obtaining the entity relationship based on the entity;
and acquiring the event information based on the syntactic relation.
6. The method for judging the authenticity of record information according to claim 5, wherein said step of acquiring said event information based on syntactic relations comprises:
taking the record information as the input of a word segmentation model to obtain a word segmentation result, wherein the word segmentation result comprises the words in the record information;
constructing a syntactic network graph from the word segmentation result based on the syntactic relations, wherein the syntactic network graph comprises the words in the word segmentation result, the syntactic relations among the words and an adjacency matrix, and the adjacency matrix indicates whether a syntactic relation exists between any two words;
embedding the syntactic network graph based on the quantized representation of each word and the adjacency matrix to obtain a word feature table;
calculating Euclidean distances between a verb and the other words based on the word feature table, wherein the verb and the other words are all words contained in the syntactic network graph;
if the Euclidean distance between another word and the verb is smaller than a preset distance threshold, taking the other word as a component of the event corresponding to the verb to obtain a candidate event;
and taking the candidate event as an input of a classifier model to output final event information.
7. The method for judging the authenticity of record information according to claim 1, wherein the step of adjusting the confidence score corresponding to the record information based on the results of the relationship consistency detection and the behavior consistency detection comprises:
when the result of the relationship consistency detection indicates that the event information corresponding to the entity is contradictory, the confidence score is reduced;
and when the behavior consistency detection result indicates that the expression of the entity relationship corresponding to the entity in the historical database and the knowledge graph is inconsistent, reducing the confidence score.
8. A device for judging the authenticity of record information, characterized by comprising:
the processing unit is used for extracting entities, entity relations and event information in the record information, wherein the entities comprise one or more of people, time, places, articles and case types, the entity relations represent relation information among the entities, and the event information represents events corresponding to the entities; the system is further used for constructing a knowledge graph according to the entity, the entity relationship and the event information, wherein the knowledge graph comprises corresponding relationships between the entity and the entity relationship and between the entity and the event information; the system is also used for carrying out relation consistency detection and behavior consistency detection on the knowledge graph based on a historical database; the system is further used for adjusting the confidence score corresponding to the record information based on the results of the relationship consistency detection and the behavior consistency detection;
the judging unit is used for determining that the record information is in doubt when the confidence score is smaller than a preset score threshold;
the historical database comprises record information which is verified to be true, the behavior consistency detection is used for detecting whether contradiction exists between event information corresponding to the entity, and the relationship consistency detection is used for detecting whether expression of entity relationships corresponding to the entity in the historical database and the knowledge graph is consistent.
9. A storage medium on which a computer program is stored which, when being executed by a processor, carries out the method according to any one of claims 1-7.
10. An electronic device, comprising: a processor and memory for storing one or more programs; the one or more programs, when executed by the processor, implement the method of any of claims 1-7.
CN202010426275.7A 2020-05-19 2020-05-19 Method and device for judging authenticity of record information, storage medium and electronic equipment Active CN111881288B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010426275.7A CN111881288B (en) 2020-05-19 2020-05-19 Method and device for judging authenticity of record information, storage medium and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010426275.7A CN111881288B (en) 2020-05-19 2020-05-19 Method and device for judging authenticity of record information, storage medium and electronic equipment

Publications (2)

Publication Number Publication Date
CN111881288A true CN111881288A (en) 2020-11-03
CN111881288B CN111881288B (en) 2024-04-09

Family

ID=73154345

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010426275.7A Active CN111881288B (en) 2020-05-19 2020-05-19 Method and device for judging authenticity of record information, storage medium and electronic equipment

Country Status (1)

Country Link
CN (1) CN111881288B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114637829A (en) * 2022-02-21 2022-06-17 阿里巴巴(中国)有限公司 Recording text processing method, recording text processing device and computer readable storage medium
CN117591660A (en) * 2024-01-18 2024-02-23 杭州威灿科技有限公司 Material generation method, equipment and medium based on digital person

Citations (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101520740A (en) * 2009-04-03 2009-09-02 北京航空航天大学 Method for realizing event consistency based on time mapping
WO2012130489A1 (en) * 2011-04-01 2012-10-04 Siemens Aktiengesellschaft Method, system, and computer program product for maintaining data consistency between two databases
CN108241727A (en) * 2017-09-01 2018-07-03 新华智云科技有限公司 News reliability evaluation method and equipment
CN109388648A (en) * 2018-08-15 2019-02-26 王小易 A method of extracting personal information and party in electronic record
CN109766445A (en) * 2018-12-13 2019-05-17 平安科技(深圳)有限公司 A kind of knowledge mapping construction method and data processing equipment
CN109785968A (en) * 2018-12-27 2019-05-21 东软集团股份有限公司 A kind of event prediction method, apparatus, equipment and program product
CN109902151A (en) * 2019-03-08 2019-06-18 南阳市烟草公司城区分公司 Recording method, device and the electronic equipment of interrogation record
US20190354544A1 (en) * 2011-02-22 2019-11-21 Refinitiv Us Organization Llc Machine learning-based relationship association and related discovery and search engines
CN110489569A (en) * 2019-08-26 2019-11-22 上海秒针网络科技有限公司 A kind of event-handling method and device of knowledge based map
CN110634088A (en) * 2018-06-25 2019-12-31 阿里巴巴集团控股有限公司 Case refereeing method, device and system
WO2020001373A1 (en) * 2018-06-26 2020-01-02 杭州海康威视数字技术股份有限公司 Method and apparatus for ontology construction
CN110825879A (en) * 2019-09-18 2020-02-21 平安科技(深圳)有限公司 Case decision result determination method, device and equipment and computer readable storage medium
CN110825880A (en) * 2019-09-18 2020-02-21 平安科技(深圳)有限公司 Case winning rate determining method, device, equipment and computer readable storage medium
CN110895568A (en) * 2018-09-13 2020-03-20 阿里巴巴集团控股有限公司 Method and system for processing court trial records
US20200117732A1 (en) * 2018-10-11 2020-04-16 International Business Machines Corporation Analysis and determination of relative consistency of identified relationships
CN111159428A (en) * 2019-12-30 2020-05-15 智慧神州(北京)科技有限公司 Method and device for automatically extracting event relation of knowledge graph in economic field

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
刘稳 等: "法院判决书关键信息抽取系统设计与实现", 《湖北工业大学学报》, vol. 33, no. 01, pages 63 - 67 *
李明 等: "勘验笔录证明力的认证规则探讨", 《证据科学》, vol. 26, no. 02, pages 151 - 160 *
董坤 等: "论行政笔录在刑事诉讼中的使用", 《苏州大学学报(哲学社会科学版)》, vol. 36, no. 04, pages 107 - 114 *
郭文利: "刑事司法印证式采纳言词笔录实践之反思", 《证据科学》, vol. 23, no. 06, pages 686 - 693 *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114637829A (en) * 2022-02-21 2022-06-17 阿里巴巴(中国)有限公司 Recording text processing method, recording text processing device and computer readable storage medium
CN117591660A (en) * 2024-01-18 2024-02-23 杭州威灿科技有限公司 Material generation method, equipment and medium based on digital person
CN117591660B (en) * 2024-01-18 2024-04-16 杭州威灿科技有限公司 Material generation method, equipment and medium based on digital person

Also Published As

Publication number Publication date
CN111881288B (en) 2024-04-09

Similar Documents

Publication Publication Date Title
WO2019184217A1 (en) Hotspot event classification method and apparatus, and storage medium
CN110232923B (en) Voice control instruction generation method and device and electronic equipment
US20130238611A1 (en) Automatically Mining Patterns for Rule Based Data Standardization Systems
WO2021208727A1 (en) Text error detection method and apparatus based on artificial intelligence, and computer device
CN112084381A (en) Event extraction method, system, storage medium and equipment
CN110096573B (en) Text parsing method and device
CN112686036B (en) Risk text recognition method and device, computer equipment and storage medium
CN110741376A (en) Automatic document analysis for different natural languages
CN114090794A (en) Event map construction method based on artificial intelligence and related equipment
CN111881288A (en) Method and device for judging authenticity of record information, storage medium and electronic equipment
CN111125295A (en) Method and system for obtaining food safety question answers based on LSTM
CN111782759B (en) Question-answering processing method and device and computer readable storage medium
CN112132238A (en) Method, device, equipment and readable medium for identifying private data
CN112836039A (en) Voice data processing method and device based on deep learning
CN111369148A (en) Object index monitoring method, electronic device and storage medium
CN111079433A (en) Event extraction method and device and electronic equipment
CN111813896A (en) Text triple relation identification method and device, training method and electronic equipment
CN112182448A (en) Page information processing method, device and equipment
CN116189215A (en) Automatic auditing method and device, electronic equipment and storage medium
CN112541357B (en) Entity identification method and device and intelligent equipment
CN115238092A (en) Entity relationship extraction method, device, equipment and storage medium
CN114548113A (en) Event-based reference resolution system, method, terminal and storage medium
CN114722832A (en) Abstract extraction method, device, equipment and storage medium
Hatzivassiloglou et al. A quantitative evaluation of linguistic tests for the automatic prediction of semantic markedness
CN111708870A (en) Deep neural network-based question answering method and device and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant