CN112733874B

CN112733874B - Suspicious vehicle discrimination method based on knowledge graph reasoning

Info

Publication number: CN112733874B
Application number: CN202011144715.6A
Authority: CN
Inventors: 俞山川; 谢耀华; 闫禹; 周欣; 周健; 王少飞; 涂耘; 陈晓利; 叶青; 陈晨
Original assignee: China Merchants Chongqing Communications Research and Design Institute Co Ltd
Current assignee: China Merchants Chongqing Communications Research and Design Institute Co Ltd
Priority date: 2020-10-23
Filing date: 2020-10-23
Publication date: 2023-04-07
Anticipated expiration: 2040-10-23
Also published as: CN112733874A

Abstract

The invention discloses a suspicious vehicle discrimination method based on knowledge graph reasoning, comprising the following steps. S1: acquiring suspicious vehicle knowledge, representing it as triples (h, r, t) following the knowledge graph construction approach, and establishing a suspicious vehicle knowledge graph, where h denotes a head entity, r denotes a relation, and t denotes a tail entity. S2: embedding the triples of the suspicious vehicle knowledge graph into a vector space in a fused manner to obtain triple representation vectors. S3: constructing and training a triple relation model, and determining the vector similarity of the relation r between the head entity h and the tail entity t according to a relation score function. S4: constructing and training a triple path model, and determining the reliability of the relation r between the head entity h and the tail entity t according to the path score function of the triple path model, where the path with the highest path score is the inference result.

Description

Suspicious vehicle discrimination method based on knowledge graph reasoning

Technical Field

The invention relates to a suspicious vehicle distinguishing method based on knowledge graph reasoning.

Background

With the establishment of intelligent management systems and platforms by management departments such as expressway authorities, managers' capabilities in automatic vehicle feature identification and automatic recording of historical behavior have improved markedly, making it possible to build electronic files for large numbers of vehicles. However, the associations between vehicles, and between vehicles and potential traffic accidents, have not been mined in depth, so it is difficult for management departments to "capture" suspicious vehicles and manage them at a fine-grained level.

Disclosure of Invention

The invention aims to provide a suspicious vehicle distinguishing method based on knowledge graph reasoning, which aims to solve the problem that the suspicious vehicle is difficult to effectively identify at present.

In order to solve the technical problem, the invention provides a suspicious vehicle discrimination method based on knowledge graph reasoning, which comprises the following steps:

S1: acquiring suspicious vehicle knowledge, representing it as triples (h, r, t) following the knowledge graph construction approach, and establishing a suspicious vehicle knowledge graph, where h denotes a head entity, r denotes a relation, and t denotes a tail entity;

S2: embedding the triples of the suspicious vehicle knowledge graph into a vector space in a fused manner to obtain triple representation vectors;

S3: constructing and training a triple relation model, and determining the vector similarity of the relation r between the head entity h and the tail entity t according to a relation score function;

S4: constructing and training a triple path model, and determining the reliability of the relation r between the head entity h and the tail entity t according to the path score function of the triple path model, where the path with the highest path score is the inference result.

Further, constructing the triplet relation model specifically includes:

establishing a one-to-one, one-to-many, many-to-one, or many-to-many relationship between the entity h and the entity t through mappings between the entity space and the relation space.

Further, the vector similarity of the relation r between the entity h and the tail entity t is determined according to a relation score function $f_r(h, t)$, which is calculated using formula (1):

where $w_h$, $w_t$, and $w_r$ are mapping functions between the entity and relation representations, and $I$ is an identity matrix. Each vector satisfies the following constraints:

$$\|h\|_2 \le 1,\quad \|r\|_2 \le 1,\quad \|t\|_2 \le 1 \tag{2}$$

Further, the training objective of the triplet relation model is to minimize a loss function, which is expressed as follows:

s.t.

Here the set of triples supplies the positive samples, and the negative samples of (h, r, t) are obtained by randomly replacing the head entity h or the tail entity t during training. The label is $y_{hrt} = 1$ if (h, r, t) belongs to the set of triples, and $y_{hrt} = -1$ if it is a negative sample.

Further, constructing the triplet path model specifically includes:

generating path representations from the relations along the paths between the entities h and t, and establishing a series of paths according to the feature value function $s_{h,p}(e)$ of each path, where the set of paths is expressed as:

$$p(h, t) = \{\dots, p_i(h, t), \dots\}$$

where $p_i(h, t) = (h, r_1, e_1, r_2, e_2, \dots, e_{k-1}, r_k, t)$ and $k$ is the path length.
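The path set can be illustrated with a small enumeration. This is a minimal sketch, not the patented procedure: the function name `enumerate_paths`, the `max_len` bound, and the rule that a path never revisits an entity are assumptions made for illustration, and the example facts are borrowed from the inference example given in the detailed description.

```python
def enumerate_paths(triples, h, t, max_len):
    """Return every path (h, r1, e1, ..., rk, t) with at most max_len relations.

    Assumption: an entity is not revisited within a path, so the search ends.
    """
    outgoing = {}
    for head, rel, tail in triples:
        outgoing.setdefault(head, []).append((rel, tail))

    paths = []

    def walk(entity, path, visited):
        # A path is complete once it reaches t after at least one relation.
        if entity == t and len(path) > 1:
            paths.append(tuple(path))
            return
        # path holds 1 + 2k items for k relations; stop at the length bound.
        if (len(path) - 1) // 2 >= max_len:
            return
        for rel, nxt in outgoing.get(entity, []):
            if nxt not in visited:
                walk(nxt, path + [rel, nxt], visited | {nxt})

    walk(h, [h], {h})
    return paths


# Example facts taken from the description's inference example.
EXAMPLE = [
    ("Li X", "drives", "Yu A00000"),
    ("Yu A00000", "affiliated with", "Chongqing Shipping Co., Ltd."),
]

paths = enumerate_paths(EXAMPLE, "Li X", "Chongqing Shipping Co., Ltd.", max_len=2)
```

Each returned tuple alternates entities and relations, so a tuple with 1 + 2k items has path length k.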

Further, a path score function $f_p(h, t)$ is used to determine the reliability of the relation r between the entity h and the tail entity t; the path score function $f_p(h, t)$ is calculated by formula (5):

The probability of each sample is:

Further, the training objective of the triplet path model is to minimize a loss function, which is expressed as follows:

The beneficial effects of the invention are as follows: based on knowledge graph reasoning, the invention judges whether a vehicle traveling in a given area is suspicious and, on that basis, issues early warnings to the relevant management departments, such as highway authorities, thereby improving their ability to capture suspicious vehicles and handle emergencies involving them.

Detailed Description

A suspicious vehicle discrimination method based on knowledge graph reasoning proceeds as follows.

S1: acquiring suspicious vehicle knowledge, representing it as triples (h, r, t) following the knowledge graph construction approach, and establishing a suspicious vehicle knowledge graph, where h denotes a head entity, r denotes a relation, and t denotes a tail entity;

The suspicious vehicle knowledge covers vehicles in an area (e.g., a section of highway) that meet one of the following four conditions: (1) the vehicle type, license plate, and driver characteristics do not match the historical records in the database; (2) the vehicle has appeared at historical accident scenes; (3) the vehicle has repeatedly committed traffic violations such as speeding and overloading, or engaged in dangerous driving behaviors such as unsafe lane changes; (4) the vehicle frequently appears together with a vehicle already determined to be suspicious, i.e., it is a companion vehicle of the suspicious vehicle.

The entities form an entity set and the relations form a relation set. A triple (h, r, t) represents an embedded relation between a pair of entities, where the head entity h and the tail entity t belong to the entity set and the relation r belongs to the relation set. All triples together form the triple set. In this way, each entity and relation in the knowledge graph is represented as a vector. Examples are as follows:

If r is the "vehicle type is" relation, (h, r, t) can be expressed as (Yu A00000, vehicle type is, car); if r is the "driver is" relation, (h, r, t) can be expressed as (Yu A00000, driver is, driver ID 1); if r is the "body feature is" relation, (h, r, t) can be expressed as (driver ID 1, body feature is, thin); if r is "occurred at the accident scene", (h, r, t) can be expressed as (Yu A00000, occurred at the accident scene, accident ID 1); if r is "multiple traffic violations", (h, r, t) can be expressed as (Yu A00000, multiple traffic violations, 0 or 1); if r is "present at location", (h, r, t) can be expressed as (Yu A00000, present at location, K1+120); if r is "present at time", (h, r, t) can be expressed as (Yu A00000, present at time, YYYY-MM-DD hh:mm:ss); and so on.
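The example triples can be held in a simple in-memory structure. The following is a minimal sketch: the names `Triple`, `build_graph`, and `entities_and_relations` are hypothetical, and the facts are the placeholder examples from the text, not real records.

```python
from collections import defaultdict
from typing import NamedTuple


class Triple(NamedTuple):
    head: str
    relation: str
    tail: str


# Example facts from the text; "driver ID 1" and similar values are the
# document's own placeholders.
TRIPLES = [
    Triple("Yu A00000", "vehicle type is", "car"),
    Triple("Yu A00000", "driver is", "driver ID 1"),
    Triple("driver ID 1", "body feature is", "thin"),
    Triple("Yu A00000", "occurred at the accident scene", "accident ID 1"),
    Triple("Yu A00000", "present at location", "K1+120"),
]


def build_graph(triples):
    """Index triples by head entity: head -> relation -> set of tails."""
    graph = defaultdict(lambda: defaultdict(set))
    for h, r, t in triples:
        graph[h][r].add(t)
    return graph


def entities_and_relations(triples):
    """Return the entity set and the relation set implied by the triples."""
    entities = {x for h, _, t in triples for x in (h, t)}
    relations = {r for _, r, _ in triples}
    return entities, relations


graph = build_graph(TRIPLES)
entities, relations = entities_and_relations(TRIPLES)
```

Indexing by head entity keeps lookups such as "who drives Yu A00000" cheap, which is what the later path-based reasoning relies on.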

1. Triple relation model construction

The ranD model establishes one-to-one, one-to-many, many-to-one, or many-to-many relationships between h and t through mappings between the entity space and the relation space. The vector similarity for the relation r is measured by a score function. The score function $f_r(h, t)$ is calculated by formula (1):

where $w_h$, $w_t$, and $w_r$ are mapping functions between the entity and relation representations, and $I$ is an identity matrix. Each vector satisfies the following constraints:

$$\|h\|_2 \le 1,\quad \|r\|_2 \le 1,\quad \|t\|_2 \le 1 \tag{2}$$
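Formula (1) is not reproduced in this text, so the following is only a sketch under a stated assumption: that the score has a TransD-like form, in which each entity is projected by a matrix built from two mapping vectors plus the identity matrix, and the score is the squared distance between the projected head plus the relation and the projected tail. The function names, the use of equal entity and relation dimensions, and the sign convention (lower means closer) are all assumptions.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def project(entity, w_entity, w_relation):
    """Apply (w_relation * w_entity^T + I) to entity, assuming equal dimensions."""
    scale = dot(w_entity, entity)
    return [scale * wr + e for wr, e in zip(w_relation, entity)]


def clip_norm(vec, max_norm=1.0):
    """Rescale vec so its L2 norm does not exceed max_norm, as in constraint (2)."""
    norm = math.sqrt(dot(vec, vec))
    return list(vec) if norm <= max_norm else [x * max_norm / norm for x in vec]


def relation_score(h, r, t, w_h, w_r, w_t):
    """Squared L2 distance between projected h + r and projected t (assumed form)."""
    h_proj = project(h, w_h, w_r)
    t_proj = project(t, w_t, w_r)
    return sum((hp + ri - tp) ** 2 for hp, ri, tp in zip(h_proj, r, t_proj))


# With zero mapping vectors the projection is the identity, so a triple
# satisfying h + r = t gets the best possible score under this convention.
zero = [0.0, 0.0]
best = relation_score([0.6, 0.0], [0.0, 0.3], [0.6, 0.3], zero, zero, zero)
```

Because the projection depends on both the entity and the relation, the same entity can land in different places for different relations, which is what lets this family of models represent one-to-many and many-to-many relations.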

2. Triple relation model training

The training goal of the triplet relation model is to minimize the loss function, which is expressed as follows:

s.t.

Here the set of triples supplies the positive samples, and the negative samples of (h, r, t) are obtained by randomly replacing the head entity h or the tail entity t during training. The label is $y_{hrt} = 1$ if (h, r, t) belongs to the set of triples, and $y_{hrt} = -1$ if it is a negative sample.
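The negative sampling step described here can be sketched as follows. The function names, the fixed seed, the 50/50 choice between corrupting the head and the tail, and the rule that a corrupted triple must not coincide with a known triple are assumptions; the text only states that the head or tail is replaced at random.

```python
import random


def corrupt(triple, entities, known, rng):
    """Replace the head or the tail with a random entity, avoiding known triples."""
    h, r, t = triple
    candidates = sorted(entities)
    while True:
        e = rng.choice(candidates)
        negative = (e, r, t) if rng.random() < 0.5 else (h, r, e)
        if negative not in known:
            return negative


def labelled_samples(triples, seed=0):
    """Return (triple, y) pairs: y = 1 for observed triples, y = -1 for negatives."""
    rng = random.Random(seed)
    known = set(triples)
    entities = {x for h, _, t in triples for x in (h, t)}
    samples = []
    for triple in triples:
        samples.append((triple, 1))
        samples.append((corrupt(triple, entities, known, rng), -1))
    return samples


# Placeholder facts from the text.
FACTS = [
    ("Yu A00000", "driver is", "driver ID 1"),
    ("driver ID 1", "body feature is", "thin"),
]

samples = labelled_samples(FACTS)
```

Each observed triple is paired with one negative here; drawing several negatives per positive is a common variation and would only change the loop in `labelled_samples`.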

For example, if the suspicious vehicle feature knowledge graph contains the relations (Li X, drives, Yu A00000) and (Yu A00000, affiliated with, Chongqing Shipping Co., Ltd.), then the missing relation (Li X, works at, Chongqing Shipping Co., Ltd.) can be inferred.

Based on a random walk model, the invention computes the feature value function $s_{h,P}(t)$ of each path, thereby establishing a series of paths. A path $P$ consists of a sequence of relations $r_1, \dots, r_l, \dots, r_n$:

where $T_{n-1}$ is the domain of the relation $r_n$ and at the same time the range of the relation $r_{n-1}$, i.e., $T_{n-1} = \mathrm{dom}(r_n) = \mathrm{ran}(r_{n-1})$. The domain and range of a relation are entity types, with $T_0 = \{h\}$ and $T_n = \{t\}$. The feature value function $s_{h,P}(t)$ is the probability that the tail entity t is reached when starting from the head entity h and walking along path P. When the path reaches any intermediate entity e, $s_{h,P}(e)$ is updated as follows:

where, at the initial stage of the random walk, $s_{h,P}(e) = 1$ if $e \in P$ and $s_{h,P}(e) = 0$ otherwise; $I(r_l(e', e))$ is an indicator function, with $I(r_l(e', e)) = 1$ if $r_l(e', e)$ exists and $I(r_l(e', e)) = 0$ otherwise.
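The update formula itself is not reproduced in this text. The sketch below assumes the usual random-walk recursion for path features: the probability mass at an entity e' is split evenly among all entities e for which $r_l(e', e)$ exists. The function name `path_feature` and the second vehicle "Yu B11111" with its employer are hypothetical additions for illustration.

```python
from collections import defaultdict


def path_feature(triples, h, relation_path):
    """Return {entity: s_{h,P}(entity)} after walking relation_path from h.

    Assumption: at each step the mass at e' is divided uniformly over the
    entities e for which r_l(e', e) exists (indicator equal to 1).
    """
    successors = defaultdict(list)
    for head, rel, tail in triples:
        successors[(head, rel)].append(tail)

    dist = {h: 1.0}  # initial stage: all mass sits on the head entity
    for rel in relation_path:
        nxt = defaultdict(float)
        for e_prev, mass in dist.items():
            targets = successors.get((e_prev, rel), [])
            for e in targets:
                nxt[e] += mass / len(targets)
        dist = dict(nxt)
    return dist


# "Yu B11111" and "Other Co." are hypothetical; the rest follows the text.
FACTS = [
    ("Li X", "drives", "Yu A00000"),
    ("Li X", "drives", "Yu B11111"),
    ("Yu A00000", "affiliated with", "Chongqing Shipping Co., Ltd."),
    ("Yu B11111", "affiliated with", "Other Co."),
]

dist = path_feature(FACTS, "Li X", ["drives", "affiliated with"])
```

Under these assumptions the walk from Li X splits evenly between the two vehicles, so each company is reached with probability 0.5 along the path (drives, affiliated with).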

For the relation r, a series of path features $P_r = \{P_1, \dots, P_n\}$ is obtained through the random walk algorithm. A score function is then built for each training sample (i.e., a combination of head and tail entities) under relation r:

The probability for each sample is:

The minimization loss function is:

$$\min\; -\sum_k w_k \bigl( y_k \log P_k + (1 - y_k) \log(1 - P_k) \bigr) \tag{7}$$

where $y_k$ indicates whether the training sample $(h_k, t_k)$ has the relation r: $y_k = 1$ if the triple $(h_k, r, t_k)$ exists, and $y_k = 0$ otherwise.
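The score and probability formulas are not reproduced in this text. The sketch below assumes a linear score over the path features with one weight per path (the hypothetical parameter `theta`), a logistic function for the probability, and a weighted negative log-likelihood as the quantity being minimized. All function names and numeric values are illustrative.

```python
import math


def path_score(theta, features):
    """Weighted sum of path features: sum over paths of theta_P * s_{h,P}(t)."""
    return sum(w * s for w, s in zip(theta, features))


def probability(score):
    """Logistic function mapping a score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-score))


def weighted_log_loss(samples, theta):
    """Negative weighted log-likelihood over (features, y_k, w_k) samples."""
    total = 0.0
    for features, y, weight in samples:
        p = probability(path_score(theta, features))
        total -= weight * (y * math.log(p) + (1 - y) * math.log(1 - p))
    return total


# Two illustrative samples: path features, label y_k, sample weight w_k.
SAMPLES = [
    ([0.5, 1.0], 1, 1.0),  # relation r holds for this (h_k, t_k) pair
    ([0.0, 0.1], 0, 1.0),  # relation r does not hold
]

loss_untrained = weighted_log_loss(SAMPLES, theta=[0.0, 0.0])
loss_better = weighted_log_loss(SAMPLES, theta=[2.0, 2.0])
```

With all weights at zero every sample gets probability 0.5; weights that raise the score of the positive sample more than that of the negative sample lower the loss, which is the direction training would push `theta`.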

The method models complex relations such as 1-to-N, N-to-1, and N-to-N well, is simpler and more computationally efficient than other methods of the same type, and is suited to building knowledge graphs from the massive suspicious vehicle data held on the platforms of management departments such as highway authorities. Model training is based on the open-world assumption, which gives better results on incomplete knowledge graphs, including during fine-tuning.

Finally, the above embodiments only illustrate the technical solutions of the present invention and do not limit them. Although the invention has been described in detail with reference to the preferred embodiments, those skilled in the art should understand that modifications or equivalent substitutions may be made to the technical solutions without departing from their spirit and scope, and all such changes are covered by the claims of the present invention.

Claims

1. A suspicious vehicle discrimination method based on knowledge graph reasoning is characterized by comprising the following steps:

S4: constructing and training a triple path model, and determining the reliability of the relation r between the entity h and the tail entity t according to the path score function of the triple path model, wherein the path with the highest path score is the inference result; step S4 specifically comprises:

calculating the feature value function $s_{h,P}(t)$ of each path based on a random walk model, thereby establishing a series of paths; a path $P$ consists of a sequence of relations $r_1, \dots, r_l, \dots, r_n$:

wherein $T_{n-1}$ is the domain of the relation $r_n$ and at the same time the range of the relation $r_{n-1}$, i.e., $T_{n-1} = \mathrm{dom}(r_n) = \mathrm{ran}(r_{n-1})$; the domain and range of a relation are entity types, with $T_0 = \{h\}$ and $T_n = \{t\}$; the feature value function $s_{h,P}(t)$ is the probability that the tail entity t is reached when starting from the head entity h and walking along path P; when the path reaches any intermediate entity e, $s_{h,P}(e)$ is updated as follows:

wherein, at the initial stage of the random walk, $s_{h,P}(e) = 1$ if $e \in P$ and $s_{h,P}(e) = 0$ otherwise; $I(r_l(e', e))$ is an indicator function, with $I(r_l(e', e)) = 1$ if $r_l(e', e)$ exists and $I(r_l(e', e)) = 0$ otherwise;

for the relation r, a series of path features $P_r = \{P_1, \dots, P_n\}$ is obtained through the random walk algorithm, and a score function is built for each training sample under relation r:

the probability for each sample is:

the training target of the triplet path model is a minimization loss function, which is expressed as follows:

2. The suspicious vehicle discrimination method based on knowledge graph reasoning according to claim 1, wherein constructing the triple relation model specifically comprises:

3. The suspicious vehicle discrimination method based on knowledge graph reasoning according to claim 2, wherein the vector similarity of the relation r between the entity h and the tail entity t is determined according to a relation score function $f_r(h, t)$, which is calculated using formula (1):

wherein $w_h$, $w_t$, and $w_r$ are mapping functions between the entity and relation representations; $I$ is an identity matrix; each vector satisfies the following constraints:

$$\|h\|_2 \le 1,\quad \|r\|_2 \le 1,\quad \|t\|_2 \le 1 \tag{2}$$

4. The method of claim 3, wherein the training objective of the triplet relation model is to minimize a loss function, which is expressed as follows:

s.t.

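wherein the set of triples supplies the positive samples and the negative samples of (h, r, t) are obtained by randomly replacing the head entity h or the tail entity t during training; $y_{hrt} = 1$ if (h, r, t) belongs to the set of triples, and $y_{hrt} = -1$ if it is a negative sample.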