CN113919862A

CN113919862A - Method for identifying marketing-arbitrage black-market activity based on a dynamic graph attention network

Info

Publication number: CN113919862A
Application number: CN202111040219.0A
Authority: CN
Inventors: 傅剑文; 陈心童; 章建森; 韩弘炀; 周文彬
Original assignee: Tianyi Electronic Commerce Co Ltd
Current assignee: Tianyi Electronic Commerce Co Ltd
Priority date: 2021-09-06
Filing date: 2021-09-06
Publication date: 2022-01-11
Also published as: JP2023543128A; WO2023029324A1

Abstract

The invention discloses a method for identifying marketing-arbitrage black-market activity based on a dynamic graph attention network. The invention has the following advantages: 1) it defines a novel dynamic graph network structure consisting of user nodes, merchant nodes, multidimensional relations and a plurality of time points; 2) it designs a novel multi-edge attention unit that unifies user nodes, merchant nodes and multidimensional relations more effectively and so improves information aggregation at the spatial level of the nodes; 3) for the temporal dynamics of the graph network, it designs a mechanism that passes attention information through a plurality of GRU units, so that the evolution of the graph network along the time axis is carried forward; 4) it combines the multi-edge attention unit with the multi-path GRU unit into a novel processing unit, and a dynamic graph attention network built from this unit both passes information along the time dimension and aggregates multi-layer neighborhood information in the spatial dimension.

Description

Method for identifying marketing-arbitrage black-market activity based on a dynamic graph attention network

Technical Field

The invention relates to the fields of emerging information technology and artificial intelligence, and in particular to a method for identifying marketing-arbitrage black-market activity based on a dynamic graph attention network.

Background

Marketing arbitrage is a new form of internet fraud. Black-market arbitrageurs use various illicit means to harvest, in bulk, the cash vouchers, discounts, spend-threshold reductions, instant rebates, instant reductions and other marketing benefits issued by e-commerce or payment platforms. This seriously harms the interests of ordinary consumers, and the marketing budget spent by the platform fails to deliver its intended value. With the rapid development of the internet industry, marketing-arbitrage black-market activity is increasingly organized in gangs, and its methods are becoming more concealed and more complex.

The traditional solution sets risk strategies in a risk control system based on expert experience and captures the arbitrageurs who hit those strategies. As black-market methods become more gang-based, complex and concealed, this approach is limited by expert experience and cannot comprehensively identify arbitrage groups whose characteristics are hidden or novel.

It has also been proposed to construct a graph network, define risk operators on it, and judge the arbitrage risk of each node by computing those operators. Such methods usually have the following disadvantages: 1) in essence they express expert-experience strategies on the graph network in a more complex form; they are limited by the operators themselves and are not machine learning algorithms; 2) they usually use only the topological information of the graph network and do not use the feature information carried by the network nodes; 3) they typically use only low-order neighborhood information and do not fully exploit the latent information contained in higher-order neighborhoods.

With advances in technology, graph neural networks have been proposed as a better solution for this scenario. A graph neural network is an algorithm that learns from historical samples, which avoids the limitation of expert experience. It also incorporates node feature information and can aggregate information from multi-layer neighborhoods, so potential arbitrage-risk nodes can be identified more comprehensively.

However, most current methods based on graph neural networks use static graphs rather than dynamic graphs; that is, they use only the graph network structure at a single time point and do not consider how the graph network evolves along the time axis. A static-graph method considers only the information in the spatial structure of the graph network and ignores the information carried along the time axis.

Disclosure of Invention

The technical problem to be solved by the invention is to overcome the shortcomings of the prior art by providing a method for identifying marketing-arbitrage black-market activity based on a dynamic graph attention network, which can improve the coverage and accuracy with which a risk control system identifies black-market arbitrage.

To solve the above technical problem, the invention provides the following technical solution:

The invention provides a method for identifying marketing-arbitrage black-market activity based on a dynamic graph attention network, comprising the following steps:

First, a dynamic graph network structure is defined, composed of two types of nodes, four types of relationship edges and T time points:

1) The two types of nodes: the graph network contains user nodes V_c and merchant nodes V_b.

2) The four types of relationship edges are:

A) User-user device relationship edge E_d:

A device relationship edge is generated between two user nodes when the access logs show that the devices used by the two users shared the same IP address within a given past period;

B) User-user payment relationship edge E_p:

A payment relationship edge is generated between two user nodes when a transfer payment took place between the two users in the service system within a given past period;

C) User-user social relationship edge E_s:

A social relationship edge is generated between two user nodes when, within a given past period, one user shared marketing activity information with the other in the service system and successfully invited that user to take part in the marketing activity;

D) User-merchant transaction relationship edge E_t:

A transaction relationship edge is generated between a user node and a merchant node when the user made a transaction payment at the merchant in the service system within a given past period;

3) The T time points are defined as follows:

Starting from an initial time T0 and using ω as the interval, T consecutive time points are selected on the time axis. At each time point, a relationship graph network consisting of the two types of nodes and the four types of relationship edges is generated, so that the sequence reflects the evolution of the graph network over the T time points;

The T network graphs, one per time point, are generated in the graph database.
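As an illustration of the snapshot construction described above, the following minimal sketch builds the device relationship edges E_d for T time points spaced ω apart. The log layout `(user, ip, timestamp)`, the `lookback` window length and the function names are assumptions made for this example; the patent only says "a certain period of time in the past" and does not specify a data format.

```python
from collections import defaultdict
from itertools import combinations


def device_edges(access_log, window_start, window_end):
    """Return user-user device edges: pairs of users whose devices were
    seen on the same IP within [window_start, window_end)."""
    users_by_ip = defaultdict(set)
    for user, ip, ts in access_log:
        if window_start <= ts < window_end:
            users_by_ip[ip].add(user)
    edges = set()
    for users in users_by_ip.values():
        # Every pair of distinct users sharing an IP gets one undirected edge.
        for u, v in combinations(sorted(users), 2):
            edges.add((u, v))
    return edges


def build_snapshots(access_log, t0, omega, T, lookback):
    """Build T snapshots at t0, t0 + omega, ...; each snapshot uses the
    `lookback` time units before its time point (hypothetical parameter)."""
    snapshots = []
    for k in range(T):
        t = t0 + k * omega
        snapshots.append({"time": t, "E_d": device_edges(access_log, t - lookback, t)})
    return snapshots


# Toy access log: (user, ip, timestamp)
log = [
    ("u1", "10.0.0.1", 5),
    ("u2", "10.0.0.1", 8),
    ("u3", "10.0.0.2", 9),
    ("u3", "10.0.0.1", 14),
]
snaps = build_snapshots(log, t0=10, omega=5, T=2, lookback=10)
```

In this toy log, the first snapshot links only u1 and u2, while the second also links u3 once it appears on the shared IP. The payment, social and transaction edges could be produced the same way from their respective records.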

Second, the user set V_c at each time point is imported from the database module into the distributed database module, and an m-dimensional feature vector representing each user's attributes is generated for each time point. The feature fields can be built from common natural-person attribute features, and statistical features derived from expert experience (for example, the number of times a user accessed a given activity within a given past period) can be added to obtain a better recognition effect. Thus, for each time point, an attribute feature matrix is generated for the user set V_c and stored in the distributed database module as a table; the subscripts n₀, n₁, ... denote the number of users in the set at each time point;

Similarly, the merchant set V_b at each time point is imported from the database module into the distributed database module, and a k-dimensional feature vector representing each merchant's attributes is generated for each time point. Thus, for each time point, an attribute feature matrix is generated for the merchant set V_b and stored in the distributed database module as a table; the subscripts l₀, l₁, ... denote the number of merchants in the set at each time point;

Third, based on historical positive and negative samples of arbitrage users, the user set V_c at each time point is labeled, and the labels are stored in the distributed database module as a table. After labeling, the user set V_c consists of two kinds of samples, labeled (V_labeled) and unlabeled (V_unlabeled); the labeled samples are further divided into known arbitrage user samples (V_labeled = 1) and known non-arbitrage user samples (V_labeled = 0). All merchant nodes on the graph network are set as unlabeled samples;

Fourth, a novel graph attention mechanism is designed, characterized in that: 1) independent mapping matrices are defined for user nodes and merchant nodes, and the feature vectors of the two different dimensionalities are mapped into the same dimensionality through these matrices, which unifies user nodes and merchant nodes during information aggregation; 2) based on the four relationship edges defined on the graph, a multi-edge attention mechanism is defined, that is, an independent set of attention learning parameters is introduced for each relationship edge, which unifies the multiple relationship edges during information aggregation;

The specific operation is as follows:

1) First, 4 sets of independent attention learning parameters are defined according to the four relationship edges, and the learning parameters are initialized:

A) User-user device relationship edge E_d: define a shared linear mapping matrix W_d and a shared attention vector a_d ∈ R^{2f};

B) User-user payment relationship edge E_p: define a shared linear mapping matrix W_p and a shared attention vector a_p ∈ R^{2f};

C) User-user social relationship edge E_s: define a shared linear mapping matrix W_s and a shared attention vector a_s ∈ R^{2f};

D) User-merchant transaction relationship edge E_t: define independent shared linear mapping matrices W_tc and W_tb for user nodes and merchant nodes respectively, which map the user feature vector space and the merchant feature vector space from their respective dimensions into a unified f-dimensional vector space; define a shared attention vector a_t ∈ R^{2f};

2) The attention coefficient of each edge in the graph is calculated according to the following formulas:

a) Attention coefficient on E_d relationship edges:

b) Attention coefficient on E_p relationship edges:

c) Attention coefficient on E_s relationship edges:

d) Attention coefficient on E_t relationship edges:

In the above expressions, the symbol || denotes vector concatenation, and the subscripts i and j denote two nodes that are adjacent via a given relationship edge in the graph. h_i and h_j denote the feature vectors of the corresponding nodes, taken from the previously generated user feature matrix; when calculating the E_t attention coefficient, one of h_i and h_j is taken from the user feature matrix and the other from the merchant feature matrix;

3) Finally, information aggregation is computed by the following formula:

In the above formula, N_d(i), N_p(i), N_s(i) and N_t(i) denote the first-order neighbors connected to node i through device, payment, social and transaction relationship edges, respectively;

The series of processes in this fourth step is called the multi-edge attention unit (MultiEdgeGAT);
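The formula images for the attention coefficients and the aggregation are not reproduced in this text, so the following is only a sketch of what a multi-edge attention pass could look like. It assumes the standard graph-attention form: a LeakyReLU score on the concatenated projections, softmax normalization within each relation type, and a plain sum across the four relations with no self-loop or output activation. Those choices, the dictionary layout and all function names are assumptions, not the patent's confirmed formulas.

```python
import math


def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def leaky_relu(z, slope=0.2):
    return z if z > 0 else slope * z


def project(node, rel, W):
    """Map a node's raw features into the shared f-dimensional space."""
    kind, feats = node
    if rel == "t":  # transaction edges connect users and merchants
        return matvec(W["tc"] if kind == "user" else W["tb"], feats)
    return matvec(W[rel], feats)


def multi_edge_gat(nodes, neighbors, W, a):
    """One spatial aggregation pass over the four relation types.

    nodes: {id: (kind, features)}; neighbors: {rel: {i: [j, ...]}};
    W: mapping matrices keyed 'd', 'p', 's', 'tc', 'tb';
    a: attention vectors of length 2f keyed 'd', 'p', 's', 't'.
    """
    f = len(a["d"]) // 2
    out = {}
    for i, node_i in nodes.items():
        agg = [0.0] * f
        for rel in ("d", "p", "s", "t"):
            nbrs = neighbors.get(rel, {}).get(i, [])
            if not nbrs:
                continue
            z_i = project(node_i, rel, W)
            z = [project(nodes[j], rel, W) for j in nbrs]
            # Score each neighbor from the concatenation [z_i || z_j].
            scores = [leaky_relu(sum(ak * v for ak, v in zip(a[rel], z_i + z_j)))
                      for z_j in z]
            top = max(scores)
            exps = [math.exp(s - top) for s in scores]  # stable softmax
            total = sum(exps)
            for e, z_j in zip(exps, z):
                for k in range(f):
                    agg[k] += (e / total) * z_j[k]
        out[i] = agg
    return out


# Toy graph: two users (3 features) and one merchant (2 features), f = 2.
nodes = {
    "u1": ("user", [1.0, 0.0, 1.0]),
    "u2": ("user", [0.0, 1.0, 0.0]),
    "b1": ("merchant", [1.0, 2.0]),
}
user_map = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
W = {"d": user_map, "p": user_map, "s": user_map,
     "tc": user_map, "tb": [[1.0, 0.0], [0.0, 1.0]]}
a = {rel: [0.0] * 4 for rel in "dpst"}
neighbors = {"d": {"u1": ["u2"], "u2": ["u1"]},
             "t": {"u1": ["b1"], "b1": ["u1"]}}
H = multi_edge_gat(nodes, neighbors, W, a)
```

The separate `tc` and `tb` matrices are what let a 3-dimensional user vector and a 2-dimensional merchant vector meet in the same f-dimensional space before attention is applied.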

Fifth, the fourth step only completes the information aggregation of the graph network at the spatial level for a single time point. Because the graph network is dynamic in the time dimension, network information must also be passed along the time dimension. The invention therefore proposes a mechanism that passes the attention parameters along the time dimension using multiple GRUs (called the multi-path GRU unit, MultiGRU) and combines it with the spatial-level graph attention aggregation mechanism (the multi-edge attention unit, MultiEdgeGAT) to form a new processing unit (called the dynamic attention unit). The operation is as follows:

1) Because the graph attention mechanism has 9 parameters to be learned, namely 5 shared linear mapping matrices and 4 shared attention vectors, 9 independent GRU units are responsible for passing parameter information along the time dimension: five pass the linear mapping matrices and four pass the attention vectors. The parameters of these GRU units are initialized;

2) The GRU propagation formula is as follows:

Because all GRU units share the same internal structure, the GRU unit that passes W_d is taken as an example to illustrate the formula:

The above formula passes the attention parameter of time t-1 through the GRU unit to generate the attention parameter of time t.

Similarly, the other 8 attention parameters of time t (W_pt, W_st, …) are generated from time t-1 through the other GRUs of the same structure.

In the above formula, the product operator denotes the Hadamard (element-wise) product, and W_~d, U_~d and B_~d are the parameters to be learned by this GRU unit; the other GRUs are analogous. The 9 GRU units together are referred to as the multi-path GRU unit (MultiGRU);
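Since the GRU formula itself is not reproduced in this text, the following sketch shows the standard GRU update applied to a matrix-valued state, which is one common way to evolve a weight matrix over time. It assumes the input `X` has already been brought to the same shape as the parameter being evolved; how that input is derived from the node features is not specified here. The key names `Wz`, `Uz`, `Bz` and so on are hypothetical stand-ins for the learned GRU parameters.

```python
import math


def matmul(A, B):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*B)] for row in A]


def add(*matrices):
    """Element-wise sum of any number of same-shaped matrices."""
    return [[sum(vals) for vals in zip(*rows)] for rows in zip(*matrices)]


def hadamard(A, B):
    return [[x * y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]


def apply(fn, A):
    return [[fn(x) for x in row] for row in A]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def gru_step(X, H_prev, p):
    """Standard GRU update; H_prev is the parameter at time t-1 (e.g. W_d)."""
    Z = apply(sigmoid, add(matmul(p["Wz"], X), matmul(p["Uz"], H_prev), p["Bz"]))
    R = apply(sigmoid, add(matmul(p["Wr"], X), matmul(p["Ur"], H_prev), p["Br"]))
    H_cand = apply(math.tanh, add(matmul(p["Wh"], X),
                                  matmul(p["Uh"], hadamard(R, H_prev)),
                                  p["Bh"]))
    keep = apply(lambda z: 1.0 - z, Z)
    return add(hadamard(keep, H_prev), hadamard(Z, H_cand))


def multi_gru(inputs, prev_params, gru_params):
    """Run one independent GRU per attention parameter (9 in the patent)."""
    return {name: gru_step(inputs[name], prev_params[name], gru_params[name])
            for name in prev_params}


# Toy check with all-zero GRU weights: both gates sit at 0.5 and the
# candidate state is 0, so the parameter is simply halved.
square = [[0.0]]
bias = [[0.0, 0.0]]
p = {k: square for k in ("Wz", "Uz", "Wr", "Ur", "Wh", "Uh")}
p.update({k: bias for k in ("Bz", "Br", "Bh")})
W_prev = [[2.0, -4.0]]
X = [[1.0, 1.0]]
W_next = gru_step(X, W_prev, p)
evolved = multi_gru({"W_d": X}, {"W_d": W_prev}, {"W_d": p})
```

An attention vector can be passed through the same cell by treating it as a single-column matrix.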

3) The multi-edge attention unit and the multi-path GRU unit are combined into a new processing unit called the dynamic attention unit (DynamicGAT). The combination formula is as follows:

To simplify the notation, the 5 parameters W_d, W_p, W_s, W_tc and W_tb are combined into W, and the 4 parameters a_d, a_p, a_s and a_t are combined into a;

W_t, a_t = MultiGRU(H_t, W_{t-1}, a_{t-1})

H′_t = MultiEdgeGAT(H_t, W_t, a_t);

Sixth, step five above describes the information passing and information aggregation of one dynamic attention unit along the time axis. It is natural to also perform multi-layer spatial aggregation at each time point, in which case the dynamic attention unit can be expressed as:

In the above formula, the superscript l denotes the level of spatial aggregation. Each additional aggregation layer brings the information of neighbors one hop further away into the node, so the information contained in multi-layer adjacency is put to use, and a dynamically learnable graph network is built that contains both information passing in the time dimension and multi-layer aggregation in the spatial dimension;
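The layered formula is not reproduced in this text. The sketch below therefore assumes the natural extension of the single-layer pair: each layer l keeps its own attention parameters, evolves them with MultiGRU at every time point, and feeds its output to layer l+1. The callables `multi_gru` and `multi_edge_gat` are hypothetical stand-ins with simplified signatures, passed in so the control flow can be shown on its own.

```python
def run_dynamic_gat(snapshots, init_params, multi_gru, multi_edge_gat, num_layers):
    """Unroll the dynamic attention unit over time points and spatial layers.

    snapshots: list of (features, graph) pairs, one per time point.
    init_params: per-layer attention parameters before the first time point.
    Returns the last-layer output at the last time point and the final parameters.
    """
    params = list(init_params)
    final = None
    for features, graph in snapshots:            # time axis
        h = features
        for layer in range(num_layers):          # spatial aggregation layers
            # Evolve this layer's (W, a) from its value at the previous time point.
            params[layer] = multi_gru(h, params[layer])
            # Aggregate one more hop of neighborhood information.
            h = multi_edge_gat(h, graph, params[layer])
        final = h
    return final, params


# Toy stand-ins that make the data flow visible with plain numbers:
# "evolving" a parameter adds 1, "aggregating" adds the parameter to h.
toy_gru = lambda h, p: p + 1
toy_gat = lambda h, graph, p: h + p
final, params = run_dynamic_gat(
    snapshots=[(0, None), (10, None)],
    init_params=[0, 0],
    multi_gru=toy_gru,
    multi_edge_gat=toy_gat,
    num_layers=2,
)
```

With the toy stand-ins, each layer's parameter advances once per time point, and the output of the second time point passes through both layers in order.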

Seventh, the representation output by the last layer of the dynamic graph attention network at the last time point is activated by Softmax and mapped to the predicted probability P of node arbitrage risk;

Eighth, all of the above steps together constitute one complete learning iteration. For learning, the required information is synchronized from the distributed database module into server memory, the parameters to be learned are initialized, and multiple rounds of complete learning iterations are then run. After each complete learning iteration, the loss is computed with a cross-entropy loss function over the labeled nodes V_labeled only, and gradients are applied with the Adam optimization algorithm. After N complete learning iterations, the parameters required by the dynamic graph attention network have been learned, which completes the learning of the network;
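The loss side of this step can be sketched as follows: Softmax over two output scores per node, then cross-entropy averaged over labeled nodes only. The two-score output layout (index 0 for non-arbitrage, index 1 for arbitrage) and the dictionary format are assumptions for illustration; the Adam update itself is omitted because it would normally come from a deep learning framework.

```python
import math


def softmax(logits):
    top = max(logits)
    exps = [math.exp(x - top) for x in logits]  # subtract max for stability
    total = sum(exps)
    return [e / total for e in exps]


def masked_cross_entropy(logits_by_node, labels):
    """Mean cross-entropy over labeled nodes only.

    logits_by_node: {node: [score_non_arbitrage, score_arbitrage]}
    labels: {node: 0 or 1}; nodes missing from `labels` are unlabeled
    and contribute nothing to the loss.
    """
    losses = []
    for node, y in labels.items():
        probs = softmax(logits_by_node[node])
        losses.append(-math.log(probs[y]))
    return sum(losses) / len(losses)


logits = {"u1": [0.0, 0.0], "u2": [2.0, -1.0], "u3": [5.0, 5.0]}
labels = {"u1": 1, "u3": 0}  # u2 is unlabeled and is skipped
loss = masked_cross_entropy(logits, labels)
```

Unlabeled users and all merchant nodes still take part in message passing; they are only excluded when the loss is computed.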

The learned dynamic graph attention network is then deployed into the risk-control decision module to predict marketing-arbitrage risk probabilities online. A probability decision threshold σ is set; when the predicted risk probability P is greater than σ, the node is judged to belong to a marketing-arbitrage risk group, and follow-up handling such as interception is carried out.
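A minimal sketch of the threshold decision follows, assuming the predicted probabilities arrive as a dictionary keyed by node id. The function name is hypothetical, and the default σ = 0.5 is the example value given in the embodiment.

```python
def flag_arbitrage(risk_probs, sigma=0.5):
    """Return the ids whose predicted risk probability is strictly greater than sigma."""
    return sorted(node for node, p in risk_probs.items() if p > sigma)


flagged = flag_arbitrage({"u1": 0.91, "u2": 0.50, "u3": 0.12})
```

A probability exactly equal to σ is not flagged, which matches the "greater than σ" wording of the decision rule.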

Compared with the prior art, the invention has the following beneficial effects:

The invention proposes a method for identifying black-market arbitrage based on a dynamic graph attention network, with the following advantages: 1) it defines a novel dynamic graph network structure consisting of user nodes, merchant nodes, multidimensional relations and a plurality of time points; 2) it designs a novel multi-edge attention unit that unifies user nodes, merchant nodes and multidimensional relations more effectively and so improves information aggregation at the spatial level of the nodes; 3) for the temporal dynamics of the graph network, it designs a mechanism that passes attention information through a plurality of GRU units, so that the evolution of the graph network along the time axis is carried forward; 4) it combines the multi-edge attention unit with the multi-path GRU unit into a novel processing unit, the dynamic attention unit, and a dynamic graph attention network built from this unit both passes information along the time dimension and aggregates multi-layer neighborhood information in the spatial dimension.

The method can improve the coverage and accuracy with which a risk control system identifies black-market arbitrage, so that the problem in this scenario is better solved.

Drawings

The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain the principles of the invention and not to limit the invention. In the drawings:

FIG. 1 is an example diagram of the graph network composition of the present invention;

FIG. 2 is a structural diagram of the multi-path GRU unit;

FIG. 3 is a structural diagram of the dynamic attention unit;

FIG. 4 is a diagram of the overall structure of the dynamic graph attention network;

FIG. 5 is a flow chart of an implementation of the present invention.

Detailed Description

The preferred embodiments of the present invention will be described in conjunction with the accompanying drawings, and it will be understood that they are described herein for the purpose of illustration and explanation and not limitation.

Example 1

The invention provides a method for identifying marketing-arbitrage black-market activity based on a dynamic graph attention network, comprising the following steps:

2) The four types of relationship edges are:

A) User-user device relationship edge E_d:

B) User-user payment relationship edge E_p:

C) User-user social relationship edge E_s:

D) User-merchant transaction relationship edge E_t:

3) The T time points are defined as follows:

Starting from an initial time T0 and using ω as the interval, T consecutive time points are selected on the time axis. At each time point, a relationship graph network consisting of the two types of nodes and the four types of relationship edges is generated, so that the sequence reflects the evolution of the graph network over the T time points; see FIG. 1 for an example;

The user set V_c at each time point is imported from the database module into the distributed database module, and an m-dimensional feature vector representing each user's attributes is generated for each time point. The feature fields can be built from common natural-person attribute features, and statistical features derived from expert experience (for example, the number of times a user accessed a given activity within a given past period) can be added to obtain a better recognition effect. Thus, for each time point, an attribute feature matrix is generated for the user set V_c

the specific operation is as follows:

and a shared attention vector a_d ∈ R^{2f};

and a shared attention vector a_p ∈ R^{2f};

and a shared attention vector a_s ∈ R^{2f};

and

which are used to map the user feature vector space and the merchant feature vector space from m dimensions and l dimensions, respectively, into a unified f-dimensional vector space. A shared attention vector a_t ∈ R^{2f} is defined;

a) Attention coefficient on E_d relationship edges:

b) Attention coefficient on E_p relationship edges:

c) Attention coefficient on E_s relationship edges:

d) Attention coefficient on E_t relationship edges:

In the above expressions, the symbol || denotes vector concatenation, and the subscripts i and j denote two nodes that are adjacent via a given relationship edge in the graph. h_i and h_j denote the feature vectors of the corresponding nodes, taken from the previously generated user feature matrix; when calculating the E_t attention coefficient, one of h_i and h_j is taken from the user feature matrix and the other from the merchant feature matrix;

Five of the GRU units pass the linear mapping matrices and four pass the attention vectors. The parameters of these GRU units are initialized;

2) The GRU propagation formula is as follows:

One GRU unit is taken as an example to illustrate the formula:

The above formula passes the attention parameter of time t-1 through the GRU unit to generate the attention parameter of time t.

In the above formula, the product operator denotes the Hadamard (element-wise) product, and W_~d, U_~d and B_~d are the parameters to be learned by this GRU unit; the other GRUs are analogous. The 9 GRU units together are referred to as the multi-path GRU unit (MultiGRU); see FIG. 2 for its structure;

W_t, a_t = MultiGRU(H_t, W_{t-1}, a_{t-1})

H′_t = MultiEdgeGAT(H_t, W_t, a_t)

See FIG. 3 for the structure of the dynamic attention unit;

In the above formula, the superscript l denotes the level of spatial aggregation. Each additional aggregation layer brings the information of neighbors one hop further away into the node, so the information contained in multi-layer adjacency is put to use, and a dynamically learnable graph network is built that contains both information passing in the time dimension and multi-layer aggregation in the spatial dimension; see FIG. 4 for the overall structure of the dynamic graph attention network;

The representation is activated by Softmax and mapped to the predicted probability P of node arbitrage risk;

Eighth, all of the above steps together constitute one complete learning iteration. For learning, the required information is synchronized from the distributed database module into server memory, the parameters to be learned are initialized, and multiple rounds of complete learning iterations are then run. After each complete learning iteration, the loss is computed with a cross-entropy loss function over the labeled nodes V_labeled only, and gradients are applied with the Adam optimization algorithm. After N complete learning iterations, the parameters required by the dynamic graph attention network have been learned, which completes the learning of the network;

Specifically, example values are as follows:

1. User node attribute feature vector dimension, empirical value: m = 256

2. Merchant node attribute feature vector dimension, empirical value: l = 128

3. Number of time points T: T = 10

4. Ratio of labeled (V_labeled) to unlabeled (V_unlabeled) samples:

5. Number of spatial aggregation layers L, value range: { L | 2 ≤ L ≤ 3, L ∈ Z }

6. Decision threshold σ for prediction and identification: σ = 0.5.
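For reference, the example values listed above can be collected into one configuration object. This is only an illustrative sketch: the class and field names are hypothetical, the range check on the number of layers simply mirrors the example range, and the sample ratio is left out because its value is not given in this text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DynamicGATConfig:
    user_feature_dim: int = 256       # m, user attribute vector dimension
    merchant_feature_dim: int = 128   # l, merchant attribute vector dimension
    num_time_points: int = 10         # T
    num_layers: int = 2               # L, example range 2..3
    decision_threshold: float = 0.5   # sigma

    def __post_init__(self):
        # Mirror the example range for L; this check is an assumption.
        if self.num_layers not in (2, 3):
            raise ValueError("num_layers must be 2 or 3 in this example range")
        if not 0.0 < self.decision_threshold < 1.0:
            raise ValueError("decision_threshold must lie strictly between 0 and 1")


cfg = DynamicGATConfig()
deeper = DynamicGATConfig(num_layers=3)
```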

The invention has the following technical points:

1. A new dynamic graph network structure is defined, consisting of two types of nodes (user nodes and merchant nodes), four types of relationship edges (the user-user device relationship edge E_d, the user-user payment relationship edge E_p, the user-user social relationship edge E_s and the user-merchant transaction relationship edge E_t) and a plurality of time points;

2. A novel multi-edge attention unit is designed, which unifies user nodes, merchant nodes and multidimensional relations more effectively and so improves information aggregation at the spatial level of the nodes;

3. A mechanism that passes attention information through multiple GRU units is designed, so that the evolution of the graph network along the time axis is carried forward;

4. The multi-edge attention unit and the multi-path GRU unit are combined into a novel processing unit, the dynamic attention unit, and a dynamic graph attention network built from this unit both passes information along the time dimension and aggregates multi-layer neighborhood information in the spatial dimension.

Finally, it should be noted that: although the present invention has been described in detail with reference to the foregoing embodiments, it will be apparent to those skilled in the art that changes may be made in the embodiments and/or equivalents thereof without departing from the spirit and scope of the invention. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims

1. A method for identifying marketing-arbitrage black-market activity based on a dynamic graph attention network, characterized by comprising the following steps:

1) The two types of nodes: the graph network contains user nodes V_c and merchant nodes V_b.

2) The four types of relationship edges are:

A) User-user device relationship edge E_d:

B) User-user payment relationship edge E_p:

C) User-user social relationship edge E_s:

D) User-merchant transaction relationship edge E_t:

3) The T time points are defined as follows:

and stored in the distributed database module as a table; the subscripts l₀, l₁, ... denote the number of merchants in the set at each time point;

Third, based on historical positive and negative samples of arbitrage users, the user set V_c at each time point is labeled, and the labels are stored in the distributed database module as a table. After labeling, the user set V_c consists of two kinds of samples, labeled (V_labeled) and unlabeled (V_unlabeled); the labeled samples are further divided into known arbitrage user samples (V_labeled = 1) and known non-arbitrage user samples (V_labeled = 0). All merchant nodes on the graph network are set as unlabeled samples;

the specific operation is as follows:

and a shared attention vector a_d ∈ R^{2f};

and a shared attention vector a_p ∈ R^{2f};

and a shared attention vector a_s ∈ R^{2f};

and

which are used to map the user feature vector space and the merchant feature vector space from m dimensions and l dimensions, respectively, into a unified f-dimensional vector space; a shared attention vector a_t ∈ R^{2f} is defined;

a) Attention coefficient on E_d relationship edges:

b) Attention coefficient on E_p relationship edges:

c) Attention coefficient on E_s relationship edges:

d) Attention coefficient on E_t relationship edges:

Five of the GRU units pass the linear mapping matrices and four pass the attention vectors; the parameters of these GRU units are initialized;

2) The GRU propagation formula is as follows:

One GRU unit is taken as an example to illustrate the formula:

The above formula passes the attention parameter of time t-1 through the GRU unit to generate the attention parameter of time t.

In the above formula, the product operator denotes the Hadamard (element-wise) product, and W_~d, U_~d and B_~d are the parameters to be learned by this GRU unit; the other GRUs are analogous. The 9 GRU units together are referred to as the multi-path GRU unit (MultiGRU);

W_t, a_t = MultiGRU(H_t, W_{t-1}, a_{t-1})

H′_t = MultiEdgeGAT(H_t, W_t, a_t);