CN113780318A - Method, apparatus, server and medium for generating prompt information - Google Patents

Method, apparatus, server and medium for generating prompt information

Info

Publication number
CN113780318A
Authority
CN
China
Prior art keywords
user
behavior feature
behavior
feature representation
target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010893051.7A
Other languages
Chinese (zh)
Other versions
CN113780318B (en)
Inventor
郭思文
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jingdong Technology Holding Co Ltd
Original Assignee
Jingdong Technology Holding Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jingdong Technology Holding Co Ltd
Priority to CN202010893051.7A
Publication of CN113780318A
Application granted
Publication of CN113780318B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/34 Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment
    • G06F11/3438 Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment; monitoring of user actions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques

Abstract

Embodiments of the present disclosure disclose a method, an apparatus, a server, and a medium for generating prompt information. One embodiment of the method comprises: acquiring a behavior feature representation sequence of a target user, wherein the behavior feature representations in the sequence are generated based on operation data of the target user in a preset time period, and the operation data comprise data representing operation types and operation attributes; generating, according to the behavior feature representation sequence of the target user, a metric value representing that the target user belongs to a preset category of users; and generating, in response to a comparison of the metric value with a preset threshold, prompt information representing that the target user belongs to the preset category of users. This implementation makes effective use of the operation data and more fully reflects the detailed information it contains, such as temporal order and the relevance between operation behaviors, thereby providing a technical basis for improving the recognition of abnormal users.

Description

Method, apparatus, server and medium for generating prompt information
Technical Field
Embodiments of the present disclosure relate to the field of computer technologies, and in particular, to a method, an apparatus, a server, and a medium for generating a prompt message.
Background
With the development of internet technology, the existing internet anti-fraud means mainly comprises data index monitoring, abnormal behavior detection based on numerical statistical characteristics and user click behavior sequences, and the like.
However, the prior art uses only the statistical characteristics of user operation data and the click behavior of users, so the recognition of abnormal users still needs improvement in scenarios with highly correlated operations, high concentration in time, and strong group behavior (such as anti-fraud in marketing).
Disclosure of Invention
Embodiments of the present disclosure propose a method, an apparatus, a server, and a medium for generating prompt information.
In a first aspect, an embodiment of the present disclosure provides a method for generating a prompt message, where the method includes: acquiring a behavior feature representation sequence of a target user, wherein behavior feature representations in the behavior feature representation sequence are generated based on operation data of the target user in a preset time period, and the operation data comprise data representing operation types and operation attributes; generating a metric value representing that the target user belongs to a preset category user according to the behavior characteristic representation sequence of the target user; and generating prompt information for representing that the target user belongs to a preset category user in response to the comparison between the determined metric value and the preset threshold value.
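The last step of the first aspect is a threshold comparison. A minimal sketch of that step follows, assuming that a metric value above the threshold triggers the prompt (the text only says the prompt is generated in response to the comparison). The function name and the message wording are hypothetical.

```python
from typing import Optional


def generate_prompt(user_id: str, metric_value: float, threshold: float) -> Optional[str]:
    """Return prompt text when the metric exceeds the threshold, otherwise None."""
    # Assumption: "greater than the threshold" is the triggering comparison.
    if metric_value > threshold:
        return (
            f"user {user_id} may belong to the preset category "
            f"(metric={metric_value:.2f}, threshold={threshold:.2f})"
        )
    return None
```

Returning `None` rather than an empty string keeps the "no prompt" case explicit for callers.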
In some embodiments, the obtaining the behavior feature representation sequence of the target user includes: acquiring operation data of a target user in a preset time period; extracting operation data belonging to a preset category from the operation data to generate a target operation data set; generating behavior characteristic representations corresponding to the operation behaviors based on the operation data corresponding to the same operation behavior in the target operation data set; and generating a behavior feature representation sequence of the target user according to the time sequence of the corresponding operation represented by the generated behavior feature.
In some embodiments, the generating, based on operation data corresponding to the same operation behavior in the target operation data set, a behavior feature representation corresponding to each operation behavior includes: generating a one-hot code corresponding to each operation behavior according to the operation behaviors corresponding to the target operation data in the target operation data set; and generating, from the generated one-hot codes, a behavior feature representation corresponding to each one-hot code by using a pre-trained word vector model.
In some embodiments, the word vector model is trained by: acquiring a sample user behavior feature representation sequence set; obtaining an initial word vector model; and taking the behavior feature representation in the same sample user behavior feature sequence in the sample user behavior feature representation sequence set as a positive sample group, and training an initial word vector model by using a machine learning mode to obtain a word vector model.
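One way to read "behavior feature representations in the same sample sequence form a positive sample group" is as the pair construction used by skip-gram style word vector training. The sketch below only builds such pairs and does not train a model; treating every ordered pair within a sequence as positive (with no context window) is an assumption, and `positive_pairs` is a hypothetical helper name.

```python
from itertools import permutations


def positive_pairs(sequences):
    """Collect (center, context) pairs for behaviors that share a sequence."""
    pairs = []
    for seq in sequences:
        # Every ordered pair of distinct positions within one sequence
        # is treated as a positive example (assumption: no window limit).
        for i, j in permutations(range(len(seq)), 2):
            pairs.append((seq[i], seq[j]))
    return pairs


samples = [
    ["login", "transfer"],
    ["login", "coupon", "transfer"],
]
pairs = positive_pairs(samples)
```

Behaviors that never occur in the same sequence produce no pair, which is what lets a model trained on these pairs place co-occurring behaviors close together.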
In some embodiments, the generating a metric value representing that the target user belongs to a preset category user according to the behavior feature representation sequence of the target user includes: and inputting the behavior characteristic representation sequence of the target user into a pre-trained classification model to obtain a probability used for representing that the target user belongs to a preset class user as a metric value.
In some embodiments, the generating a metric value representing that the target user belongs to a preset category user according to the behavior feature representation sequence of the target user includes: acquiring behavior characteristic information corresponding to users belonging to a preset category as target behavior characteristic information; and determining similarity measurement between the behavior characteristic representation sequence of the target user and the target behavior characteristic information as a measurement value for representing that the target user belongs to a preset category of users.
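The text does not name a specific similarity metric. As one possible instance, the sketch below mean-pools the sequence of behavior feature vectors and compares the result with a target vector by cosine similarity; both the pooling step and the choice of cosine similarity are assumptions, and the function names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def sequence_similarity(sequence, target):
    """Mean-pool a non-empty sequence of vectors, then compare with the target."""
    dim = len(sequence[0])
    pooled = [sum(vec[d] for vec in sequence) / len(sequence) for d in range(dim)]
    return cosine_similarity(pooled, target)
```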
In some embodiments, the behavior feature information includes a behavior feature representation; and the determining of the similarity metric between the behavior feature representation sequence of the target user and the target behavior feature information as a metric value representing that the target user belongs to a preset category of users comprises: acquiring the number of historical behavior characteristic representation sequences included in a preset historical behavior characteristic representation sequence set as a reference number; for behavior feature representations in the behavior feature representation sequence of the target user, determining a first significance metric corresponding to the behavior feature representation, wherein the first significance metric is positively correlated with a reference number and negatively correlated with the number of historical behavior feature representation sequences including historical behavior feature representations matched with the behavior feature representation in the historical behavior feature representation sequence set; generating a user tag feature representation corresponding to the behavior feature representation sequence of the target user based on the determined first significance measure; and determining similarity measurement between the user label characteristic representation and the target behavior characteristic information as a measurement value for representing that the target user belongs to a preset category user.
In some embodiments, the generating a user tag feature representation corresponding to the behavior feature representation sequence of the target user based on the determined first significance metric includes: determining frequency measurement of each behavior characteristic representation in the behavior characteristic representation sequence of the target user appearing in the behavior characteristic representation sequence of the target user; determining a second significance measure corresponding to the behavior feature representation in the behavior feature representation sequence of the target user, wherein the second significance measure is positively correlated with the first significance measure and positively correlated with the frequency measure; and selecting the corresponding behavior feature representation as the user label feature representation according to the determined second significance measure.
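The two significance metrics described above have the same monotonic behavior as inverse document frequency and TF-IDF, so a TF-IDF style computation is one way to realize them. The sketch below is such an instance under that assumption; the smoothed logarithm and the names `first_significance` and `user_tag` are illustrative choices, not taken from the text.

```python
import math
from collections import Counter


def first_significance(behavior, history):
    """IDF-style score: grows with len(history), shrinks with matching sequences."""
    containing = sum(1 for seq in history if behavior in seq)
    # Smoothed so the score is never negative (containing <= len(history)).
    return math.log((1 + len(history)) / (1 + containing))


def user_tag(sequence, history, top_k=1):
    """Pick the behaviors with the highest frequency * first-significance score."""
    counts = Counter(sequence)
    scores = {
        behavior: (n / len(sequence)) * first_significance(behavior, history)
        for behavior, n in counts.items()
    }
    return sorted(scores, key=scores.get, reverse=True)[:top_k]


history = [
    ["login", "browse"],
    ["login", "transfer"],
    ["login", "coupon"],
    ["login", "coupon", "coupon"],
]
tags = user_tag(["login", "coupon", "coupon", "coupon"], history)
```

Here "login" appears in every historical sequence and scores zero, so the frequent and less common "coupon" behavior is selected as the tag.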
In some embodiments, the method further comprises: and generating a user behavior label corresponding to the behavior characteristic representation sequence of the target user based on the user label characteristic representation.
In some embodiments, the method further comprises: acquiring an associated user behavior feature representation sequence set associated with the behavior feature representation sequence of the user, wherein the associated user behavior feature representation sequence set comprises at least one behavior feature representation sequence corresponding to a user belonging to the preset category; clustering the associated user behavior feature representation sequences in the acquired set to generate a target number of clustering results; determining, as target associated user behavior feature representation sequences, the associated user behavior feature representation sequences that belong to the same clustering result as a behavior feature representation sequence corresponding to a user belonging to the preset category; and generating prompt information representing whether the users corresponding to the target associated user behavior feature representation sequences belong to the preset category of users.
In some embodiments, the method further comprises: in response to receiving information confirming that the target user indicated by the prompt information belongs to the preset category of users, adding identification information of the target user to a preset category user information set.
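The clustering step can be illustrated with a small k-means sketch. It assumes that each associated user's behavior feature representation sequence has already been summarized as a fixed-length vector, and it uses k-means with a deterministic initialization purely as an example; the text does not name a clustering algorithm. `kmeans` and `flag_associated` are hypothetical helpers.

```python
def kmeans(points, k, iters=10):
    """Tiny k-means; the first k points serve as initial centroids."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest centroid by squared Euclidean distance.
        for i, p in enumerate(points):
            labels[i] = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return labels


def flag_associated(vectors, known_ids, k):
    """Return users sharing a cluster with any known preset-category user."""
    ids = list(vectors)
    labels = kmeans([vectors[u] for u in ids], k)
    risky = {label for u, label in zip(ids, labels) if u in known_ids}
    return [u for u, label in zip(ids, labels) if label in risky and u not in known_ids]


vectors = {"a": (0.0, 0.0), "b": (5.0, 5.0), "c": (0.1, 0.2), "d": (5.1, 4.9)}
flagged = flag_associated(vectors, known_ids={"a"}, k=2)
```

In this toy data, user "c" lands in the same cluster as the known user "a" and would be the subject of a prompt.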
In some embodiments, the method further comprises: acquiring a behavior characteristic representation sequence set corresponding to a preset category user information set; generating user label characteristic representation corresponding to the behavior characteristic representation sequence in the behavior characteristic representation sequence set; and generating key behavior characteristic information corresponding to preset category users.
In a second aspect, an embodiment of the present disclosure provides an apparatus for generating a prompt message, the apparatus including: a first acquisition unit configured to acquire a behavior feature representation sequence of a target user, wherein behavior feature representations in the behavior feature representation sequence are generated based on operation data of the target user in a preset time period, and the operation data comprises data representing an operation type and an operation attribute; the first generation unit is configured to generate a metric value representing that the target user belongs to a preset category user according to the behavior characteristic representation sequence of the target user; and the prompting unit is configured to generate prompting information for representing that the target user belongs to a preset category user in response to the comparison between the determined metric value and the preset threshold value.
In some embodiments, the first obtaining unit includes: the first acquisition subunit is configured to acquire operation data of a target user in a preset time period; the extraction subunit is configured to extract operation data belonging to a preset category from the operation data and generate a target operation data set; the first generation subunit is configured to generate behavior feature representations corresponding to the operation behaviors based on the operation data corresponding to the same operation behavior in the target operation data set; and the second generation subunit is configured to generate a behavior feature representation sequence of the target user according to the time sequence of the corresponding operation represented by the generated behavior feature representation.
In some embodiments, the first generating subunit includes: a first generation module configured to generate a one-hot code corresponding to each operation behavior according to the operation behaviors corresponding to the target operation data in the target operation data set; and a second generation module configured to generate, from the generated one-hot codes, a behavior feature representation corresponding to each one-hot code by using a pre-trained word vector model.
In some embodiments, the word vector model is trained by: acquiring a sample user behavior feature representation sequence set; obtaining an initial word vector model; and taking the behavior feature representation in the same sample user behavior feature sequence in the sample user behavior feature representation sequence set as a positive sample group, and training an initial word vector model by using a machine learning mode to obtain a word vector model.
In some embodiments, the first generating unit is further configured to: and inputting the behavior characteristic representation sequence of the target user into a pre-trained classification model to obtain a probability used for representing that the target user belongs to a preset class user as a metric value.
In some embodiments, the first generating unit includes: the second acquiring subunit is configured to acquire behavior feature information corresponding to users belonging to a preset category as target behavior feature information; and the determining subunit is configured to determine a similarity metric between the behavior feature representation sequence of the target user and the target behavior feature information as a metric value representing that the target user belongs to a preset category of users.
In some embodiments, the behavior feature information includes a behavior feature representation; and the determining subunit includes: an acquisition module configured to acquire, as a reference number, the number of historical behavior feature representation sequences included in a preset historical behavior feature representation sequence set; a first determining module configured to determine, for a behavior feature representation in the behavior feature representation sequence of the target user, a first significance metric corresponding to the behavior feature representation, wherein the first significance metric is positively correlated with the reference number and negatively correlated with the number of historical behavior feature representation sequences in the set that include a historical behavior feature representation matching the behavior feature representation; a generating module configured to generate a user tag feature representation corresponding to the behavior feature representation sequence of the target user based on the determined first significance metric; and a second determining module configured to determine a similarity metric between the user tag feature representation and the target behavior feature information as the metric value representing that the target user belongs to the preset category of users.
In some embodiments, the generating module comprises: a first determining submodule configured to determine a measure of frequency with which each behavior feature representation in the sequence of behavior feature representations of the target user appears in the sequence of behavior feature representations of the target user; a second determining submodule configured to determine a second significance measure corresponding to the behavior feature representation in the behavior feature representation sequence of the target user, wherein the second significance measure is positively correlated with the first significance measure and positively correlated with the frequency measure; a selecting submodule configured to select a corresponding behavior feature representation as a user tag feature representation according to the determined second significance measure.
In some embodiments, the apparatus further comprises: and the second generation unit is configured to generate a user behavior tag corresponding to the behavior feature representation sequence of the target user based on the user tag feature representation.
In some embodiments, the apparatus further comprises: a second acquisition unit configured to acquire an associated user behavior feature representation sequence set associated with the behavior feature representation sequence of the user, wherein the associated user behavior feature representation sequence set comprises at least one behavior feature representation sequence corresponding to a user belonging to the preset category; a clustering unit configured to cluster the associated user behavior feature representation sequences in the acquired set to generate a target number of clustering results; a determining unit configured to determine, as target associated user behavior feature representation sequences, the associated user behavior feature representation sequences that belong to the same clustering result as a behavior feature representation sequence corresponding to a user belonging to the preset category; and a third generating unit configured to generate prompt information representing whether the users corresponding to the determined target associated user behavior feature representation sequences belong to the preset category of users.
In some embodiments, the apparatus further comprises: an adding unit configured to add, in response to receiving information confirming that the target user indicated by the prompt information belongs to the preset category of users, identification information of the target user to a preset category user information set.
In some embodiments, the apparatus further comprises: a third obtaining unit configured to obtain a behavior feature representation sequence set corresponding to a preset category user information set; a fourth generating unit configured to generate a user tag feature representation corresponding to the behavior feature representation sequence in the behavior feature representation sequence set; and the fifth generating unit is configured to generate key behavior characteristic information corresponding to the preset category of users.
In a third aspect, an embodiment of the present disclosure provides a server, including: one or more processors; a storage device having one or more programs stored thereon; when the one or more programs are executed by the one or more processors, the one or more processors are caused to implement the method as described in any implementation of the first aspect.
In a fourth aspect, embodiments of the present disclosure provide a computer-readable medium on which a computer program is stored, which when executed by a processor implements the method as described in any of the implementations of the first aspect.
According to the method, apparatus, server, and medium for generating prompt information provided by embodiments of the present disclosure, a behavior feature representation sequence generated from operation data of different operation types and operation attributes within a preset time period is acquired, so that the operation data is used effectively and the detailed information it contains, such as temporal order and the relevance between operation behaviors, is represented more fully. Prompt information indicating that the target user belongs to the preset category of users is then generated according to the metric value obtained from the behavior feature representation sequence, which improves the recognition of abnormal users in scenarios characterized by continuous operation behavior.
Drawings
Other features, objects and advantages of the disclosure will become more apparent upon reading of the following detailed description of non-limiting embodiments thereof, made with reference to the accompanying drawings in which:
FIG. 1 is an exemplary system architecture diagram in which one embodiment of the present disclosure may be applied;
FIG. 2 is a flow diagram of one embodiment of a method for generating prompt information in accordance with the present disclosure;
FIG. 3 is a schematic diagram of one application scenario of a method for generating prompt information in accordance with embodiments of the present disclosure;
FIG. 4 is a flow diagram of yet another embodiment of a method for generating prompt information in accordance with the present disclosure;
FIG. 5 is a schematic structural diagram of one embodiment of an apparatus for generating prompt information according to the present disclosure;
FIG. 6 is a schematic structural diagram of an electronic device suitable for use in implementing embodiments of the present disclosure.
Detailed Description
The present disclosure is described in further detail below with reference to the accompanying drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the relevant invention and not restrictive of the invention. It should be noted that, for convenience of description, only the portions related to the related invention are shown in the drawings.
It should be noted that, in the present disclosure, the embodiments and features of the embodiments may be combined with each other without conflict. The present disclosure will be described in detail below with reference to the accompanying drawings in conjunction with embodiments.
Fig. 1 illustrates an exemplary architecture 100 to which the method for generating prompt information or the apparatus for generating prompt information of the present disclosure can be applied.
As shown in fig. 1, the system architecture 100 may include terminal devices 101, 102, 103, a network 104, and a server 105. The network 104 serves as a medium for providing communication links between the terminal devices 101, 102, 103 and the server 105. Network 104 may include various connection types, such as wired, wireless communication links, or fiber optic cables, to name a few.
The terminal devices 101, 102, 103 interact with a server 105 via a network 104 to receive or send messages or the like. The terminal devices 101, 102, 103 may have installed thereon various communication client applications, such as a web browser application, a shopping-type application, a search-type application, an instant messaging tool, a mailbox client, and the like.
The terminal devices 101, 102, and 103 may be hardware or software. When the terminal devices 101, 102, 103 are hardware, they may be various electronic devices having a display screen and supporting human-computer interaction, including but not limited to smartphones, tablet computers, laptop computers, desktop computers, and the like. When the terminal devices 101, 102, 103 are software, they can be installed in the electronic devices listed above and may be implemented as multiple pieces of software or software modules (e.g., software or software modules used to provide distributed services) or as a single piece of software or software module. This is not specifically limited herein.
The server 105 may be a server providing various services, such as a background server providing support for shopping-like applications on the terminal devices 101, 102, 103. The background server may analyze and process the acquired user operation data to generate a corresponding processing result (e.g., prompt information indicating that the user corresponding to the user operation data belongs to an abnormal user).
The server may be hardware or software. When the server is hardware, it may be implemented as a distributed server cluster formed by multiple servers, or may be implemented as a single server. When the server is software, it may be implemented as multiple pieces of software or software modules (e.g., software or software modules used to provide distributed services), or as a single piece of software or software module. This is not specifically limited herein.
It should be noted that the method for generating the prompt information provided by the embodiment of the present disclosure is generally executed by the server 105, and accordingly, the apparatus for generating the prompt information is generally disposed in the server 105.
It should be understood that the number of terminal devices, networks, and servers in fig. 1 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.
With continued reference to FIG. 2, a flow 200 of one embodiment of a method for generating prompt information in accordance with the present disclosure is shown. The method for generating prompt information comprises the following steps:
Step 201, acquiring a behavior feature representation sequence of a target user.
In this embodiment, the execution subject (such as the server 105 shown in fig. 1) of the method for generating the prompt message may obtain the behavior feature representation sequence of the target user through a wired connection manner or a wireless connection manner. The behavior feature representation in the behavior feature representation sequence may be generated based on operation data of the target user in a preset time period. The operation data may include data characterizing the type of operation and the attributes of the operation. The target user may be any user specified in advance according to actual application requirements, or may be a user specified according to a rule (for example, a user performing a predetermined operation in a preset time period).
As an example, the execution subject may obtain the behavior feature representation sequence of the target user from a local device or from an electronic device (e.g., a server) connected through a wired or wireless connection. The behavior feature representation sequence may include, for example, behavior feature representations representing transfer operations and transfer amounts, coupon picking operations and coupon denominations performed by the user, which are recorded in chronological order within 12 hours. For example, the behavior feature representation sequence of the target user may be written as an ordered sequence of behavior feature representations (formula images BDA0002657468170000091 to BDA0002657468170000093 are not reproduced in this text), whose elements characterize the behavior feature representations corresponding to times $t_1, t_2, t_3, \ldots, t_n$. The behavior feature representations may be used to characterize operation behaviors performed by the user; for example, the element for time $t_1$ may be the behavior feature representation generated from the target user's operation behavior of transferring 200 yuan at time $t_1$.
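To make the example above concrete, the following sketch keeps only the operation records that fall within a 12-hour window and orders them by time. The `Behavior` record, its field names, and the use of integer seconds as the timestamp unit are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Behavior:
    timestamp: int  # seconds since some reference point (assumed unit)
    op_type: str    # e.g. "transfer", "coupon"
    op_attr: str    # e.g. the amount or the coupon denomination


def behavior_sequence(records, window_start, window_seconds=12 * 3600):
    """Return the records inside the window, ordered by timestamp."""
    window_end = window_start + window_seconds
    in_window = [r for r in records if window_start <= r.timestamp < window_end]
    return sorted(in_window, key=lambda r: r.timestamp)


records = [
    Behavior(7200, "coupon", "20"),
    Behavior(3600, "transfer", "200"),
    Behavior(50000, "transfer", "10"),  # outside the 12-hour window
]
sequence = behavior_sequence(records, window_start=0)
```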
In some optional implementation manners of this embodiment, the execution subject may obtain the behavior feature representation sequence of the target user by:
First, acquiring operation data of a target user in a preset time period.
In these implementations, the execution subject may acquire operation data of the target user for a preset time period from a local or communication-connected electronic device (e.g., terminal devices 101, 102, 103 shown in fig. 1). The operation data may include data representing operation types and operation attributes. The types of operations described above may include, but are not limited to, at least one of: transactions (e.g., placing orders, transferring money, recharging), information changes (e.g., modifying user name, region), logging in, registering, participating in e-commerce marketing activities (e.g., clicking on e-commerce marketing page, getting coupons, using coupons, getting points), and browsing e-commerce marketing activities pages. The operation attribute may correspond to the operation type, which may include, but is not limited to, at least one of: transaction amount, type of information changed (e.g., user name, region), login location, registration time, number of times participating in the e-commerce marketing campaign, length of time to browse the e-commerce marketing campaign page, etc.
Second, extracting the operation data belonging to a preset category from the operation data to generate a target operation data set.
In these implementations, the execution subject may extract operation data belonging to a preset category from the operation data acquired in the first step, and generate a target operation data set. The preset category may include a preset category of an operation type, and may also include a preset category of an operation attribute.
Optionally, the execution subject may further preprocess the extracted operation data belonging to the preset category and generate the target operation data set from the preprocessed operation data. The preprocessing may include, but is not limited to, at least one of the following: cleaning dirty data, normalizing data such as missing values and garbled characters in the operation data, and joining the data tables containing the operation data into a single wide table.
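The preprocessing options listed above can be sketched in plain Python. The example drops rows missing key fields, normalizes missing or unparseable amounts to a default, and left-joins a user profile table onto the operation rows to form one wide row per operation. The field names (`user_id`, `op_type`, `amount`, `region`) and the default of `0.0` are assumptions.

```python
def to_amount(value, default=0.0):
    """Parse an amount; missing or garbled values fall back to the default."""
    try:
        return float(value)
    except (TypeError, ValueError):
        return default


def preprocess(op_rows, user_table):
    """Clean operation rows and join them with user profiles into wide rows."""
    wide_rows = []
    for row in op_rows:
        if not row.get("user_id") or not row.get("op_type"):
            continue  # dirty row: a key field is missing
        wide = dict(row)
        wide["amount"] = to_amount(row.get("amount"))
        wide.update(user_table.get(row["user_id"], {}))  # left join on user_id
        wide_rows.append(wide)
    return wide_rows


op_rows = [
    {"user_id": "u1", "op_type": "top_up", "amount": "100"},
    {"user_id": "", "op_type": "login", "amount": None},
    {"user_id": "u2", "op_type": "coupon", "amount": "\ufffd"},
]
user_table = {"u1": {"region": "Beijing"}}
wide_rows = preprocess(op_rows, user_table)
```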
Third step, generating a behavior feature representation corresponding to each operation behavior based on the operation data corresponding to the same operation behavior in the target operation data set.
In these implementations, based on the operation data corresponding to the same operation behavior in the target operation data set generated in the second step, the execution subject may encode the operation data (e.g., the operation type and the operation attribute) corresponding to the same operation behavior, so as to generate the behavior feature representation corresponding to each operation behavior. By way of example, the target operation data set may include "top up", "100 yuan", "draw coupon", "denomination 20 yuan", "browse activity page", "10 seconds". The execution subject may then encode "top up" and "100 yuan" together as the operation data of one operation behavior and generate the behavior feature representation corresponding to the top-up behavior.
Optionally, the execution subject may generate a behavior feature representation corresponding to each operation behavior by:
S1, generating one-hot codes corresponding to the operation behaviors according to the operation behaviors corresponding to the target operation data in the target operation data set.
In these implementations, according to the operation behavior corresponding to the target operation data in the target operation data set, the execution subject may generate the one-hot code corresponding to each operation behavior by using a preset dictionary.
S2, generating the behavior feature representation corresponding to each one-hot code by using a pre-trained word vector model according to the generated one-hot codes.
In these implementations, according to the one-hot codes generated in step S1, the execution subject may convert the one-hot code corresponding to each operation behavior into a behavior feature representation by using the pre-trained word vector model.
Based on the above optional implementation, this manner of generating the behavior feature representation can improve the expressiveness of the features and avoid the curse of dimensionality.
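Steps S1 and S2 can be illustrated with a small sketch. The dictionary entries and the matrix values below are hypothetical; in particular, `EMBEDDING_MATRIX` merely stands in for the weights of a pre-trained word vector model, and the two-dimensional vectors are chosen only to keep the example readable.

```python
# Hypothetical preset dictionary mapping each operation behavior to an index.
BEHAVIOR_DICTIONARY = {"top_up": 0, "draw_coupon": 1, "browse_activity_page": 2}

# Hypothetical stand-in for pre-trained word vector weights:
# row i is the dense vector of the behavior whose dictionary index is i.
EMBEDDING_MATRIX = [
    [0.2, -0.1],
    [0.7, 0.4],
    [-0.3, 0.9],
]


def one_hot(behavior):
    """S1: build the one-hot code of a behavior from the preset dictionary."""
    code = [0] * len(BEHAVIOR_DICTIONARY)
    code[BEHAVIOR_DICTIONARY[behavior]] = 1
    return code


def embed(one_hot_code):
    """S2: multiply the one-hot row vector by the embedding matrix."""
    dim = len(EMBEDDING_MATRIX[0])
    return [
        sum(one_hot_code[i] * EMBEDDING_MATRIX[i][d] for i in range(len(one_hot_code)))
        for d in range(dim)
    ]


coupon_vector = embed(one_hot("draw_coupon"))
```

Because the one-hot code has a single non-zero entry, the multiplication simply selects one row of the matrix, which is why the dense representation stays low-dimensional however large the dictionary grows.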
Optionally, the word vector model may be trained by:
1) Acquiring a sample user behavior feature representation sequence set.
In these implementations, the execution subject used to train the word vector model may obtain the sample user behavior feature representation sequence set in various ways. In practice, the sample user behavior feature representation sequence set may consist of behavior feature representation sequences generated from the historical operation behaviors of a large number of users. As an example, the sample user behavior feature representation sequence set may include:
[Figure BDA0002657468170000111]
where
[Figure BDA0002657468170000112]
respectively represent the sample user behavior feature representation sequences in the sample user behavior feature representation sequence set.
2) Obtaining an initial word vector model.
In these implementations, the execution subject used to train the word vector model may obtain the initial word vector model in various ways. The initial word vector model may include various Deep Neural Networks (DNNs). As an example, the initial word vector model may include a word2vec model.
3) Taking the behavior feature representations belonging to the same sample user behavior feature representation sequence in the sample user behavior feature representation sequence set as positive samples, and training the initial word vector model by machine learning to obtain the word vector model.
In these implementations, the execution subject may determine the behavior feature representations belonging to the same sample user behavior feature representation sequence in the sample user behavior feature representation sequence set as a positive sample group. Then, part of the behavior feature representations in the positive sample group are taken as input, the behavior feature representations belonging to the same positive sample group as the input are taken as the expected output, and the word vector model is obtained by training with a machine learning method.
As an example, the execution subject may determine the aforementioned [Figure BDA0002657468170000113] and [Figure BDA0002657468170000114] as positive sample groups, respectively. Then, the execution subject may take [Figure BDA0002657468170000115] and [Figure BDA0002657468170000116] as input and [Figure BDA0002657468170000117] as the expected output, and adjust the parameters of the initial word vector model using a machine learning method. As yet another example, the execution subject may take [Figure BDA0002657468170000118] as input and [Figure BDA0002657468170000119] and [Figure BDA00026574681700001110] as the expected output, adjust the parameters of the initial word vector model using a machine learning method, and determine the initial word vector model obtained after training as the word vector model.
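The construction of positive samples can be sketched as follows. Since the formula images are not available here, the behavior names are hypothetical tokens, and the sketch only builds (input, expected output) pairs from behaviors that share a sequence; the actual parameter update of a word2vec-style model is deliberately left out.

```python
from itertools import permutations


def positive_pairs(sequences):
    """Return (input, expected_output) pairs taken from the same sequence.

    Every ordered pair of distinct positions inside one sequence is treated
    as a positive sample; behaviors from different sequences are never paired.
    """
    pairs = []
    for seq in sequences:
        for i, j in permutations(range(len(seq)), 2):
            pairs.append((seq[i], seq[j]))
    return pairs


# Two hypothetical sample user behavior sequences.
sample_sequences = [
    ["click_page", "draw_coupon", "withdraw"],
    ["login", "top_up"],
]
pairs = positive_pairs(sample_sequences)
```

Pairing only within a sequence is what lets the trained vectors capture which operation behaviors tend to occur together for the same user. A real training setup would usually also limit pairs to a context window, which is an assumption beyond what the text states.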
Based on the optional implementation, the execution subject can construct a word vector model suited to encoding a user's sequential operation behaviors, which improves the expressiveness of the behavior feature representation sequence and provides a basis for subsequently improving the recognition rate of users of the preset category.
Fourth step, generating the behavior feature representation sequence of the target user according to the time order of the operations corresponding to the generated behavior feature representations.
In these implementations, the execution subject may generate the behavior feature representation sequence of the target user in various ways according to the time order of the operations corresponding to the generated behavior feature representations. As an example, the behavior feature representation may be in the form of a vector. The execution subject may arrange the vector-form behavior feature representations in the time order of the corresponding operations, thereby generating a behavior feature representation sequence in matrix form. Each row or column vector of the matrix-form behavior feature representation sequence is a vector-form behavior feature representation. As another example, the execution subject may concatenate the vector-form behavior feature representations end to end in the time order of the corresponding operations, so as to form a new behavior feature representation sequence in the form of a vector with a larger dimension.
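Both layouts described above can be sketched in a few lines. The `(timestamp, vector)` input format and the function names are assumptions made for this illustration.

```python
def to_matrix(timed_vectors):
    """Order (timestamp, vector) pairs by time; each row is one behavior vector."""
    return [vec for _, vec in sorted(timed_vectors, key=lambda item: item[0])]


def to_flat_vector(timed_vectors):
    """Concatenate the time-ordered behavior vectors end to end."""
    return [value for row in to_matrix(timed_vectors) for value in row]


# Hypothetical behavior vectors, given out of time order.
timed = [(20, [0.3, 0.9]), (10, [0.2, 0.1])]
matrix_form = to_matrix(timed)
vector_form = to_flat_vector(timed)
```

The matrix form keeps one behavior per row, while the concatenated form yields a single longer vector whose length grows with the number of operations.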
Based on the optional implementation, the execution subject can encode different operation behaviors into a behavior feature representation sequence in time order, which provides a technical basis for accurately identifying specific chained operation behaviors (e.g., claiming a coupon, transferring it, and cashing out).
Step 202, generating, according to the behavior feature representation sequence of the target user, a metric value characterizing that the target user belongs to a preset category of users.
In this embodiment, according to the behavior feature representation sequence of the target user obtained in step 201, the execution subject may generate, in various ways, a metric value characterizing that the target user belongs to the preset category of users. The metric value can be used to characterize the likelihood that the target user belongs to the preset category of users.
In some optional implementations of this embodiment, the execution subject may input the behavior feature representation sequence of the target user into a pre-trained classification model and obtain, as the metric value, a probability characterizing that the target user belongs to the preset category of users.
In these implementations, the pre-trained classification model may include, but is not limited to, at least one of: a logistic regression model, an XGBoost model. The classification model is typically a binary classification model, whose two classes indicate whether the user corresponding to the behavior feature representation sequence belongs or does not belong to the preset category of users.
Based on the optional implementation, the execution subject may identify users of the preset category by means of supervised learning.
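As a rough illustration of the logistic regression option, the probability can be computed from a fixed-length, vector-form behavior feature representation sequence. The weights and bias below are hypothetical placeholders rather than trained parameters, and the assumption that all sequences have the same length is made only for this sketch.

```python
import math


def predict_probability(flat_sequence, weights, bias):
    """Logistic regression score: sigmoid of a weighted sum of the features."""
    z = bias + sum(w * x for w, x in zip(weights, flat_sequence))
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical parameters standing in for a trained model.
weights = [1.0, 1.0]
bias = 0.0

neutral = predict_probability([0.0, 0.0], weights, bias)
suspicious = predict_probability([2.0, 2.0], weights, bias)
```

The returned value lies strictly between 0 and 1 and can serve directly as the metric value that is later compared with the preset threshold.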
Step 203, generating, in response to determining a comparison result between the metric value and a preset threshold, prompt information characterizing that the target user belongs to the preset category of users.
In this embodiment, in response to determining the result of comparing the metric value generated in step 202 with the preset threshold, the execution subject may generate, in various ways, prompt information characterizing that the target user belongs to the preset category of users. As an example, in response to determining that the metric value generated in step 202 is greater than the preset threshold, the execution subject may generate prompt information characterizing that the target user belongs to the preset category of users, such as "risk user prompt" or "marketing fraud user prompt".
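The "greater than the preset threshold" example can be written as a small function. The function name, the message format, and the default threshold of 0.75 are illustrative assumptions; the disclosure leaves the exact comparison rule and wording open.

```python
def generate_prompt(user_id, metric_value, threshold=0.75):
    """Return prompt text when the metric exceeds the threshold, else None."""
    if metric_value > threshold:
        return (
            f"risk user prompt: {user_id} "
            f"(metric {metric_value:.2f} > threshold {threshold:.2f})"
        )
    return None


flagged_prompt = generate_prompt("user058", 0.83)
no_prompt = generate_prompt("user058", 0.50)
```

Returning `None` when the condition is not met keeps the caller free to decide whether a "not in the preset category" message should be produced at all.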
In some optional implementations of this embodiment, the execution subject may continue to perform the following steps:
First step, acquiring an associated user behavior feature representation sequence set associated with the behavior feature representation sequence of the user.
In these implementations, the execution subject may acquire, in various ways, the associated user behavior feature representation sequence set associated with the behavior feature representation sequence of the user. The associated user behavior feature representation sequence set may include at least one behavior feature representation sequence corresponding to a user belonging to the preset category of users. The association relationship may be preset; for example, the operation time period matches that of the behavior feature representation sequence corresponding to a user belonging to the preset category of users (e.g., within 8 hours before or after), the login regions indicated by the IP addresses are the same, or the user name similarity exceeds a threshold.
Second step, clustering the associated user behavior feature representation sequences in the acquired associated user behavior feature representation sequence set to generate a target number of clustering results.
In these implementations, the execution subject may use various clustering algorithms to cluster the associated user behavior feature representation sequences in the associated user behavior feature representation sequence set obtained in the first step, so as to generate a target number of clustering results. As an example, the execution subject may cluster directly on the vector-form associated user behavior feature representation sequences to generate the clustering results. As another example, the execution subject may first extract a representative vector from each matrix-form associated user behavior feature representation sequence and then cluster on the extracted vectors to generate the clustering results. Optionally, the representative vector may be extracted in the same manner as the user tag feature representation is generated below, which is not described here again.
Third step, determining the associated user behavior feature representation sequences that belong to the same clustering result as the behavior feature representation sequence corresponding to the user belonging to the preset category of users as target associated user behavior feature representation sequences.
Fourth step, generating prompt information characterizing whether the user corresponding to the target associated user behavior feature representation sequence belongs to the preset category of users.
In these implementations, the execution subject may generate, in various ways, prompt information characterizing whether the user corresponding to the target associated user behavior feature representation sequence determined in the third step belongs to the preset category of users. As an example, the execution subject may directly generate prompt information characterizing that the user corresponding to the target associated user behavior feature representation sequence determined in the third step belongs to the preset category of users. As another example, the execution subject may further determine the similarity between the target associated user behavior feature representation sequence determined in the third step and the behavior feature representation sequence corresponding to the user belonging to the preset category of users. In response to determining that the similarity is greater than a preset similarity threshold, the execution subject may generate prompt information characterizing that the user corresponding to the target associated user behavior feature representation sequence belongs to the preset category of users. In response to determining that the similarity is not greater than the preset similarity threshold, the execution subject may generate prompt information characterizing that the user corresponding to the target associated user behavior feature representation sequence does not belong to the preset category of users.
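The third and fourth steps can be sketched under two assumptions: that a clustering algorithm has already assigned each user a cluster id, and that a target associated user is one whose sequence falls into the same cluster as a known user of the preset category. The user names, vectors, and the 0.8 similarity threshold are all hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (0.0 if either has zero norm)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def flag_associated_users(vectors, cluster_of, known_users, sim_threshold):
    """Map each target associated user to True/False (belongs / does not belong).

    vectors: user -> representative vector; cluster_of: user -> cluster id.
    Users that share no cluster with a known user are not target associated
    users and are left out of the result.
    """
    flagged = {}
    for user, cluster in cluster_of.items():
        if user in known_users:
            continue
        peers = [k for k in known_users if cluster_of.get(k) == cluster]
        if not peers:
            continue
        best = max(cosine_similarity(vectors[user], vectors[k]) for k in peers)
        flagged[user] = best > sim_threshold
    return flagged


vectors = {"known": [1.0, 0.0], "a": [0.9, 0.1], "b": [0.0, 1.0], "c": [0.1, 0.9]}
cluster_of = {"known": 0, "a": 0, "b": 0, "c": 1}
result = flag_associated_users(vectors, cluster_of, {"known"}, sim_threshold=0.8)
```

In this toy data, user "a" is flagged, user "b" shares the cluster but fails the similarity check, and user "c" is ignored because it lies in a different cluster.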
Based on the optional implementation, the execution subject can use a clustering algorithm to cluster the behavior feature representation sequences, so that users who potentially belong to the preset category of users are mined by combining supervised learning and semi-supervised learning. This is particularly suitable for anti-fraud scenarios characterized by strongly group-based behavior and high similarity.
In some optional implementations of this embodiment, in response to receiving information confirming that the target user indicated by the prompt information belongs to the preset category of users, the execution subject may further add the identification information of the target user to a preset category user information set. As an example, the information confirming that the target user indicated by the prompt information belongs to the preset category of users may be confirmation information sent by a terminal used by a technician.
Based on the optional implementation, the execution subject can dynamically supplement the information set of identified users of the preset category, providing a data basis for further improving the recognition rate of users of the preset category.
Optionally, the execution subject may further continue to perform the following steps:
S1, acquiring a behavior feature representation sequence set corresponding to the preset category user information set.
In these implementations, the execution subject may acquire the behavior feature representation sequence set corresponding to the preset category user information set in various ways. The behavior feature representation sequences in the acquired set may be generated based on the operation data of the users indicated by the preset category user information set.
S2, generating user tag feature representations corresponding to the behavior feature representation sequences in the behavior feature representation sequence set.
In these implementations, the execution subject may generate, in various ways, the user tag feature representations corresponding to the behavior feature representation sequences in the behavior feature representation sequence set. The user tag feature representation is used to characterize the representative behavior feature representation in a behavior feature representation sequence. Optionally, the user tag feature representation may be generated in the manner described in the optional implementations of step 403 of the following embodiment, which is not described here again.
S3, generating key behavior feature information corresponding to the preset category of users.
In these implementations, the execution subject may generate the key behavior feature information corresponding to the preset category of users in various ways. As an example, the execution subject may use the user tag feature representations generated in step S2 as the key behavior feature information corresponding to the preset category of users. As another example, the execution subject may further integrate the user tag feature representations generated in step S2 and then generate the key behavior feature information corresponding to the preset category of users from the integrated user tag feature representations. The integration may include, for example, selecting the user tag feature representation with the largest number of occurrences, or selecting the user tag feature representations whose number of occurrences is greater than a preset count threshold.
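The two integration options mentioned in step S3 amount to counting occurrences. In the sketch below, user tag feature representations are simplified to hypothetical string labels so that they can be counted directly; vector-form representations would first need a matching rule.

```python
from collections import Counter


def most_frequent_tag(tags):
    """Select the tag that occurs most often."""
    return Counter(tags).most_common(1)[0][0]


def tags_above_threshold(tags, min_count):
    """Select every tag whose occurrence count is greater than min_count."""
    counts = Counter(tags)
    return sorted(tag for tag, count in counts.items() if count > min_count)


# Hypothetical tags generated in step S2 for users of the preset category.
tags = ["draw_coupon", "withdraw", "draw_coupon", "share_page", "withdraw", "draw_coupon"]
top_tag = most_frequent_tag(tags)
frequent_tags = tags_above_threshold(tags, min_count=1)
```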
Based on the optional implementation, the execution subject may analyze the characteristic behaviors of users of the preset category to generate explicit key behavior feature information. A behavior profile of the preset category of users can thus be generated, which on the one hand improves the interpretability of the basis for identifying such users and on the other hand gives anti-fraud risk-control personnel ideas for improving the design of marketing activities and anti-fraud rules.
With continued reference to fig. 3, fig. 3 is a schematic diagram of an application scenario of the method for generating prompt information according to an embodiment of the present disclosure. In the application scenario of fig. 3, a user 301 (e.g., id: user058) performs a series of operations using a terminal device 302 to generate operation data 303. The operation data 303 may include, for example, "18:05:18 click e-commerce promotion activity page", "18:05:21 get 30 yuan coupon", "18:05:25 share marketing page to social friend", "18:05:32 withdraw 1 yuan". Then, the terminal device 302 sends the operation data 303 to the backend server 304. The backend server 304 generates a behavior feature representation sequence 305 corresponding to the received operation data 303. The behavior feature representation sequence 305 may include [Figure BDA0002657468170000161], where [Figure BDA0002657468170000162] may respectively represent the behaviors of clicking the e-commerce promotion activity page, getting a 30 yuan coupon, sharing the marketing page to a social friend, and withdrawing 1 yuan. The backend server 304 may generate, in various ways, a metric value (shown as 306 in fig. 3), such as 0.83, characterizing that the user whose id is user058 is a risk user. In response to determining that the generated metric value 306 is greater than a preset threshold (e.g., 0.75), the backend server 304 may generate prompt information 307 characterizing that the user whose id is user058 is a risk user. Optionally, the backend server 304 may also send the generated prompt information 307 to the terminal device 308 used by service personnel.
At present, one existing approach generally generates features from a user's click behavior sequence to perform anomaly detection, which leads to insufficient identification of abnormal users in scenarios with highly correlated operations, high time concentration, and strongly group-based behavior (e.g., marketing anti-fraud). In the method provided by the embodiments of the present disclosure, a behavior feature representation sequence generated from operation data of different operation types and operation attributes within a preset time period is acquired, so that the operation data is used effectively and the detail information it contains, such as time order and the correlation between operation behaviors, is represented more comprehensively, thereby providing a technical basis for improving the identification of abnormal users.
With further reference to fig. 4, a flow 400 of yet another embodiment of the method for generating prompt information is shown. The flow 400 of the method for generating prompt information includes the following steps:
Step 401, acquiring a behavior feature representation sequence of a target user.
Step 402, acquiring behavior feature information corresponding to users belonging to a preset category as target behavior feature information.
In this embodiment, an execution subject (e.g., the server 105 shown in fig. 1) of the method for generating the prompt information may acquire behavior feature information corresponding to a user belonging to a preset category as target behavior feature information in various ways. The obtained target behavior feature information may be generated based on operation data of a user belonging to the preset category.
Step 403, determining a similarity metric between the behavior feature representation sequence of the target user and the target behavior feature information as a metric value characterizing that the target user belongs to the preset category of users.
In this embodiment, the execution subject may determine, in various ways, a similarity metric between the behavior feature representation sequence of the target user obtained in step 401 and the target behavior feature information obtained in step 402 as the metric value characterizing that the target user belongs to the preset category of users. The similarity metric may include, but is not limited to, at least one of: cosine similarity, cosine distance, Euclidean distance.
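The three similarity metrics named above can be computed as follows for vector-form representations. The function names are chosen for this sketch, and both inputs are assumed to be vectors of equal length.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between a and b (0.0 if either has zero norm)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def cosine_distance(a, b):
    """One minus the cosine similarity."""
    return 1.0 - cosine_similarity(a, b)


def euclidean_distance(a, b):
    """Straight-line distance between a and b."""
    return math.dist(a, b)
```

Note that the direction of the comparison differs between the metrics: a larger cosine similarity means the target user is closer to the preset category, whereas for the two distances a smaller value means closer, so the threshold comparison in step 404 has to be oriented accordingly.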
In some optional implementations of this embodiment, the behavior feature information may include a behavior feature representation. The execution subject may determine the similarity metric between the behavior feature representation sequence of the target user and the target behavior feature information as the metric value characterizing that the target user belongs to the preset category of users through the following steps:
First step, acquiring the number of historical behavior feature representation sequences included in a preset historical behavior feature representation sequence set as a reference number.
In these implementations, the execution subject may obtain the number of historical behavior feature representation sequences included in a preset historical behavior feature representation sequence set and use the number as a reference number. In practice, the preset historical behavior feature representation sequence set can be generated according to historical operation data of a large number of users.
Second step, determining a first significance measure corresponding to each behavior feature representation in the behavior feature representation sequence of the target user.
In these implementations, for each behavior feature representation in the behavior feature representation sequence of the target user obtained in step 401, the execution subject may determine the corresponding first significance measure in various ways. The first significance measure is generally positively correlated with the reference number and negatively correlated with the number of historical behavior feature representation sequences in the historical behavior feature representation sequence set that include a historical behavior feature representation matching the behavior feature representation. Matching may mean being identical or having a similarity greater than a preset threshold.
As an example, the first significance measure may be a ratio of the reference number to a number of historical behavior feature representation sequences in the historical behavior feature representation sequence set that include a historical behavior feature representation matching the behavior feature representation. As yet another example, the first significance measure may also be a logarithm of the aforementioned ratio.
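The ratio and its logarithm resemble an inverse document frequency. A minimal sketch follows, with behaviors simplified to hypothetical string labels and "matching" taken to mean exact equality; raising an error for a behavior that never occurs in the history is a choice made for this example, since the text does not cover that case.

```python
import math


def first_significance(behavior, historical_sequences, use_log=True):
    """Reference number divided by the number of sequences containing the behavior."""
    reference_number = len(historical_sequences)
    containing = sum(1 for seq in historical_sequences if behavior in seq)
    if containing == 0:
        raise ValueError("behavior does not occur in any historical sequence")
    ratio = reference_number / containing
    return math.log(ratio) if use_log else ratio


# Hypothetical historical behavior sequences of four users.
history = [
    ["login", "browse"],
    ["login", "draw_coupon"],
    ["login", "top_up"],
    ["login", "draw_coupon", "withdraw"],
]
rare = first_significance("withdraw", history, use_log=False)
common = first_significance("login", history, use_log=False)
```

A behavior that appears in every historical sequence gets the lowest possible value, while one that appears in only a few sequences is rated as more significant.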
Third step, generating a user tag feature representation corresponding to the behavior feature representation sequence of the target user based on the determined first significance measure.
In these implementations, based on the first significance measure determined in the second step, the execution subject may generate the user tag feature representation corresponding to the behavior feature representation sequence of the target user in various ways. As an example, the execution subject may select the behavior feature representation corresponding to the maximum value of the first significance measure as the user tag feature representation corresponding to the behavior feature representation sequence of the target user.
Based on the optional implementation, the execution subject may generate a representative user tag feature representation for the behavior feature representation sequence of the target user according to the degree to which that sequence, as an individual, stands out within the group formed by the preset historical behavior feature representation sequence set.
Optionally, the execution subject may further generate the user tag feature representation corresponding to the behavior feature representation sequence of the target user by:
S1, determining a frequency metric with which each behavior feature representation in the behavior feature representation sequence of the target user appears in that sequence.
In these optional implementations, the execution subject may determine a frequency metric with which each behavior feature representation in the behavior feature representation sequence of the target user obtained in step 401 appears in that sequence. As an example, the frequency metric may be the ratio between the number of times the behavior feature representation appears in the behavior feature representation sequence of the target user and the number of behavior feature representations included in that sequence.
S2, determining a second significance measure corresponding to each behavior feature representation in the behavior feature representation sequence of the target user.
In these implementations, the execution subject may determine a second significance measure corresponding to each behavior feature representation in the behavior feature representation sequence of the target user obtained in step 401. The second significance measure may be positively correlated with both the first significance measure and the frequency metric. As an example, the second significance measure may be the product of the first significance measure and the frequency metric.
S3, selecting the corresponding behavior feature representations as the user tag feature representation according to the determined second significance measure.
In these implementations, the execution subject may select, in various ways, the corresponding behavior feature representations as the user tag feature representation according to the second significance measure determined in step S2. As an example, the execution subject may select, as the user tag feature representation, a target number of behavior feature representations with the highest second significance measure.
Based on the optional implementation, the execution subject may further determine how well each behavior feature representation characterizes the behavior feature representation sequence according to the proportion of the different behavior feature representations in the behavior feature representation sequence of the target user, and generate a representative user tag feature representation for that sequence.
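Steps S1 to S3 follow the pattern of TF-IDF weighting and can be sketched as below. Behaviors are again hypothetical string labels matched by equality, and the add-one smoothing inside the logarithm is an assumption introduced here so that behaviors absent from the history do not cause a division by zero.

```python
import math
from collections import Counter


def second_significance(target_sequence, historical_sequences):
    """Frequency metric (S1) times a smoothed log-ratio first significance (S2)."""
    counts = Counter(target_sequence)
    total = len(target_sequence)
    reference_number = len(historical_sequences)
    scores = {}
    for behavior, count in counts.items():
        containing = sum(1 for seq in historical_sequences if behavior in seq)
        frequency = count / total
        # Assumed smoothing: add one to numerator and denominator.
        first = math.log((reference_number + 1) / (containing + 1))
        scores[behavior] = frequency * first
    return scores


def select_tags(scores, target_number):
    """S3: pick the behaviors with the highest scores (ties broken by name)."""
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [behavior for behavior, _ in ranked[:target_number]]


history = [
    ["login", "browse"],
    ["login", "draw_coupon"],
    ["login", "top_up"],
    ["login", "draw_coupon", "withdraw"],
]
target = ["login", "draw_coupon", "withdraw", "withdraw"]
scores = second_significance(target, history)
top_tags = select_tags(scores, target_number=2)
```

In this toy data the repeated and historically rare "withdraw" behavior ranks first, while "login", which every historical user performs, scores zero.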
Fourth step, determining a similarity metric between the user tag feature representation and the target behavior feature information as the metric value characterizing that the target user belongs to the preset category of users.
In these implementations, the execution subject may use the similarity metric between the user tag feature representation determined in the third step and the target behavior feature information obtained in step 402 as the metric value characterizing that the target user belongs to the preset category of users.
In some optional implementations of this embodiment, based on the generated user tag feature representation corresponding to the behavior feature representation sequence of the target user, the execution subject may further generate a user behavior tag (e.g., "receive coupon", "register", "redeem points") corresponding to the behavior feature representation sequence of the target user. As an example, the execution subject may generate the corresponding user behavior tag according to a preset correspondence table between user tag feature representations and user behavior tags. As yet another example, the execution subject may generate the user behavior tag by a decoding manner corresponding to the encoding manner used to generate the user tag feature representation.
Based on the optional implementation, the execution subject may generate an explicit user behavior tag, providing a basis for the visual presentation of the representative behaviors of users of the preset category.
Step 404, generating, in response to determining a comparison result between the metric value and a preset threshold, prompt information characterizing that the target user belongs to the preset category of users.
Step 401 and step 404 are consistent with step 201 and step 203, respectively, and their optional implementations in the foregoing embodiment; the above description of step 201, step 203, and their optional implementations also applies to step 401 and step 404 and is not repeated here.
In some optional implementations of this embodiment, in response to receiving information confirming that the target user indicated by the prompt information belongs to the preset category of users, the execution subject may further add the identification information of the target user to a preset category user information set.
Optionally, the execution subject may further continue to perform the following steps:
S1, acquiring a behavior feature representation sequence set corresponding to the preset category user information set.
S2, generating user tag feature representations corresponding to the behavior feature representation sequences in the behavior feature representation sequence set.
S3, generating key behavior feature information corresponding to the preset category of users.
As can be seen from fig. 4, the flow 400 of the method for generating prompt information in this embodiment highlights the step of determining a similarity metric between the behavior feature representation sequence of the target user and the target behavior feature information as the metric value characterizing that the target user belongs to the preset category of users. The scheme described in this embodiment can therefore compare the behavior feature representation sequence of the target user with existing information about users of the preset category, thereby mining potential users with abnormal behavior and improving the efficiency of identifying such users.
With further reference to fig. 5, as an implementation of the method shown in the above-mentioned figures, the present disclosure provides an embodiment of an apparatus for generating a prompt message, where the embodiment of the apparatus corresponds to the embodiment of the method shown in fig. 2 or fig. 4, and the apparatus may be specifically applied to various electronic devices.
As shown in fig. 5, the apparatus 500 for generating prompt information provided by the present embodiment includes a first obtaining unit 501, a first generating unit 502, and a prompt unit 503. The first obtaining unit 501 is configured to obtain a behavior feature representation sequence of a target user, where behavior feature representations in the behavior feature representation sequence are generated based on operation data of the target user in a preset time period, and the operation data includes data representing an operation type and an operation attribute. The first generating unit 502 is configured to generate, according to the behavior feature representation sequence of the target user, a metric value representing that the target user belongs to a preset category of users. The prompt unit 503 is configured to generate prompt information characterizing that the target user belongs to the preset category of users in response to the comparison between the determined metric value and a preset threshold value.
In the present embodiment, in the apparatus 500 for generating prompt information: the specific processing of the first obtaining unit 501, the first generating unit 502, and the prompting unit 503 and the technical effects thereof can refer to the related descriptions of step 201, step 202, and step 203 in the corresponding embodiment of fig. 2, which are not repeated herein.
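The cooperation of the three units can be sketched as a single function. This is only an illustration under stated assumptions: the source does not specify the direction of the threshold comparison, so the sketch assumes a prompt is generated when the metric exceeds the threshold, and `generate_prompt`, `metric_fn`, and the message format are hypothetical.

```python
from typing import Callable, Optional, Sequence

Vector = Sequence[float]


def generate_prompt(
    user_id: str,
    behavior_sequence: Sequence[Vector],
    metric_fn: Callable[[Sequence[Vector]], float],
    threshold: float,
) -> Optional[str]:
    """Return prompt information if the metric exceeds the threshold, else None.

    behavior_sequence plays the role of the output of the first obtaining unit,
    metric_fn the role of the first generating unit, and the comparison plus
    message the role of the prompt unit.
    """
    metric = metric_fn(behavior_sequence)
    # Assumption: a larger metric means the user more likely belongs to the category.
    if metric > threshold:
        return f"user {user_id} may belong to the preset category (metric={metric:.2f})"
    return None
```

Passing the metric as a callable keeps the sketch independent of whether the metric comes from a classification model or from a similarity comparison, both of which the description allows.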
In some optional implementations of the present embodiment, the first obtaining unit 501 may include a first obtaining subunit (not shown in the figure), an extracting subunit (not shown in the figure), a first generating subunit (not shown in the figure), and a second generating subunit (not shown in the figure). The first acquiring subunit may be configured to acquire operation data of a target user in a preset time period. The extracting subunit may be configured to extract operation data belonging to a preset category from the operation data, and generate a target operation data set. The first generating subunit may be configured to generate a behavior feature representation corresponding to each operation behavior based on operation data corresponding to the same operation behavior in the target operation data set. The second generating subunit may be configured to generate a behavior feature representation sequence of the target user in a time order of the operations corresponding to the generated behavior feature representation.
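The four subunits above can be sketched as one function that filters, groups, and orders operation data. The record layout is an assumption: `behavior_id` is a hypothetical field that ties together the records of one operation behavior, and the string token stands in for whatever behavior feature representation an implementation actually produces.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Set


@dataclass(frozen=True)
class OperationRecord:
    behavior_id: int    # hypothetical: identifies the operation behavior a record belongs to
    timestamp: float    # time of the operation
    op_type: str        # operation type, e.g. "click" or "input"
    op_attribute: str   # operation attribute, e.g. the page or control involved


def build_behavior_sequence(
    records: Iterable[OperationRecord], kept_types: Set[str]
) -> List[str]:
    """Filter records by type, merge records of the same behavior, order by time."""
    # Extract operation data belonging to the preset categories (target operation data set).
    grouped: Dict[int, List[OperationRecord]] = {}
    for record in records:
        if record.op_type in kept_types:
            grouped.setdefault(record.behavior_id, []).append(record)

    # One representation per operation behavior, built from all of its records.
    # Assumes records sharing a behavior_id also share an operation type.
    behaviors = []
    for group in grouped.values():
        start = min(r.timestamp for r in group)
        attributes = sorted({r.op_attribute for r in group})
        behaviors.append((start, f"{group[0].op_type}:{'+'.join(attributes)}"))

    # Arrange the representations in the time order of the corresponding operations.
    behaviors.sort()
    return [token for _, token in behaviors]
```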
In some optional implementations of this embodiment, the first generating subunit may include a first generating module (not shown in the figure) and a second generating module (not shown in the figure). The first generating module may be configured to generate a one-hot code corresponding to each operation behavior according to the operation behavior corresponding to the target operation data in the target operation data set. The second generating module may be configured to generate, from the generated one-hot codes, a behavior feature representation corresponding to each one-hot code by using a pre-trained word vector model.
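As a minimal sketch of the two modules, a one-hot code can be mapped to a dense behavior feature representation by multiplying it with an embedding matrix, which is how word vector models are commonly applied. The vocabulary and matrix values below are invented for illustration; in practice the matrix would come from the pre-trained word vector model.

```python
from typing import List, Sequence


def one_hot(index: int, size: int) -> List[float]:
    """One-hot code of length `size` with a single 1.0 at position `index`."""
    if not 0 <= index < size:
        raise ValueError("index out of range")
    return [1.0 if i == index else 0.0 for i in range(size)]


def embed(code: Sequence[float], embedding_matrix: Sequence[Sequence[float]]) -> List[float]:
    """Multiply a one-hot row vector by a (vocabulary x dimension) matrix.

    For a one-hot input this simply selects the matching row of the matrix.
    """
    if len(code) != len(embedding_matrix):
        raise ValueError("code length must equal the number of matrix rows")
    dim = len(embedding_matrix[0])
    return [
        sum(code[i] * embedding_matrix[i][j] for i in range(len(code)))
        for j in range(dim)
    ]


# Hypothetical vocabulary of operation behaviors and a toy embedding matrix.
vocabulary = {"login": 0, "pay": 1, "logout": 2}
matrix = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
pay_vector = embed(one_hot(vocabulary["pay"], len(vocabulary)), matrix)
```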
In some optional implementations of this embodiment, the word vector model may be obtained by training through the following steps: acquiring a sample user behavior feature representation sequence set; obtaining an initial word vector model; and taking the behavior feature representations belonging to the same sample user behavior feature representation sequence in the sample user behavior feature representation sequence set as a positive sample group, and training the initial word vector model by machine learning to obtain the word vector model.
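The construction of positive sample groups can be sketched as follows. The source only states that representations from the same sequence form a positive group; restricting pairs to a context window, as skip-gram style training usually does, is an assumption, and `positive_pairs` is a hypothetical helper. Training the model on these pairs is not shown.

```python
from typing import Hashable, List, Sequence, Tuple


def positive_pairs(
    sequences: Sequence[Sequence[Hashable]], window: int = 2
) -> List[Tuple[Hashable, Hashable]]:
    """Build (center, context) pairs from items that share a sequence.

    Items from different sequences are never paired, so each pair is
    drawn from one sample user's behavior feature representation sequence.
    """
    pairs = []
    for sequence in sequences:
        for i, center in enumerate(sequence):
            low = max(0, i - window)
            high = min(len(sequence), i + window + 1)
            for j in range(low, high):
                if j != i:
                    pairs.append((center, sequence[j]))
    return pairs
```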
In some optional implementations of the present embodiment, the first generating unit 502 may be further configured to input the behavior feature representation sequence of the target user into a pre-trained classification model to obtain, as the metric value, a probability representing that the target user belongs to the preset category of users.
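The source does not say which classification model is used. Purely as an illustration, the sketch below assumes a logistic-regression style classifier applied to the mean of the sequence; `weights` and `bias` are hypothetical parameters that a real system would obtain from training, and a sequence model could equally be substituted.

```python
import math
from typing import Sequence


def predict_probability(
    sequence: Sequence[Sequence[float]], weights: Sequence[float], bias: float
) -> float:
    """Probability that the user belongs to the preset category (assumed model)."""
    if not sequence:
        raise ValueError("sequence must not be empty")
    n = len(sequence)
    # Reduce the sequence to one vector by averaging (an assumption of this sketch).
    pooled = [sum(column) / n for column in zip(*sequence)]
    score = sum(w * x for w, x in zip(weights, pooled)) + bias
    # The logistic function maps the score to a value strictly between 0 and 1.
    return 1.0 / (1.0 + math.exp(-score))
```

The returned probability can then be compared with the preset threshold in the same way as any other metric value.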
In some optional implementations of the present embodiment, the first generating unit 502 may include: a second acquisition subunit (not shown in the figure), a determination subunit (not shown in the figure). The second obtaining subunit may be configured to obtain, as the target behavior feature information, behavior feature information corresponding to a user belonging to a preset category. The determining subunit may be configured to determine, as the metric value representing that the target user belongs to the preset category, a similarity metric between the behavior feature representation sequence of the target user and the target behavior feature information.
In some optional implementations of this embodiment, the behavior feature information may include a behavior feature representation. The determining subunit may include an obtaining module (not shown in the figure), a first determining module (not shown in the figure), a generating module (not shown in the figure), and a second determining module (not shown in the figure). The obtaining module may be configured to obtain, as the reference number, the number of historical behavior feature representation sequences included in a preset historical behavior feature representation sequence set. The first determining module may be configured to determine, for a behavior feature representation in the behavior feature representation sequence of the target user, a first significance measure corresponding to the behavior feature representation. The first significance measure may be positively correlated with the reference number, and negatively correlated with the number of historical behavior feature representation sequences in the historical behavior feature representation sequence set including the historical behavior feature representation matching the behavior feature representation. The generating module may be configured to generate a user tag feature representation corresponding to the behavior feature representation sequence of the target user based on the determined first significance metric. The second determining module may be configured to determine a similarity metric between the user tag feature representation and the target behavior feature information as a metric value representing that the target user belongs to a preset category of users.
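One measure with the stated properties (increasing in the reference number, decreasing in the number of historical sequences that contain a matching representation) is a smoothed inverse document frequency. The exact formula below is an assumption rather than the formula of the disclosure, and "matching" is taken to mean equality.

```python
import math
from typing import Hashable, Sequence


def first_significance(feature: Hashable, history: Sequence[Sequence[Hashable]]) -> float:
    """Smoothed inverse-document-frequency style measure (assumed formula).

    reference_number: number of historical sequences in the preset set.
    matching: number of historical sequences containing the feature.
    """
    reference_number = len(history)
    matching = sum(1 for sequence in history if feature in sequence)
    # The +1 terms avoid division by zero and keep the measure positive.
    return math.log((1 + reference_number) / (1 + matching)) + 1.0
```

A behavior that appears in almost every historical sequence receives a low value, while a rare behavior receives a high value, so rare behaviors contribute more to the user tag feature representation.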
In some optional implementations of this embodiment, the generating module may include a first determining sub-module (not shown in the figure), a second determining sub-module (not shown in the figure), and a selecting sub-module (not shown in the figure). The first determining submodule may be configured to determine a measure of frequency of occurrence of each behavior feature representation in the behavior feature representation sequence of the target user. The second determining submodule may be configured to determine a second significance measure corresponding to the behavior feature representation in the behavior feature representation sequence of the target user. Wherein the second significance measure can be positively correlated with the first significance measure and positively correlated with the frequency measure. A selecting sub-module may be configured to select the corresponding behavior feature representation as the user tag feature representation in accordance with the determined second significance measure.
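The selection of user tag feature representations can be sketched as a TF-IDF style ranking. The product of a frequency measure and the first significance measure is one way to satisfy both positive correlations; the product form, the smoothed formula, and the `top_k` cut-off are assumptions made for illustration.

```python
import math
from collections import Counter
from typing import Hashable, List, Sequence


def select_user_tags(
    sequence: Sequence[Hashable],
    history: Sequence[Sequence[Hashable]],
    top_k: int = 2,
) -> List[Hashable]:
    """Rank the user's behavior features by frequency x first significance."""
    reference_number = len(history)
    counts = Counter(sequence)
    scores = {}
    for feature, count in counts.items():
        matching = sum(1 for past in history if feature in past)
        # First significance measure (assumed smoothed inverse document frequency).
        first = math.log((1 + reference_number) / (1 + matching)) + 1.0
        # Frequency measure: share of the user's sequence taken by this feature.
        frequency = count / len(sequence)
        # Second significance measure grows with both factors.
        scores[feature] = frequency * first
    # Highest score first; ties are broken by feature name for determinism.
    ranked = sorted(scores, key=lambda f: (-scores[f], f))
    return ranked[:top_k]
```

For a user who repeatedly performs a behavior that is rare among historical users, that behavior ranks first and becomes part of the user tag feature representation.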
In some optional implementations of this embodiment, the apparatus 500 for generating prompt information may further include a second generating unit configured to generate, based on the user tag feature representation, a user behavior tag corresponding to the behavior feature representation sequence of the target user.
In some optional implementations of this embodiment, the apparatus 500 for generating prompt information may further include: a second obtaining unit (not shown), a clustering unit (not shown), a determining unit (not shown), and a third generating unit (not shown). The second obtaining unit may be configured to obtain an associated user behavior feature representation sequence set associated with the behavior feature representation sequence of the user. The associated user behavior feature representation sequence set may include at least one behavior feature representation sequence corresponding to a user belonging to the preset category. The clustering unit may be configured to cluster the associated user behavior feature representation sequences in the obtained associated user behavior feature representation sequence set to generate a target number of clustering results. The determining unit may be configured to determine, as a target associated user behavior feature representation sequence, an associated user behavior feature representation sequence that belongs to the same clustering result as a behavior feature representation sequence corresponding to a user belonging to the preset category. The third generating unit may be configured to generate prompt information indicating whether the user corresponding to the determined target associated user behavior feature representation sequence belongs to the preset category of users.
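The determining step after clustering can be sketched as follows. The clustering itself is not shown: `cluster_of` is assumed to hold cluster labels produced by any clustering algorithm, and `users_sharing_cluster` is a hypothetical helper that picks the associated users whose sequences fall in the same clustering result as a known preset-category user.

```python
from typing import Dict, Hashable, List, Set


def users_sharing_cluster(
    cluster_of: Dict[str, Hashable], known_users: Set[str]
) -> List[str]:
    """Return associated users clustered together with known preset-category users."""
    # Clustering results that contain at least one known preset-category user.
    flagged_clusters = {cluster_of[user] for user in known_users if user in cluster_of}
    # Other users in those clusters are candidates for prompt information.
    return sorted(
        user
        for user, cluster in cluster_of.items()
        if cluster in flagged_clusters and user not in known_users
    )
```

With labels such as `{"u1": 0, "u2": 0, "u3": 1}` and `u1` known to belong to the preset category, only `u2` is returned, and prompt information would be generated for that user.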
In some optional implementations of this embodiment, the apparatus 500 for generating prompt information may further include an adding unit configured to add the identification information of the target user to a preset category user information set in response to receiving information confirming that the target user indicated by the prompt information belongs to the preset category of users.
In some optional implementations of this embodiment, the apparatus 500 for generating a prompt message may further include: a third acquiring unit (not shown), a fourth generating unit (not shown), and a fifth generating unit (not shown). The third obtaining unit may be configured to obtain a behavior feature representation sequence set corresponding to a preset category user information set. The fourth generating unit may be configured to generate a user tag feature representation corresponding to the behavior feature representation sequence in the behavior feature representation sequence set. The fifth generating unit may be configured to generate key behavior feature information corresponding to a preset category of users.
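The source does not state how the key behavior feature information is derived from the user tag feature representations. One simple possibility, shown here only as an assumption, is to keep the tags shared by the most users in the preset category user information set; `key_behavior_features` and `top_n` are hypothetical.

```python
from collections import Counter
from typing import Hashable, Iterable, List


def key_behavior_features(
    user_tags: Iterable[Iterable[Hashable]], top_n: int = 3
) -> List[Hashable]:
    """Return the tags that occur for the largest number of confirmed users."""
    # Count each tag once per user so a single user cannot dominate the result.
    counts = Counter(tag for tags in user_tags for tag in set(tags))
    # Most widely shared tags first; ties are broken by tag name.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [tag for tag, _ in ranked[:top_n]]
```

As more confirmed users are added to the set, the result can be recomputed so that the reference information used for similarity comparison follows the confirmed cases.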
The apparatus provided by the above embodiment of the present disclosure acquires the behavior feature representation sequence of the target user through the first acquiring unit 501. The behavior feature representation in the behavior feature representation sequence is generated based on operation data of a target user in a preset time period, and the operation data comprises data representing operation types and operation attributes. Then, the first generating unit 502 generates a metric value representing that the target user belongs to a preset category user according to the behavior feature representation sequence of the target user. Finally, the prompt unit 503 generates prompt information for representing that the target user belongs to the preset category user in response to the comparison between the determined metric value and the preset threshold value. Therefore, the effective utilization of the operation data is realized, and the detail information such as time sequence, operation behavior relevance and the like contained in the operation data is more comprehensively embodied, so that a technical basis is provided for improving the identification degree of the abnormal user.
Referring now to FIG. 6, a schematic diagram of an electronic device (e.g., the server of FIG. 1) 600 suitable for use in implementing embodiments of the present disclosure is shown. The terminal device in the embodiments of the present disclosure may include, but is not limited to, a mobile terminal such as a mobile phone, a notebook computer, and the like, and a stationary terminal such as a digital TV, a desktop computer, and the like. The server shown in fig. 6 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present disclosure.
As shown in fig. 6, electronic device 600 may include a processing device (e.g., central processing unit, graphics processor, etc.) 601 that may perform various appropriate actions and processes in accordance with a program stored in a Read Only Memory (ROM) 602 or a program loaded from a storage device 608 into a Random Access Memory (RAM) 603. In the RAM 603, various programs and data necessary for the operation of the electronic device 600 are also stored. The processing device 601, the ROM 602, and the RAM 603 are connected to each other via a bus 604. An input/output (I/O) interface 605 is also connected to the bus 604.
Generally, the following devices may be connected to the I/O interface 605: input devices 606 including, for example, a touch screen, touch pad, keyboard, mouse, camera, microphone, accelerometer, gyroscope, etc.; an output device 607 including, for example, a Liquid Crystal Display (LCD), a speaker, a vibrator, and the like; storage 608 including, for example, tape, hard disk, etc.; and a communication device 609. The communication means 609 may allow the electronic device 600 to communicate with other devices wirelessly or by wire to exchange data. While fig. 6 illustrates an electronic device 600 having various means, it is to be understood that not all illustrated means are required to be implemented or provided. More or fewer devices may alternatively be implemented or provided. Each block shown in fig. 6 may represent one device or may represent multiple devices as desired.
In particular, according to an embodiment of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method illustrated in the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network via the communication means 609, or may be installed from the storage means 608, or may be installed from the ROM 602. The computer program, when executed by the processing device 601, performs the above-described functions defined in the methods of embodiments of the present disclosure.
It should be noted that the computer readable medium described in the embodiments of the present disclosure may be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In embodiments of the disclosure, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In embodiments of the present disclosure, however, a computer readable signal medium may comprise a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. 
Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: electrical wires, optical cables, RF (Radio Frequency), etc., or any suitable combination of the foregoing.
The computer readable medium may be embodied in the server; or may exist separately and not be assembled into the server. The computer readable medium carries one or more programs which, when executed by the server, cause the server to: acquiring a behavior feature representation sequence of a target user, wherein behavior feature representations in the behavior feature representation sequence are generated based on operation data of the target user in a preset time period, and the operation data comprise data representing operation types and operation attributes; generating a metric value representing that the target user belongs to a preset category user according to the behavior characteristic representation sequence of the target user; and generating prompt information for representing that the target user belongs to a preset category user in response to the comparison between the determined metric value and the preset threshold value.
Computer program code for carrying out operations for embodiments of the present disclosure may be written in any combination of one or more programming languages, including object oriented programming languages such as Java, Smalltalk, and C++, and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units described in the embodiments of the present disclosure may be implemented by software or hardware. The described units may also be provided in a processor, and may be described as: a processor comprises a first acquisition unit, a first generation unit and a prompt unit. Where the names of these units do not in some cases constitute a limitation on the unit itself, for example, the first acquisition unit may also be described as a "unit that acquires a behavior feature representation sequence of a target user, where behavior feature representations in the behavior feature representation sequence are generated based on operation data of the target user over a preset time period, the operation data including data characterizing an operation type and an operation attribute".
The foregoing description is only exemplary of the preferred embodiments of the disclosure and is illustrative of the principles of the technology employed. It will be appreciated by those skilled in the art that the scope of the invention in the embodiments of the present disclosure is not limited to the specific combination of the above-mentioned features, but also encompasses other embodiments in which any combination of the above-mentioned features or their equivalents is made without departing from the inventive concept as defined above, for example, technical solutions formed by replacing the above features with (but not limited to) technical features having similar functions disclosed in the embodiments of the present disclosure.

Claims (15)

1. A method for generating prompt information, comprising:
acquiring a behavior feature representation sequence of a target user, wherein behavior feature representations in the behavior feature representation sequence are generated based on operation data of the target user in a preset time period, and the operation data comprise data representing operation types and operation attributes;
generating a metric value representing that the target user belongs to a preset category user according to the behavior feature representation sequence of the target user;
and generating prompt information for representing that the target user belongs to a preset category of users in response to the comparison of the metric value and a preset threshold value.
2. The method of claim 1, wherein the obtaining a behavior feature representation sequence of a target user comprises:
acquiring operation data of the target user in a preset time period;
extracting operation data belonging to a preset category from the operation data to generate a target operation data set;
generating behavior characteristic representations corresponding to the operation behaviors based on the operation data corresponding to the same operation behavior in the target operation data set;
and generating a behavior feature representation sequence of the target user according to the time order of the operations corresponding to the generated behavior feature representations.
3. The method of claim 2, wherein the generating behavior feature representations corresponding to the respective operation behaviors based on the operation data corresponding to the same operation behavior in the target operation data set comprises:
generating one-hot codes corresponding to the operation behaviors according to the operation behaviors corresponding to the target operation data in the target operation data set;
and generating, from the generated one-hot codes, a behavior feature representation corresponding to each one-hot code by using a pre-trained word vector model.
4. The method of claim 3, wherein the word vector model is trained by:
acquiring a sample user behavior feature representation sequence set;
obtaining an initial word vector model;
and taking the behavior feature representations belonging to the same sample user behavior feature representation sequence in the sample user behavior feature representation sequence set as a positive sample group, and training the initial word vector model by machine learning to obtain the word vector model.
5. The method according to claim 1, wherein the generating a metric value representing that the target user belongs to a preset category of users according to the behavior feature representation sequence of the target user comprises:
and inputting the behavior characteristic representation sequence of the target user into a pre-trained classification model to obtain the probability for representing that the target user belongs to a preset class user as the metric value.
6. The method according to claim 1, wherein the generating a metric value representing that the target user belongs to a preset category of users according to the behavior feature representation sequence of the target user comprises:
acquiring behavior characteristic information corresponding to the users belonging to the preset category as target behavior characteristic information;
and determining similarity measurement between the behavior characteristic representation sequence of the target user and the target behavior characteristic information as the measurement value representing that the target user belongs to a preset category of users.
7. The method of claim 6, wherein the behavior feature information comprises a behavior feature representation; and
the determining, as the metric value representing that the target user belongs to a preset category of users, a similarity metric between the behavior feature representation sequence of the target user and the target behavior feature information includes:
acquiring the number of historical behavior characteristic representation sequences included in a preset historical behavior characteristic representation sequence set as a reference number;
for the behavior feature representation in the behavior feature representation sequence of the target user, determining a first significance metric corresponding to the behavior feature representation, wherein the first significance metric is positively correlated with the reference number and negatively correlated with the number of historical behavior feature representation sequences including the historical behavior feature representation matched with the behavior feature representation in the historical behavior feature representation sequence set;
generating a user tag feature representation corresponding to the behavior feature representation sequence of the target user based on the determined first significance metric;
and determining similarity measurement between the user tag feature representation and the target behavior feature information as the measurement value representing that the target user belongs to a preset category of users.
8. The method of claim 7, wherein the generating a user tag feature representation corresponding to the sequence of behavioral feature representations of the target user based on the determined first significance metric comprises:
determining a frequency metric of occurrence of each behavior feature representation in the behavior feature representation sequence of the target user;
determining a second significance measure corresponding to the behavior feature representation in the behavior feature representation sequence of the target user, wherein the second significance measure is positively correlated with the first significance measure and positively correlated with the frequency measure;
and selecting the corresponding behavior characteristic representation as the user label characteristic representation according to the determined second significance measure.
9. The method of claim 6, wherein the method further comprises:
and generating a user behavior label corresponding to the behavior feature representation sequence of the target user based on the user label feature representation.
10. The method of claim 1, wherein the method further comprises:
acquiring an associated user behavior feature representation sequence set associated with the behavior feature representation sequence of the user, wherein the associated user behavior feature representation sequence set comprises at least one behavior feature representation sequence corresponding to a user belonging to the preset category of users;
clustering the obtained associated user behavior feature representation sequences in the associated user behavior feature representation sequence set to generate a target number of clustering results;
determining, as a target associated user behavior feature representation sequence, an associated user behavior feature representation sequence that belongs to the same clustering result as the behavior feature representation sequence corresponding to the user belonging to the preset category of users;
and generating prompt information representing whether the user corresponding to the target associated user behavior characteristic representation sequence belongs to a preset category user.
11. The method according to one of claims 1-10, the method further comprising:
and adding the identification information of the target user into a preset category user information set in response to receiving information representing that the target user indicated by the prompt information belongs to a preset category user.
12. The method of claim 11, further comprising:
acquiring a behavior characteristic representation sequence set corresponding to the preset category user information set;
generating user label characteristic representation corresponding to the behavior characteristic representation sequence in the behavior characteristic representation sequence set;
and generating key behavior characteristic information corresponding to the preset category users.
13. An apparatus for generating prompt information, comprising:
a first acquisition unit configured to acquire a behavior feature representation sequence of a target user, wherein behavior feature representations in the behavior feature representation sequence are generated based on operation data of the target user in a preset time period, and the operation data comprises data representing an operation type and an operation attribute;
a first generation unit configured to generate a metric value representing that the target user belongs to a preset category of users according to the behavior feature representation sequence of the target user;
a prompt unit configured to generate prompt information characterizing that the target user belongs to the preset category of users in response to the comparison of the metric value and a preset threshold value.
14. A server, comprising:
one or more processors;
a storage device having one or more programs stored thereon;
the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of any one of claims 1-12.
15. A computer-readable medium, on which a computer program is stored which, when being executed by a processor, carries out the method according to any one of claims 1-12.
CN202010893051.7A 2020-08-31 2020-08-31 Method, device, server and medium for generating prompt information Active CN113780318B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010893051.7A CN113780318B (en) 2020-08-31 2020-08-31 Method, device, server and medium for generating prompt information

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010893051.7A CN113780318B (en) 2020-08-31 2020-08-31 Method, device, server and medium for generating prompt information

Publications (2)

Publication Number Publication Date
CN113780318A true CN113780318A (en) 2021-12-10
CN113780318B CN113780318B (en) 2024-04-16

Family

ID=78835242

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010893051.7A Active CN113780318B (en) 2020-08-31 2020-08-31 Method, device, server and medium for generating prompt information

Country Status (1)

Country Link
CN (1) CN113780318B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114885183A (en) * 2022-04-21 2022-08-09 武汉斗鱼鱼乐网络科技有限公司 Method, device, medium and equipment for identifying gift package risk user

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109388548A (en) * 2018-09-29 2019-02-26 北京京东金融科技控股有限公司 Method and apparatus for generating information
CN109410036A (en) * 2018-10-09 2019-03-01 北京芯盾时代科技有限公司 A kind of fraud detection model training method and device and fraud detection method and device
WO2019052283A1 (en) * 2017-09-13 2019-03-21 乐蜜有限公司 Fraud prevention method, operation detection method and apparatus, and electronic device
US20190102276A1 (en) * 2017-10-04 2019-04-04 Servicenow, Inc. Systems and methods for robust anomaly detection
CN110798440A (en) * 2019-08-13 2020-02-14 腾讯科技(深圳)有限公司 Abnormal user detection method, device and system and computer storage medium
CN111177725A (en) * 2019-12-31 2020-05-19 广州市百果园信息技术有限公司 Method, device, equipment and storage medium for detecting malicious click operation
US20200228552A1 (en) * 2019-01-14 2020-07-16 Penta Security Systems Inc. Method and apparatus for detecting abnormal behavior of groupware user

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
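刘绍海; 刘青昆; 安娜; 顾跃举: "A New Intrusion Detection Method Based on Clustering Algorithms and Sequence Anomaly Techniques" (基于聚类算法与序列异常技术的入侵检测新方法), 计算机安全 (Computer Security), no. 08 *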

Also Published As

Publication number Publication date
CN113780318B (en) 2024-04-16

Similar Documents

Publication Publication Date Title
US10558984B2 (en) Method, apparatus and server for identifying risky user
US20190147063A1 (en) Method and apparatus for generating information
CN110442712B (en) Risk determination method, risk determination device, server and text examination system
US20180284971A1 (en) Intelligent visual object management system
CN111371767B (en) Malicious account identification method, malicious account identification device, medium and electronic device
US11200241B2 (en) Search query enhancement with context analysis
CN111104590A (en) Information recommendation method, device, medium and electronic equipment
CN111429214B (en) Transaction data-based buyer and seller matching method and device
CN113656699B (en) User feature vector determining method, related equipment and medium
US20230281696A1 (en) Method and apparatus for detecting false transaction order
CN112995414B (en) Behavior quality inspection method, device, equipment and storage medium based on voice call
CN113780318B (en) Method, device, server and medium for generating prompt information
CN110674300B (en) Method and apparatus for generating information
CN111787042B (en) Method and device for pushing information
CN116186541A (en) Training method and device for recommendation model
CN114265989A (en) Friend recommendation method, electronic device and computer-readable storage medium
CN114493850A (en) Artificial intelligence-based online notarization method, system and storage medium
CN114066603A (en) Post-loan risk early warning method and device, electronic equipment and computer readable medium
CN113807920A (en) Artificial intelligence based product recommendation method, device, equipment and storage medium
CN111127057B (en) Multi-dimensional user portrait recovery method
CN107368597B (en) Information output method and device
CN116911304B (en) Text recommendation method and device
CN116821475B (en) Video recommendation method and device based on client data and computer equipment
CN116911912B (en) Method and device for predicting interaction objects and interaction results
CN113508371B (en) System and method for improving computer identification

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant