CN113780318B - Method, device, server and medium for generating prompt information - Google Patents


Info

Publication number
CN113780318B
Authority
CN
China
Prior art keywords
user
behavior
behavior feature
feature representation
target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010893051.7A
Other languages
Chinese (zh)
Other versions
CN113780318A (en)
Inventor
郭思文
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jingdong Technology Holding Co Ltd
Original Assignee
Jingdong Technology Holding Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jingdong Technology Holding Co Ltd
Priority to CN202010893051.7A
Publication of CN113780318A
Application granted
Publication of CN113780318B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/34 Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment
    • G06F11/3438 Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment; monitoring of user actions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computer Hardware Design (AREA)
  • Quality & Reliability (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

Embodiments of the present disclosure disclose a method, a device, a server and a medium for generating prompt information. One embodiment of the method comprises: acquiring a behavior feature representation sequence of a target user, wherein the behavior feature representations in the sequence are generated based on operation data of the target user within a preset time period, and the operation data comprise data characterizing operation types and operation attributes; generating, according to the behavior feature representation sequence of the target user, a metric value characterizing whether the target user belongs to a preset category of users; and, based on a comparison of the metric value with a preset threshold, generating prompt information indicating that the target user belongs to the preset category of users. This embodiment makes effective use of the operation data and more fully reflects detailed information contained in it, such as time ordering and the correlation between operation behaviors, thereby providing a technical basis for improving the recognition of abnormal users.

Description

Method, device, server and medium for generating prompt information
Technical Field
Embodiments of the present disclosure relate to the field of computer technology, and in particular, to a method, an apparatus, a server, and a medium for generating prompt information.
Background
With the development of internet technology, existing internet anti-fraud techniques mainly include monitoring of data indicators, detection based on numerical statistical features, and abnormal behavior detection based on user click behavior sequences.
However, the prior art uses only the statistical features of user operation data and users' click behaviors, so the recognition of abnormal users still needs improvement in scenarios with highly correlated operations, high concentration in time, and strong group behavior (such as marketing anti-fraud).
Disclosure of Invention
The embodiment of the disclosure provides a method, a device, a server and a medium for generating prompt information.
In a first aspect, embodiments of the present disclosure provide a method for generating prompt information, the method comprising: acquiring a behavior feature representation sequence of a target user, wherein the behavior feature representations in the sequence are generated based on operation data of the target user within a preset time period, and the operation data comprise data characterizing operation types and operation attributes; generating, according to the behavior feature representation sequence of the target user, a metric value characterizing whether the target user belongs to a preset category of users; and, based on a comparison of the metric value with a preset threshold, generating prompt information indicating that the target user belongs to the preset category of users.
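The final step of the first aspect, turning a metric value into prompt information, can be illustrated with a minimal sketch. The function name `generate_prompt`, the message wording, and the choice of "greater than or equal to" as the comparison are assumptions made for illustration; the source only states that the metric value is compared with a preset threshold.

```python
from typing import Optional


def generate_prompt(user_id: str, metric: float, threshold: float) -> Optional[str]:
    """Return prompt text when the metric reaches the threshold, otherwise None.

    Assumption: a larger metric means the user is more likely to belong to
    the preset category, so the comparison used here is `metric >= threshold`.
    """
    if metric >= threshold:
        return f"user {user_id} may belong to the preset category (metric={metric:.2f})"
    return None


# Hypothetical usage: 0.92 could be a probability or a similarity score.
print(generate_prompt("u1", 0.92, 0.80))
print(generate_prompt("u2", 0.35, 0.80))
```

Returning `None` when the threshold is not reached keeps the "no prompt" case explicit for the caller.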
In some embodiments, the obtaining the behavior feature representation sequence of the target user includes: acquiring operation data of a target user in a preset time period; extracting operation data belonging to a preset category from the operation data to generate a target operation data set; generating behavior characteristic representations corresponding to all operation behaviors based on operation data corresponding to the same operation behavior in a target operation data set; and generating a behavior characteristic representation sequence of the target user according to the time sequence of the corresponding operation of the generated behavior characteristic representation.
In some embodiments, generating the behavior feature representation corresponding to each operation behavior based on the operation data corresponding to the same operation behavior in the target operation data set includes: generating a one-hot encoding for each operation behavior according to the operation behavior corresponding to the target operation data in the target operation data set; and generating, from the one-hot encodings, the behavior feature representation corresponding to each one-hot encoding by using a pre-trained word vector model.
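As a rough illustration of this embodiment, the sketch below one-hot encodes operation behaviors and maps each encoding to a dense vector by multiplying it with an embedding matrix. The behavior tokens and the tiny hand-written matrix are hypothetical stand-ins for the output of a pre-trained word vector model, which the source does not specify further.

```python
from typing import Dict, List


def build_vocab(behaviors: List[str]) -> Dict[str, int]:
    """Assign each distinct operation behavior an index, in order of first appearance."""
    vocab: Dict[str, int] = {}
    for behavior in behaviors:
        if behavior not in vocab:
            vocab[behavior] = len(vocab)
    return vocab


def one_hot(index: int, size: int) -> List[float]:
    """Return a vector of length `size` with a single 1.0 at position `index`."""
    return [1.0 if i == index else 0.0 for i in range(size)]


def embed(one_hot_vec: List[float], embedding_matrix: List[List[float]]) -> List[float]:
    """Vector-matrix product; for a one-hot input this selects one matrix row."""
    dim = len(embedding_matrix[0])
    return [
        sum(one_hot_vec[i] * embedding_matrix[i][d] for i in range(len(one_hot_vec)))
        for d in range(dim)
    ]


# Hypothetical behaviors: operation type and attribute fused into one token.
behaviors = ["recharge:100", "get_coupon:20", "recharge:100"]
vocab = build_vocab(behaviors)
# Stand-in for a pre-trained embedding matrix (one row per behavior).
matrix = [[0.1, 0.2], [0.3, 0.4]]
sequence = [embed(one_hot(vocab[b], len(vocab)), matrix) for b in behaviors]
print(sequence)
```

In practice the product with a one-hot vector is usually implemented as a direct row lookup; the explicit multiplication is kept here only to mirror the two steps named in the text.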
In some embodiments, the word vector model is trained by: acquiring a set of sample user behavior feature representation sequences; acquiring an initial word vector model; and training the initial word vector model by machine learning, taking behavior feature representations that belong to the same sample user behavior feature representation sequence in the set as a positive sample group, to obtain the word vector model.
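One way to picture the positive sample groups is as co-occurrence pairs drawn from within a single user's sequence. The sketch below only builds such pairs; it does not train a model, and the pairing scheme (all unordered pairs of distinct behaviors in a sequence) is an assumption, since the source does not say how a positive sample group is consumed during training.

```python
from itertools import combinations
from typing import List, Tuple


def positive_pairs(sequences: List[List[str]]) -> List[Tuple[str, str]]:
    """Pair up distinct behaviors that occur in the same user's sequence.

    Behaviors from different sequences are never paired, so every pair
    comes from one positive sample group.
    """
    pairs: List[Tuple[str, str]] = []
    for seq in sequences:
        distinct = sorted(set(seq))  # sorted for a deterministic pair order
        pairs.extend(combinations(distinct, 2))
    return pairs


# Hypothetical sample sequences for two users.
samples = [
    ["login", "recharge", "get_coupon"],
    ["login", "transfer"],
]
print(positive_pairs(samples))
```

A skip-gram style trainer would typically restrict pairs to a sliding context window instead of the whole sequence; that refinement is omitted here.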
In some embodiments, the generating, according to the behavior feature representation sequence of the target user, a metric value characterizing that the target user belongs to a preset class of users includes: and inputting the behavior characteristic representation sequence of the target user into a pre-trained classification model to obtain probability for representing that the target user belongs to a preset class user as a metric value.
In some embodiments, the generating, according to the behavior feature representation sequence of the target user, a metric value characterizing that the target user belongs to a preset class of users includes: acquiring behavior characteristic information corresponding to a user belonging to a preset category as target behavior characteristic information; and determining similarity measurement between the behavior characteristic representation sequence of the target user and the target behavior characteristic information as a measurement value for representing that the target user belongs to a preset class of users.
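A similarity-based metric of this kind could be sketched as follows. Cosine similarity and mean pooling of the sequence into one vector are illustrative assumptions; the source does not name a particular similarity measure or pooling method, and mean pooling discards the time order of the sequence.

```python
import math
from typing import List


def mean_pool(sequence: List[List[float]]) -> List[float]:
    """Average the vectors of a sequence into a single vector."""
    n = len(sequence)
    dim = len(sequence[0])
    return [sum(vec[d] for vec in sequence) / n for d in range(dim)]


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between a and b; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical vectors: a target user's sequence and a preset-category profile.
user_sequence = [[1.0, 0.0], [3.0, 2.0]]
preset_profile = [2.0, 1.0]
metric = cosine_similarity(mean_pool(user_sequence), preset_profile)
print(metric)
```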
In some embodiments, the behavioral characteristic information includes a behavioral characteristic representation; and determining the similarity measure between the behavior feature expression sequence of the target user and the target behavior feature information as a measure value for representing that the target user belongs to a preset class of users, wherein the method comprises the following steps: acquiring the number of the historical behavior feature representation sequences included in a preset historical behavior feature representation sequence set as a reference number; for behavior feature representations in a behavior feature representation sequence of a target user, determining a first saliency measure corresponding to the behavior feature representation, wherein the first saliency measure is positively correlated with a reference number, and is negatively correlated with the number of historical behavior feature representation sequences comprising historical behavior feature representations matched with the behavior feature representation in a historical behavior feature representation sequence set; generating a user tag feature representation corresponding to the behavior feature representation sequence of the target user based on the determined first significance measure; and determining similarity measurement between the user tag characteristic representation and the target behavior characteristic information as a measurement value for representing that the target user belongs to a preset class of users.
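The first saliency measure described above has the same monotonic behavior as an inverse document frequency (IDF). The sketch below uses a smoothed logarithmic form as one possible instance; the exact formula is an assumption, because the source only requires a positive correlation with the reference number and a negative correlation with the number of historical sequences containing a matching representation. Matching is simplified here to equality of behavior tokens.

```python
import math
from typing import List


def first_saliency(behavior: str, history: List[List[str]]) -> float:
    """IDF-like saliency: rarer behaviors across historical sequences score higher."""
    reference_number = len(history)  # number of historical sequences
    containing = sum(1 for seq in history if behavior in seq)
    # Add-one smoothing keeps the ratio finite when no sequence matches.
    return math.log((1 + reference_number) / (1 + containing))


# Hypothetical historical sequences.
history = [
    ["login", "recharge"],
    ["login"],
    ["login", "get_coupon"],
]
print(first_saliency("login", history))     # appears in every sequence
print(first_saliency("recharge", history))  # appears in one sequence
```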
In some embodiments, generating the user tag feature representation corresponding to the behavior feature representation sequence of the target user based on the determined first saliency measure includes: determining a frequency measure of occurrence of each behavior feature representation in the behavior feature representation sequence of the target user; determining a second saliency measure corresponding to the behavior feature representation in the behavior feature representation sequence of the target user, wherein the second saliency measure is positively correlated with the first saliency measure and is positively correlated with the frequency measure; and selecting the corresponding behavior characteristic representation as a user tag characteristic representation according to the determined second significance measure.
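Combining a frequency measure with the first saliency measure resembles TF-IDF weighting. The following sketch multiplies the two and keeps the top-scoring behaviors as the user tag features; the product form, the smoothed logarithm, and the `top_k` parameter are assumptions, since the source only requires the second measure to be positively correlated with both inputs.

```python
import math
from collections import Counter
from typing import Dict, List


def tag_features(target_seq: List[str], history: List[List[str]], top_k: int = 1) -> List[str]:
    """Rank behaviors in the target sequence by frequency times IDF-like saliency."""
    counts = Counter(target_seq)
    reference_number = len(history)
    scores: Dict[str, float] = {}
    for behavior, count in counts.items():
        frequency = count / len(target_seq)  # frequency measure
        containing = sum(1 for seq in history if behavior in seq)
        first = math.log((1 + reference_number) / (1 + containing))  # first saliency
        scores[behavior] = frequency * first  # second saliency
    # Highest score first; ties broken alphabetically for determinism.
    ranked = sorted(scores, key=lambda b: (-scores[b], b))
    return ranked[:top_k]


# Hypothetical data: the target user claims coupons far more often than usual.
history = [["login", "recharge"], ["login"], ["login", "get_coupon"]]
target = ["login", "get_coupon", "get_coupon", "get_coupon"]
print(tag_features(target, history))
```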
In some embodiments, the method further comprises: and generating a user behavior label corresponding to the behavior characteristic representation sequence of the target user based on the user label characteristic representation.
In some embodiments, the method further comprises: acquiring a set of associated user behavior feature representation sequences associated with the user behavior feature representation sequence, wherein the set includes at least one behavior feature representation sequence corresponding to a user belonging to the preset category of users; clustering the associated user behavior feature representation sequences in the acquired set to generate a target number of clustering results; determining, as target associated user behavior feature representation sequences, the associated user behavior feature representation sequences that belong to the same clustering result as a behavior feature representation sequence corresponding to a user belonging to the preset category of users; and generating prompt information indicating whether the users corresponding to the determined target associated user behavior feature representation sequences belong to the preset category of users.
In some embodiments, the method further comprises: in response to receiving information confirming that the target user indicated by the prompt information belongs to the preset category of users, adding identification information of the target user to a preset category user information set.
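The clustering embodiment can be sketched with a small k-means routine: associated users are clustered, and users who share a cluster with a known preset-category user are flagged. K-means, the fixed-length vector per user, and the deterministic initialization are all assumptions for illustration; the source does not name a clustering algorithm.

```python
from typing import List, Set


def kmeans(points: List[List[float]], k: int, iterations: int = 10) -> List[int]:
    """Plain k-means; the first k points serve as initial centroids (an assumption)."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assign each point to its nearest centroid (squared Euclidean distance).
        for i, p in enumerate(points):
            labels[i] = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
        # Move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def flag_cluster_mates(user_ids: List[str], points: List[List[float]],
                       known: Set[str], k: int) -> Set[str]:
    """Return users that share a cluster with a known preset-category user."""
    labels = kmeans(points, k)
    flagged = {label for uid, label in zip(user_ids, labels) if uid in known}
    return {uid for uid, label in zip(user_ids, labels)
            if label in flagged and uid not in known}


# Hypothetical pooled vectors for four associated users; "a" is a known case.
ids = ["a", "b", "c", "d"]
vectors = [[0.0, 0.0], [10.0, 10.0], [0.1, 0.2], [10.1, 9.9]]
print(flag_cluster_mates(ids, vectors, {"a"}, k=2))
```

The flagged users are candidates for a prompt, not confirmed members of the preset category.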
In some embodiments, the method further comprises: acquiring a behavior characteristic representation sequence set corresponding to a preset category user information set; generating a user tag characteristic representation corresponding to the behavior characteristic representation sequence in the behavior characteristic representation sequence set; and generating key behavior characteristic information corresponding to the preset category user.
In a second aspect, embodiments of the present disclosure provide an apparatus for generating hint information, the apparatus comprising: a first acquisition unit configured to acquire a behavior feature representation sequence of a target user, wherein behavior feature representations in the behavior feature representation sequence are generated based on operation data of the target user in a preset time period, the operation data including data characterizing an operation type and an operation attribute; the first generation unit is configured to generate a measurement value representing that the target user belongs to a preset class user according to the behavior characteristic representation sequence of the target user; and the prompting unit is configured to generate prompting information for representing that the target user belongs to the preset category user in response to the comparison of the determined metric value and the preset threshold value.
In some embodiments, the first obtaining unit includes: a first acquisition subunit configured to acquire operation data of a target user in a preset period of time; an extraction subunit configured to extract operation data belonging to a preset category from the operation data, and generate a target operation data set; the first generation subunit is configured to generate behavior feature representations corresponding to the operation behaviors based on the operation data corresponding to the same operation behavior in the target operation data set; and a second generation subunit configured to generate a behavior feature representation sequence of the target user in accordance with the generated time order of the behavior feature representation corresponding operation.
In some embodiments, the first generating subunit includes: a first generation module configured to generate a one-hot encoding for each operation behavior according to the operation behavior corresponding to the target operation data in the target operation data set; and a second generation module configured to generate, from the one-hot encodings, the behavior feature representation corresponding to each one-hot encoding by using a pre-trained word vector model.
In some embodiments, the word vector model is trained by: acquiring a sample user behavior characteristic representation sequence set; acquiring an initial word vector model; and training an initial word vector model by using a machine learning mode to obtain a word vector model by taking the behavior feature representations belonging to the same sample user behavior feature sequence in the sample user behavior feature representation sequence set as a positive sample group.
In some embodiments, the first generating unit is further configured to: and inputting the behavior characteristic representation sequence of the target user into a pre-trained classification model to obtain probability for representing that the target user belongs to a preset class user as a metric value.
In some embodiments, the first generating unit includes: the second acquisition subunit is configured to acquire behavior feature information corresponding to the user belonging to the preset category as target behavior feature information; and the determining subunit is configured to determine a similarity measure between the behavior characteristic representation sequence of the target user and the target behavior characteristic information as a measure value for representing that the target user belongs to the preset category of users.
In some embodiments, the behavioral characteristic information includes a behavioral characteristic representation; the determining subunit includes: the acquisition module is configured to acquire the number of the historical behavior feature representation sequences included in the preset historical behavior feature representation sequence set as a reference number; a first determining module configured to determine, for a behavioral feature representation in a sequence of behavioral feature representations of a target user, a first saliency measure corresponding to the behavioral feature representation, wherein the first saliency measure is positively correlated with a reference number and negatively correlated with a number of historical behavioral feature representation sequences in a set of historical behavioral feature representation sequences that include historical behavioral feature representations that match the behavioral feature representation; a generation module configured to generate a user tag feature representation corresponding to the behavior feature representation sequence of the target user based on the determined first saliency measure; and the second determining module is configured to determine a similarity measure between the user tag characteristic representation and the target behavior characteristic information as a measure value for representing that the target user belongs to the preset category of users.
In some embodiments, the generating module includes: a first determination submodule configured to determine a frequency measure in which each behavioral characteristic representation of the target user occurs in the behavioral characteristic representation sequence of the target user; a second determination submodule configured to determine a second saliency measure corresponding to the behavioral feature representations in the behavioral feature representation sequence of the target user, wherein the second saliency measure is positively correlated with the first saliency measure and is positively correlated with the frequency measure; and a selection sub-module configured to select a corresponding behavior feature representation as a user tag feature representation according to the determined second saliency measure.
In some embodiments, the apparatus further comprises: and the second generation unit is configured to generate a user behavior label corresponding to the behavior characteristic representation sequence of the target user based on the user label characteristic representation.
In some embodiments, the apparatus further comprises: a second acquisition unit configured to acquire a set of associated user behavior feature representation sequences associated with the user behavior feature representation sequence, wherein the set includes at least one behavior feature representation sequence corresponding to a user belonging to the preset category of users; a clustering unit configured to cluster the associated user behavior feature representation sequences in the acquired set to generate a target number of clustering results; a determining unit configured to determine, as target associated user behavior feature representation sequences, the associated user behavior feature representation sequences that belong to the same clustering result as a behavior feature representation sequence corresponding to a user belonging to the preset category of users; and a third generation unit configured to generate prompt information indicating whether the users corresponding to the determined target associated user behavior feature representation sequences belong to the preset category of users.
In some embodiments, the apparatus further comprises: an adding unit configured to, in response to receiving information confirming that the target user indicated by the prompt information belongs to the preset category of users, add identification information of the target user to a preset category user information set.
In some embodiments, the apparatus further comprises: a third acquisition unit configured to acquire a behavior feature representation sequence set corresponding to a preset category user information set; a fourth generating unit configured to generate a user tag feature representation corresponding to the behavior feature representation sequence in the behavior feature representation sequence set; and the fifth generation unit is configured to generate key behavior feature information corresponding to the preset category of users.
In a third aspect, embodiments of the present disclosure provide a server comprising: one or more processors; a storage device having one or more programs stored thereon; the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method as described in any of the implementations of the first aspect.
In a fourth aspect, embodiments of the present disclosure provide a computer readable medium having stored thereon a computer program which, when executed by a processor, implements a method as described in any of the implementations of the first aspect.
With the method, device, server and medium for generating prompt information provided by the embodiments of the present disclosure, a behavior feature representation sequence generated from operation data of different operation types and operation attributes within a preset time period is acquired, which makes effective use of the operation data and more fully reflects detailed information contained in it, such as time ordering and the correlation between operation behaviors. Prompt information indicating that the target user belongs to the preset category of users is then generated according to the metric value obtained from the behavior feature representation sequence, which improves the recognition of abnormal users in scenarios characterized by continuous operation behaviors.
Drawings
Other features, objects and advantages of the present disclosure will become more apparent upon reading of the detailed description of non-limiting embodiments, made with reference to the following drawings:
FIG. 1 is an exemplary system architecture diagram in which an embodiment of the present disclosure may be applied;
FIG. 2 is a flow chart of one embodiment of a method for generating prompt information according to the present disclosure;
FIG. 3 is a schematic diagram of one application scenario of a method for generating prompt information in accordance with an embodiment of the present disclosure;
FIG. 4 is a flow chart of yet another embodiment of a method for generating prompt information according to the present disclosure;
FIG. 5 is a schematic structural diagram of one embodiment of an apparatus for generating prompt information according to the present disclosure;
FIG. 6 is a schematic structural diagram of an electronic device suitable for use in implementing embodiments of the present disclosure.
Detailed Description
The present disclosure is described in further detail below with reference to the drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the invention and are not limiting of the invention. It should be noted that, for convenience of description, only the portions related to the present invention are shown in the drawings.
It should be noted that, without conflict, the embodiments of the present disclosure and features of the embodiments may be combined with each other. The present disclosure will be described in detail below with reference to the accompanying drawings in conjunction with embodiments.
FIG. 1 illustrates an exemplary architecture 100 to which the method for generating prompt information or the apparatus for generating prompt information of the present disclosure may be applied.
As shown in fig. 1, a system architecture 100 may include terminal devices 101, 102, 103, a network 104, and a server 105. The network 104 is used as a medium to provide communication links between the terminal devices 101, 102, 103 and the server 105. The network 104 may include various connection types, such as wired, wireless communication links, or fiber optic cables, among others.
The terminal devices 101, 102, 103 interact with the server 105 via the network 104 to receive or send messages or the like. Various communication client applications, such as a web browser application, a shopping class application, a search class application, an instant messaging tool, a mailbox client, etc., may be installed on the terminal devices 101, 102, 103.
The terminal devices 101, 102, 103 may be hardware or software. When the terminal devices 101, 102, 103 are hardware, they may be various electronic devices that have a display screen and support human-computer interaction, including but not limited to smartphones, tablet computers, laptop computers and desktop computers. When the terminal devices 101, 102, 103 are software, they may be installed in the electronic devices listed above, and may be implemented as multiple pieces of software or software modules (e.g., software or software modules for providing distributed services) or as a single piece of software or software module. No specific limitation is imposed here.
The server 105 may be a server providing various services, such as a background server providing support for shopping class applications on the terminal devices 101, 102, 103. The background server can analyze and process the obtained user operation data to generate corresponding processing results (such as prompt information representing that the user corresponding to the user operation data belongs to an abnormal user).
The server may be hardware or software. When the server is hardware, the server may be implemented as a distributed server cluster formed by a plurality of servers, or may be implemented as a single server. When the server is software, it may be implemented as a plurality of software or software modules (e.g., software or software modules for providing distributed services), or as a single software or software module. The present invention is not particularly limited herein.
It should be noted that, the method for generating the prompt message provided by the embodiment of the present disclosure is generally performed by the server 105, and accordingly, the device for generating the prompt message is generally provided in the server 105.
It should be understood that the number of terminal devices, networks and servers in fig. 1 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.
With continued reference to FIG. 2, a flow 200 of one embodiment of a method for generating prompt information according to the present disclosure is shown. The method for generating prompt information comprises the following steps:
Step 201, a behavior feature representation sequence of a target user is acquired.
In this embodiment, the execution body of the method for generating prompt information (such as the server 105 shown in FIG. 1) may acquire the behavior feature representation sequence of the target user through a wired or wireless connection. The behavior feature representations in the behavior feature representation sequence may be generated based on operation data of the target user within a preset time period. The operation data may include data characterizing operation types and operation attributes. The target user may be any user specified in advance according to actual application requirements, or may be a user determined according to a rule (e.g., a user who performs a predetermined operation within a predetermined period of time).
As an example, the execution body may obtain the behavior feature representation sequence of the target user locally or from an electronic device (e.g., a server) connected by wire or wirelessly. The behavior feature representation sequence may include, for example, behavior feature representations, recorded in chronological order within 12 hours, of the user's transfer operations and transfer amounts and of coupon-claiming operations and coupon denominations. For example, the behavior feature representation sequence of the target user may consist of n behavior feature representations corresponding to times t_1, t_2, t_3, …, t_n, respectively. A behavior feature representation may be used to characterize an operation behavior performed by the user; for example, the representation corresponding to t_1 may be the behavior feature representation generated from the target user's transfer of 200 yuan at time t_1.
In some optional implementations of this embodiment, the execution body may obtain the behavior feature representation sequence of the target user through the following steps:
Step one, operation data of the target user within a preset time period are acquired.
In these implementations, the execution body may acquire the operation data of the target user within the preset time period locally or from a communicatively connected electronic device (e.g., the terminal devices 101, 102, 103 shown in FIG. 1). The operation data may include data characterizing operation types and operation attributes. The operation types may include, but are not limited to, at least one of: transactions (e.g., placing an order, transferring money, recharging), information changes (e.g., modifying a user name or region), login, registration, participating in an e-commerce marketing campaign (e.g., clicking an e-commerce marketing page, claiming coupons, using coupons, claiming points), and browsing an e-commerce marketing campaign page. The operation attributes may correspond to the operation types and may include, but are not limited to, at least one of: transaction amount, type of information changed (e.g., user name, region), login location, registration time, number of times of participating in an e-commerce marketing campaign, and duration of browsing an e-commerce marketing campaign page.
Step two, operation data belonging to a preset category are extracted from the operation data to generate a target operation data set.
In these implementations, the execution body may extract operation data belonging to a preset category from the operation data acquired in the first step, and generate the target operation data set. The preset category may include a preset category of an operation type, and may also include a preset category of an operation attribute.
Optionally, the execution body may further preprocess the extracted operation data belonging to the preset category, and generate the target operation data set from the preprocessed operation data. The preprocessing may include, but is not limited to, at least one of: cleaning dirty data, normalizing missing values, garbled characters and the like in the operation data, and joining the data tables in which the operation data is located (multi-table join) so as to integrate them into one wide table.
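Steps one and two, together with the optional cleaning, can be sketched as follows. This is only a minimal illustration under assumptions: the `Operation` record, the `PRESET_TYPES` set, and the function name `build_target_set` are hypothetical and do not come from the disclosure, and dropping records with missing attributes is just one possible cleaning rule.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass
class Operation:
    """Hypothetical record: one operation with its type and attribute."""
    timestamp: str            # e.g. "18:05:21"
    op_type: str              # e.g. "recharge", "get_coupon"
    attribute: Optional[str]  # e.g. "100 yuan"; None if missing


# Assumed preset categories of operation types (illustrative only).
PRESET_TYPES: Set[str] = {"recharge", "get_coupon", "browse_page"}


def build_target_set(ops: List[Operation],
                     preset_types: Set[str] = PRESET_TYPES) -> List[Operation]:
    """Keep operations of a preset category and drop dirty records."""
    kept = []
    for op in ops:
        if op.op_type not in preset_types:
            continue  # not in a preset category
        if op.attribute is None or not op.attribute.strip():
            continue  # missing attribute: treated here as dirty data
        kept.append(Operation(op.timestamp, op.op_type, op.attribute.strip()))
    return kept


raw = [
    Operation("18:05:18", "recharge", " 100 yuan "),
    Operation("18:05:19", "login", "Beijing"),
    Operation("18:05:21", "get_coupon", None),
    Operation("18:05:25", "browse_page", "10 seconds"),
]
target_set = build_target_set(raw)
```

In this sketch the login record is filtered out by category and the coupon record is discarded because its attribute is missing; a real system might impute such values instead.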
Step three, generating a behavior feature representation corresponding to each operation behavior based on the operation data corresponding to the same operation behavior in the target operation data set.
In these implementations, based on the operation data corresponding to the same operation behavior in the target operation data set generated in step two, the execution body may encode the operation data (for example, the operation type and the operation attribute) corresponding to the same operation behavior, so as to generate the behavior feature representation corresponding to each operation behavior. As an example, the target operation data set may include "recharge", "100 yuan", "get coupon", "denomination 20 yuan", "browse activity page", "10 seconds". Thus, the execution body can encode "recharge" and "100 yuan" as the operation data of one operation behavior, and generate a behavior feature representation corresponding to the recharge behavior.
Optionally, the execution body may generate the behavior feature representation corresponding to each operation behavior through the following steps:
S1, generating one-hot codes corresponding to the operation behaviors according to the operation behaviors corresponding to the target operation data in the target operation data set.
In these implementations, according to the operation behaviors corresponding to the target operation data in the target operation data set, the execution subject may generate the one-hot codes corresponding to the operation behaviors using a dictionary set in advance.
S2, generating, according to the generated one-hot codes, the behavior feature representations corresponding to the one-hot codes by using a pre-trained word vector model.
In these implementations, according to the one-hot code generated in the step S1, the execution subject may convert the one-hot code corresponding to each operation behavior into the behavior feature representation using a word vector generation model trained in advance.
Based on the optional implementation manner, this way of generating the behavior feature representation can improve the expressiveness of the features and avoid the curse of dimensionality.
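Steps S1 and S2 can be illustrated with a small sketch. The dictionary entries, the embedding values, and the function names below are assumptions made for illustration; in practice the embedding matrix would come from the pre-trained word vector model. The sketch shows why the dense representation avoids high dimensionality: multiplying a one-hot code by the embedding matrix simply selects one low-dimensional row.

```python
from typing import Dict, List

# Assumed pre-set dictionary mapping each operation behavior to an index.
DICTIONARY: Dict[str, int] = {
    "recharge|100 yuan": 0,
    "get_coupon|20 yuan": 1,
    "browse_page|10 seconds": 2,
}

# Assumed embedding matrix (one row per dictionary entry); values are made up.
EMBEDDINGS: List[List[float]] = [
    [0.1, 0.2],
    [0.3, 0.4],
    [0.5, 0.6],
]


def one_hot(behavior: str, dictionary: Dict[str, int]) -> List[int]:
    """S1: one-hot code of a behavior using the pre-set dictionary."""
    code = [0] * len(dictionary)
    code[dictionary[behavior]] = 1
    return code


def embed(code: List[int], embeddings: List[List[float]]) -> List[float]:
    """S2: one-hot code times embedding matrix, i.e. a row lookup."""
    dim = len(embeddings[0])
    return [sum(code[i] * embeddings[i][j] for i in range(len(code)))
            for j in range(dim)]


code = one_hot("get_coupon|20 yuan", DICTIONARY)
vector = embed(code, EMBEDDINGS)
```

Here the one-hot code has as many dimensions as the dictionary has entries, while the resulting vector keeps the fixed, much smaller embedding dimension.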
Alternatively, the word vector model may be obtained by training the following steps:
1) A sample set of user behavior feature representation sequences is obtained.
In these implementations, the execution body used to train the word vector model described above may obtain the sample user behavior feature representation sequence set in various ways. In practice, the sample user behavior feature representation sequence set may consist of behavior feature representation sequences generated from the historical operation behaviors of a large number of users. As an example, the sample user behavior feature representation sequence set may include a plurality of sample user behavior feature representation sequences, each of which is a sequence of sample user behavior feature representations.
2) An initial word vector model is obtained.
In these implementations, the execution body used to train the word vector models described above may obtain the initial word vector model in various ways. The initial word vector model may include various deep neural networks (Deep Neural Networks, DNN), among others. By way of example, the initial word vector model described above may include a word2vec model.
3) Taking the behavior feature representations that belong to the same sample user behavior feature representation sequence in the sample user behavior feature representation sequence set as positive samples, the initial word vector model is trained by machine learning to obtain the word vector model.
In these implementations, the execution body may determine, as the positive sample group, a behavior feature representation in the sample user behavior feature representation sequence set that belongs to the same sample user behavior feature sequence. And then taking part of the behavior characteristic representations in the positive sample group as input, taking the behavior characteristic representations belonging to the same positive sample group as expected output, and training by using a machine learning mode to obtain a word vector model.
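As an example, as previously described, the execution body may take the behavior feature representations of each sample user behavior feature representation sequence as one positive sample group. The execution body may then take two behavior feature representations of a positive sample group as input and another behavior feature representation of the same positive sample group as the expected output, and adjust the parameters of the initial word vector model by a machine learning method. As a further example, the execution body may also take one behavior feature representation of a positive sample group as input and two other behavior feature representations of the same positive sample group as the expected output, adjust the parameters of the initial word vector model by a machine learning method, and determine the initial word vector model obtained after training as the word vector model.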
Based on the optional implementation manner, the execution body can construct a word vector model suitable for encoding the sequence operation behaviors of the user so as to improve the representation performance of the behavior feature representation sequence and provide a basis for subsequently improving the recognition rate of the preset type of users.
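The construction of training samples from positive sample groups can be sketched as below. This is a hedged illustration, not the disclosed training procedure: the sliding `window` parameter and the function names are assumptions, and the two functions correspond to the two common word2vec training styles (several inputs predicting one output, and one input predicting its neighbors). Only the pair construction is shown; the parameter update of the model itself is omitted.

```python
from typing import List, Tuple


def context_to_target_pairs(sequence: List[str],
                            window: int = 1) -> List[Tuple[List[str], str]]:
    """Several neighbors as input, the center behavior as expected output."""
    pairs = []
    for i, target in enumerate(sequence):
        lo, hi = max(0, i - window), min(len(sequence), i + window + 1)
        context = [sequence[j] for j in range(lo, hi) if j != i]
        if context:
            pairs.append((context, target))
    return pairs


def target_to_context_pairs(sequence: List[str],
                            window: int = 1) -> List[Tuple[str, str]]:
    """One behavior as input, each of its neighbors as expected output."""
    return [(target, ctx)
            for context, target in context_to_target_pairs(sequence, window)
            for ctx in context]


# One positive sample group: behaviors of the same sample user sequence.
sample_sequence = ["click_page", "get_coupon", "transfer"]
cbow_style = context_to_target_pairs(sample_sequence)
skipgram_style = target_to_context_pairs(sample_sequence)
```

Because pairs are only drawn from within one sequence, behaviors that tend to occur together for the same user end up with nearby vectors after training.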
Step four, generating the behavior feature representation sequence of the target user according to the chronological order of the operations corresponding to the generated behavior feature representations.
In these implementations, the execution body may generate the behavior feature representation sequence of the target user in various ways according to the chronological order of the operations to which the generated behavior feature representations correspond. As an example, the behavior feature representations may be in the form of vectors. The execution body may arrange the vector-form behavior feature representations in the chronological order of the corresponding operations, thereby generating a behavior feature representation sequence in the form of a matrix, in which each row or column vector is one vector-form behavior feature representation. As yet another example, the execution body may concatenate the vector-form behavior feature representations end to end in the chronological order of the corresponding operations to form a new, higher-dimensional behavior feature representation sequence in vector form.
Based on the optional implementation manner, the execution body can encode different operation behaviors into a behavior feature representation sequence in chronological order, which provides a technical basis for accurately identifying specific chained operation behaviors (such as claiming a coupon and then transferring to cash out).
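The two forms of the sequence described in step four can be sketched as follows. The function names and the use of sortable timestamp strings are assumptions for illustration only.

```python
from typing import List, Tuple

TimedVector = Tuple[str, List[float]]  # (timestamp, behavior feature vector)


def to_matrix(timed_vectors: List[TimedVector]) -> List[List[float]]:
    """Matrix form: one row per behavior, rows ordered by operation time."""
    return [vec for _, vec in sorted(timed_vectors, key=lambda item: item[0])]


def to_flat_vector(timed_vectors: List[TimedVector]) -> List[float]:
    """Vector form: time-ordered vectors concatenated end to end."""
    return [value for row in to_matrix(timed_vectors) for value in row]


# Records arrive out of order; timestamps in HH:MM:SS sort lexicographically.
records = [
    ("18:05:21", [0.3, 0.4]),
    ("18:05:18", [0.1, 0.2]),
    ("18:05:25", [0.5, 0.6]),
]
matrix = to_matrix(records)
flat = to_flat_vector(records)
```

The matrix form keeps per-behavior boundaries, while the flat form yields a single vector whose dimension grows with the number of behaviors.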
Step 202, generating a metric value representing that the target user belongs to a preset class user according to the behavior characteristic representation sequence of the target user.
In this embodiment, according to the behavior feature representation sequence of the target user acquired in step 201, the execution subject may generate the metric value representing that the target user belongs to the preset class of users in various manners. The metric value may be used to characterize the probability that the target user belongs to the preset category of users.
In some optional implementations of this embodiment, the execution body may input the behavior feature representation sequence of the target user into a pre-trained classification model, to obtain, as the metric value, a probability for characterizing that the target user belongs to a preset class of users.
In these implementations, the pre-trained classification model described above may include, but is not limited to, at least one of: a logistic regression model, an XGBoost model. Optionally, the classification model is typically a binary classification model. The two classes may be that the user corresponding to the behavior feature representation sequence belongs to, or does not belong to, the preset category of users.
Based on the optional implementation manner, the executing body can identify the preset category users in a supervised learning manner.
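As a hedged illustration of the logistic regression option, the sketch below scores a flattened behavior feature representation sequence. The weights, the bias, and the function name are hypothetical; in practice they would be learned from labeled sequences.

```python
import math
from typing import List


def preset_category_probability(features: List[float],
                                weights: List[float],
                                bias: float) -> float:
    """Logistic regression: sigmoid of a weighted sum of the features."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


# Assumed, already-trained parameters (values are made up).
weights = [1.5, -0.5, 2.0]
bias = -1.0

metric = preset_category_probability([0.8, 0.1, 0.9], weights, bias)
```

The returned value lies strictly between 0 and 1 and can be used directly as the metric value of step 202.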
Step 203, in response to a comparison between the determined metric value and a preset threshold, generating prompt information for characterizing that the target user belongs to the preset category of users.
In this embodiment, in response to the comparison between the metric value generated in step 202 and the preset threshold, the execution body may generate, in various manners, the prompt information for indicating that the target user belongs to the preset category of users. As an example, in response to determining that the metric value generated in step 202 is greater than the preset threshold, the execution body may generate prompt information, such as "risk user prompt" or "marketing fraud user prompt", for characterizing that the target user belongs to the preset category of users.
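The comparison in step 203 can be sketched in a few lines. The function name, the message format, and the default threshold value are assumptions chosen for illustration.

```python
from typing import Optional


def generate_prompt(user_id: str,
                    metric: float,
                    threshold: float = 0.75,
                    label: str = "risk user prompt") -> Optional[str]:
    """Return prompt information only when the metric exceeds the threshold."""
    if metric > threshold:
        return f"{label}: {user_id} (metric={metric:.2f})"
    return None  # no prompt information is generated


prompt = generate_prompt("user058", 0.83)
no_prompt = generate_prompt("user059", 0.40)
```

When the metric value is a distance rather than a probability or similarity, smaller values indicate a closer match, so the comparison direction would need to be reversed.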
In some optional implementations of this embodiment, the executing body may continue to execute the following steps:
First, a set of associated user behavior feature representation sequences associated with the behavior feature representation sequence of the user is obtained.
In these implementations, the execution body may obtain, in various ways, the set of associated user behavior feature representation sequences associated with the user's behavior feature representation sequence. The associated user behavior feature representation sequence set may include at least one behavior feature representation sequence corresponding to a user belonging to the preset category of users. As an example, the association relationship may be preset: for example, the operation time period matches that of the behavior feature representation sequence corresponding to the user belonging to the preset category (e.g., within 8 hours before or after), the login region indicated by the IP address is the same, or the similarity of the user names exceeds a threshold.
Second, clustering the associated user behavior feature representation sequences in the obtained associated user behavior feature representation sequence set to generate a target number of clustering results.
In these implementations, the execution body may use various clustering algorithms to cluster the associated user behavior feature representation sequences in the associated user behavior feature representation sequence set acquired in the first step, so as to generate a target number of clustering results. As an example, the execution subject may directly perform clustering according to the associated user behavior feature representation sequence in the form of a vector, so as to generate a clustering result. As yet another example, the execution subject may first extract a representative vector from the associated user behavior feature representation sequence in a matrix form, and then cluster according to the extracted vector to generate a clustering result. Alternatively, the extraction of representative vectors may be consistent with the manner of generating the user tag feature representation described below, and will not be described in detail herein.
Third, determining, as a target associated user behavior feature representation sequence, an associated user behavior feature representation sequence that belongs to the same clustering result as the behavior feature representation sequence corresponding to a user belonging to the preset category of users.
Fourth, generating prompt information characterizing whether the user corresponding to the determined target associated user behavior feature representation sequence belongs to the preset category of users.
In these implementations, the executing body may generate the prompt information characterizing whether the user corresponding to the target associated user behavior feature representation sequence determined in the third step belongs to the preset category of users in various manners. As an example, the execution body may directly generate prompt information for characterizing and prompting that the user corresponding to the target associated user behavior feature representation sequence determined in the third step belongs to the preset category of users. As yet another example, the execution subject may further determine a similarity between the target associated user behavior feature representation sequence determined in the third step and a behavior feature representation sequence corresponding to a user belonging to a preset category of users. In response to determining that the similarity is greater than a preset similarity threshold, the execution body may generate prompt information for prompting that the user corresponding to the target associated user behavior feature representation sequence determined in the third step belongs to a preset category of users. In response to determining that the similarity is not greater than a preset similarity threshold, the execution body may generate prompt information indicating that the user corresponding to the target associated user behavior feature representation sequence determined in the third step does not belong to the preset category of users.
Based on the optional implementation manner, the execution body can cluster the behavior feature representation sequences by using a clustering algorithm, so that users who potentially belong to the preset category of users are mined by combining supervised learning and semi-supervised learning. This is particularly suitable for anti-fraud scenarios in which fraud is strongly group-based and the behaviors are highly similar.
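The four steps above can be sketched with a small k-means clustering. This is only one possible choice among the "various clustering algorithms" mentioned; the deterministic initialization with the first k points, the fixed iteration count, and all names are assumptions made to keep the example short and reproducible.

```python
from typing import Dict, List, Set


def squared_distance(a: List[float], b: List[float]) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points: List[List[float]], k: int, iterations: int = 10) -> List[int]:
    """Tiny k-means; centroids start at the first k points (an assumption)."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assign each point to its nearest centroid.
        labels = [min(range(k), key=lambda c: squared_distance(p, centroids[c]))
                  for p in points]
        # Move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def target_associated_users(vectors: Dict[str, List[float]],
                            known_ids: Set[str], k: int) -> List[str]:
    """Users sharing a cluster with a known preset-category user."""
    ids = list(vectors)
    labels = kmeans([vectors[i] for i in ids], k)
    risky = {lab for i, lab in zip(ids, labels) if i in known_ids}
    return [i for i, lab in zip(ids, labels)
            if lab in risky and i not in known_ids]


vectors = {"u1": [0.0, 0.0], "u2": [10.0, 10.0],
           "u3": [0.1, 0.2], "u4": [9.8, 10.1]}
candidates = target_associated_users(vectors, known_ids={"u1"}, k=2)
```

The candidates returned here could either be flagged directly or, as in the second example of the fourth step, be checked against an additional similarity threshold before prompt information is generated.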
In some optional implementations of this embodiment, in response to receiving information confirming that the target user indicated by the prompt information belongs to the preset category of users, the execution body may further add identification information of the target user to a preset category user information set. As an example, the information confirming that the target user indicated by the prompt information belongs to the preset category of users may be confirmation information sent by a terminal used by a technician.
Based on the optional implementation manner, the execution subject can dynamically supplement the information set of the identified preset type user, and provide a data basis for further improving the identification rate of the preset type user.
Optionally, the execution body may further continue to execute the following steps:
S1, acquiring a behavior feature representation sequence set corresponding to the preset category user information set.
In these implementations, the executing body may acquire the behavior feature representation sequence set corresponding to the preset category user information set in various manners. Wherein the behavior feature representation sequence in the obtained behavior feature representation sequence set may be generated based on operation data of the user indicated by the user information set belonging to the preset category.
S2, generating a user tag feature representation corresponding to the behavior feature representation sequence in the behavior feature representation sequence set.
In these implementations, the execution body may generate the user tag feature representations corresponding to the behavior feature representation sequences in the behavior feature representation sequence set in various manners. Wherein the user tag feature representation is used to characterize a representative behavior feature representation in the behavior feature representation sequence. Optionally, the manner of generating the user tag feature representation may also be identical to the description in the optional embodiment in step 403 of the embodiment described below, which is not described herein.
S3, generating key behavior characteristic information corresponding to the preset category user.
In these implementations, the execution body may generate the key behavior feature information corresponding to the preset category user in various manners. As an example, the execution body may use the user tag feature generated in the step S2 as the key behavior feature information corresponding to the preset category user. As yet another example, the execution body may integrate the user tag feature representation generated in the step S2. And then, generating key behavior characteristic information corresponding to the preset category users according to the integrated user tag characteristic representation. The integrating may include, for example, selecting a user tag feature representation with the largest occurrence number, and selecting a user tag feature representation with the occurrence number greater than a preset number threshold.
Based on the optional implementation manner, the execution body may analyze the characteristic behaviors of the users in the preset category to generate explicit key behavior feature information. Behavior profiles of the users in the preset category can thereby be generated. On the one hand, this improves the interpretability of the basis for identifying users in the preset category; on the other hand, it gives anti-fraud risk-control personnel ideas for improving the design of marketing campaigns and anti-fraud rules.
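The integration in step S3 can be sketched with a simple frequency count. For readability the user tag feature representations are shown here as hashable labels, which is an assumption; the function name and the sample data are likewise hypothetical.

```python
from collections import Counter
from typing import List, Optional


def key_behaviors(user_tags: List[str],
                  min_count: Optional[int] = None) -> List[str]:
    """Integrate user tags of preset-category users into key behaviors.

    With min_count=None, keep the most frequent tag(s); otherwise keep
    every tag that occurs more than min_count times.
    """
    counts = Counter(user_tags)
    if not counts:
        return []
    if min_count is None:
        top = max(counts.values())
        return [tag for tag, c in counts.items() if c == top]
    return [tag for tag, c in counts.items() if c > min_count]


# One tag per known preset-category user (made-up data).
tags = ["get_coupon", "transfer", "get_coupon", "share",
        "transfer", "get_coupon"]
most_common = key_behaviors(tags)
frequent = key_behaviors(tags, min_count=1)
```

The resulting list is the kind of explicit, human-readable output that can support a behavior profile of the preset category.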
With continued reference to fig. 3, fig. 3 is a schematic illustration of an application scenario of a method for generating prompt information according to embodiments of the present disclosure. In the application scenario of fig. 3, a user 301 (e.g., with id user058) performs a series of operations using a terminal device 302 to generate operation data 303. The operation data 303 may include, for example, "18:05:18 click on e-commerce promotion page", "18:05:21 get 30 yuan coupon", "18:05:25 share marketing page to social friends", "18:05:32 withdraw 1 yuan". Then, the terminal device 302 transmits the above-described operation data 303 to the background server 304. The background server 304 generates a behavior feature representation sequence 305 corresponding to the received operation data 303. The behavior feature representation sequence 305 may include behavior feature representations that respectively characterize the behaviors of clicking the e-commerce promotion page, getting a 30 yuan coupon, sharing the marketing page to social friends, withdrawing 1 yuan, and so on. The background server 304 may generate, in various ways, a metric value (shown as 306 in fig. 3), such as 0.83, representing that the user with id user058 belongs to risk users. In response to determining that the generated metric value 306 is greater than a preset threshold (e.g., 0.75), the background server 304 may generate prompt information 307 characterizing that the user with id user058 belongs to risk users. Optionally, the background server 304 may also send the generated prompt information 307 to the terminal device 308 used by service personnel.
At present, one prior-art approach generally generates features from users' click behavior sequences to perform anomaly detection, so the recognition of abnormal users is not high enough in scenarios where operations are highly correlated, concentrated in time, and strongly group-based (such as marketing anti-fraud). By acquiring a behavior feature representation sequence generated from operation data of different operation types and operation attributes within a preset time period, the method provided by the embodiments of the present disclosure makes effective use of the operation data and more fully reflects the detailed information it contains, such as time order and the correlation between operation behaviors, thereby providing a technical basis for improving the recognition of abnormal users.
With further reference to fig. 4, a flow 400 of yet another embodiment of a method for generating prompt information is illustrated. The flow 400 of the method for generating prompt information includes the following steps:
Step 401, a behavior feature representation sequence of a target user is acquired.
Step 402, obtaining behavior feature information corresponding to a user belonging to a preset category as target behavior feature information.
In this embodiment, the execution subject (e.g., the server 105 shown in fig. 1) of the method for generating the prompt information may acquire, as the target behavior feature information, behavior feature information corresponding to the user belonging to the preset category in various ways. The obtained target behavior feature information may be generated based on operation data of the user belonging to the preset category.
Step 403, determining a similarity measure between the behavior feature expression sequence of the target user and the target behavior feature information as a measure value for characterizing that the target user belongs to a preset class of users.
In this embodiment, the executing body may determine, in various manners, a similarity measure between the behavior feature representation sequence of the target user obtained in step 401 and the target behavior feature information obtained in step 402, as a measure value for characterizing that the target user belongs to a preset category of users. Wherein the similarity measure may include, but is not limited to, at least one of: cosine similarity, cosine distance, euclidean distance.
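Two of the listed similarity measures can be sketched as follows, assuming that both the target user's representation and the target behavior feature information are available as vectors of equal length. The function names and the zero-vector convention are assumptions.

```python
import math
from typing import List


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen for this sketch
    return dot / (norm_a * norm_b)


def euclidean_distance(a: List[float], b: List[float]) -> float:
    """Straight-line distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


target_user = [1.0, 2.0]
preset_category = [2.0, 4.0]
similarity = cosine_similarity(target_user, preset_category)
distance = euclidean_distance([0.0, 0.0], [3.0, 4.0])
```

Note that cosine similarity grows with closeness while Euclidean distance shrinks, so the threshold comparison in step 404 must use the direction that matches the chosen measure.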
In some optional implementations of this embodiment, the behavior feature information may include a behavior feature representation. The executing body may determine a similarity measure between the behavior feature expression sequence of the target user and the target behavior feature information as a measure value for characterizing that the target user belongs to a preset class of users by:
First, obtaining the number of historical behavior feature representation sequences included in a preset historical behavior feature representation sequence set as a reference number.
In these implementations, the execution body may acquire the number of the history behavior feature expression sequences included in the preset history behavior feature expression sequence set and use the number as the reference number. In practice, the set of predetermined historical behavioral characteristics representation sequences may be generated based on historical operational data of a large number of users.
Second, for a behavioral characteristic representation in a sequence of behavioral characteristic representations of a target user, determining a first measure of significance corresponding to the behavioral characteristic representation.
In these implementations, for a behavior feature representation in the behavior feature representation sequence of the target user acquired in step 401, the execution body may determine the first saliency measure corresponding to the behavior feature representation in various manners. The first saliency measure is typically positively correlated with the reference number, and negatively correlated with the number of historical behavior feature representation sequences in the historical behavior feature representation sequence set that include a historical behavior feature representation matching the behavior feature representation. Matching may mean being identical or having a similarity greater than a preset threshold.
As an example, the first saliency measure may be a ratio of the reference number to a number of historical behavioral feature representation sequences in the set of historical behavioral feature representation sequences that include a historical behavioral feature representation that matches the behavioral feature representation. As yet another example, the first saliency metric described above may also be a logarithm of the ratio described above.
Third, generating a user tag feature representation corresponding to the behavior feature representation sequence of the target user based on the determined first saliency measure.
In these implementations, the execution body may generate the user tag feature representation corresponding to the behavior feature representation sequence of the target user in various manners based on the first saliency measure determined in the second step. As an example, the execution body may select the behavior feature representation corresponding to the maximum value of the first saliency measure as the user tag feature representation corresponding to the behavior feature representation sequence of the target user.
Based on the optional implementation manner, the execution body may generate a representative user tag feature representation for the behavior feature representation sequence of the target user according to the significance degree of the individuality of the behavior feature representation sequence of the target user in the preset population of the historical behavior feature representation sequence set.
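The logarithm-of-ratio example of the first saliency measure has the same shape as an inverse document frequency. The sketch below uses exact matching between behaviors and represents them as labels for readability; both are assumptions, as is the guard that treats a behavior absent from the history as occurring in one sequence to avoid division by zero.

```python
import math
from typing import List


def first_saliency(behavior: str, history: List[List[str]]) -> float:
    """log(reference number / number of historical sequences containing it)."""
    reference_number = len(history)
    containing = sum(1 for seq in history if behavior in seq)
    # Assumption: unseen behaviors are counted as appearing once.
    return math.log(reference_number / max(containing, 1))


def user_tag(sequence: List[str], history: List[List[str]]) -> str:
    """Pick the behavior with the largest first saliency measure."""
    return max(sequence, key=lambda b: first_saliency(b, history))


history = [
    ["login", "get_coupon"],
    ["login", "transfer"],
    ["login"],
    ["login", "get_coupon"],
]
tag = user_tag(["login", "get_coupon", "transfer"], history)
```

A behavior that appears in every historical sequence, such as a login, gets a saliency of zero, while a rare behavior stands out as representative of the target user.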
Optionally, the executing body may further generate a user tag feature representation corresponding to the behavior feature representation sequence of the target user through the following steps:
S1, determining a frequency measure of the occurrence of each behavior feature representation in the behavior feature representation sequence of the target user.
In these alternative implementations, the execution entity may determine a frequency measure in which each behavior feature representation of the target user in the behavior feature representation sequence of the target user obtained in step 401 appears in the behavior feature representation sequence of the target user. As an example, the frequency measure may be a ratio between the number of occurrences of the behavior feature representation in the sequence of behavior feature representations of the target user and the number of behavior feature representations comprised in the sequence of behavior feature representations of the target user, for example.
S2, determining a second significance measure corresponding to the behavior feature representation in the behavior feature representation sequence of the target user.
In these implementations, the execution body may determine the second saliency measure corresponding to the behavioral characteristic representation of the target user obtained in step 401. Wherein the second saliency measure may be positively correlated with the first saliency measure, and the second saliency measure may be positively correlated with the frequency measure. As an example, the second saliency measure may be a product between the first saliency measure and the frequency measure.
S3, selecting corresponding behavior characteristic representations as the user tag characteristic representations according to the determined second significance measure.
In these implementations, according to the second saliency measure determined in step S2, the executing entity may select the corresponding behavior feature representation as the user tag feature representation in various manners. As an example, the executing entity may select, as the user tag feature representation, a target number of behavior feature representations for which the corresponding second saliency measure is highest.
Based on the optional implementation manner, the execution body may further determine the degree of characterization of the behavior feature representation sequence by the behavior feature representation according to the specific gravity of the different behavior feature representations in the behavior feature representation sequence of the target user, and generate a representative user tag feature representation for the behavior feature representation sequence of the target user.
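Steps S1 to S3 with the product example correspond to a TF-IDF style score. The sketch below is self-contained and makes the same assumptions as a label-based illustration would: exact matching, behaviors shown as labels, and unseen behaviors counted as appearing in one historical sequence. All names and data are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List


def second_saliency(sequence: List[str],
                    history: List[List[str]]) -> Dict[str, float]:
    """Frequency measure times first saliency measure for each behavior."""
    reference_number = len(history)
    scores = {}
    for behavior, occurrences in Counter(sequence).items():
        frequency = occurrences / len(sequence)                      # S1
        containing = sum(1 for seq in history if behavior in seq)
        first = math.log(reference_number / max(containing, 1))
        scores[behavior] = frequency * first                         # S2
    return scores


def top_tags(sequence: List[str], history: List[List[str]],
             target_number: int = 1) -> List[str]:
    """S3: the target number of behaviors with the highest second saliency."""
    scores = second_saliency(sequence, history)
    return sorted(scores, key=scores.get, reverse=True)[:target_number]


history = [
    ["login", "get_coupon"],
    ["login", "transfer"],
    ["login"],
    ["login", "get_coupon"],
]
sequence = ["login", "get_coupon", "get_coupon", "get_coupon", "transfer"]
tags = top_tags(sequence, history, target_number=2)
```

Compared with using the first saliency measure alone, a behavior repeated many times by the target user can now outrank a rarer behavior that the user performed only once.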
Fourth, determining a similarity measure between the user tag feature representation and the target behavior feature information as the metric value for characterizing that the target user belongs to the preset category of users.
In these implementations, the executing entity may further use the similarity measure between the user tag feature representation determined in the third step and the target behavior feature information obtained in the step 402 as a measure value for characterizing that the target user belongs to the preset category of users.
In some optional implementations of this embodiment, the execution body may further generate a user behavior tag corresponding to the behavior feature representation sequence of the target user based on the generated user tag feature representation corresponding to the behavior feature representation sequence of the target user. As an example, the execution body may generate the corresponding user behavior tag according to a preset correspondence table between the user tag feature representation and the user behavior tag. As yet another example, the executing entity may generate the user behavior tag (e.g., "get coupon", "register", "credit redemption") using a decoding scheme corresponding to an encoding scheme used to generate the user tag feature representation.
Based on the optional implementation manner, the execution body can generate an explicit user behavior label, and a basis is provided for the visual presentation of the representative behaviors of the preset type of users.
Step 404, in response to a comparison between the determined metric value and a preset threshold, generating prompt information for characterizing that the target user belongs to the preset category of users.
The above steps 401 and 404 are consistent with the steps 201 and 203 and their optional implementation manners in the foregoing embodiments, and the above descriptions on the steps 201 and 203 and their optional implementation manners also apply to the steps 401 and 404, which are not repeated herein.
In some optional implementations of this embodiment, in response to receiving information indicating that the target user indicated by the characterization determination prompt information belongs to a preset category user, the executing body may further add identification information of the target user to a preset category user information set.
Optionally, the execution body may further continue to execute the following steps:
S1, acquiring a behavior feature representation sequence set corresponding to the preset category user information set.
S2, generating a user tag feature representation corresponding to the behavior feature representation sequence in the behavior feature representation sequence set.
S3, generating key behavior characteristic information corresponding to the preset category user.
As can be seen from fig. 4, the flow 400 of the method for generating prompt information in this embodiment highlights the step of determining a similarity measure between the behavior feature representation sequence of the target user and the target behavior feature information as the metric value characterizing that the target user belongs to the preset category of users. Therefore, the scheme described in this embodiment can compare the existing information on users belonging to the preset category with the behavior feature representation sequence of the target user, so that users with potentially abnormal behavior are mined and the efficiency of recognizing such users is improved.
With further reference to fig. 5, as an implementation of the methods shown in the foregoing figures, the present disclosure provides an embodiment of an apparatus for generating prompt information. The apparatus embodiment corresponds to the method embodiment shown in fig. 2 or fig. 4, and the apparatus may be applied in various electronic devices.
As shown in fig. 5, the apparatus 500 for generating prompt information provided in this embodiment includes a first obtaining unit 501, a first generating unit 502, and a prompting unit 503. The first obtaining unit 501 is configured to obtain a behavior feature representation sequence of the target user, wherein the behavior feature representations in the behavior feature representation sequence are generated based on operation data of the target user in a preset time period, and the operation data include data characterizing operation types and operation attributes. The first generating unit 502 is configured to generate, according to the behavior feature representation sequence of the target user, a metric value characterizing that the target user belongs to a preset category of users. The prompting unit 503 is configured to generate, in response to the result of comparing the determined metric value with a preset threshold, prompt information characterizing that the target user belongs to the preset category of users.
In this embodiment, for the specific processing of the first obtaining unit 501, the first generating unit 502, and the prompting unit 503 in the apparatus 500 for generating prompt information, and the technical effects thereof, reference may be made to the descriptions of step 201, step 202, and step 203 in the embodiment corresponding to fig. 2, which are not repeated here.
In some optional implementations of this embodiment, the first obtaining unit 501 may include a first obtaining subunit (not shown in the figure), an extracting subunit (not shown in the figure), a first generating subunit (not shown in the figure), and a second generating subunit (not shown in the figure). The first obtaining subunit may be configured to obtain operation data of the target user in a preset period of time. The extracting subunit may be configured to extract operation data belonging to a preset category from the operation data, and generate a target operation data set. The first generating subunit may be configured to generate, based on operation data corresponding to the same operation behavior in the target operation data set, a behavior feature representation corresponding to each operation behavior. The second generating subunit may be configured to generate the behavior feature representation sequence of the target user according to the time sequence of the operations corresponding to the generated behavior feature representations.
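For illustration only, the four subunits described above may be sketched as a single function that filters operation data by preset category, forms one representation per operation behavior, and orders the result by time. The `Operation` record, its field names, and the use of a `"type:attribute"` string as a stand-in for a behavior feature representation are assumptions and are not taken from the present disclosure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Operation:
    timestamp: int      # assumed: seconds since an arbitrary epoch
    op_type: str        # operation type, e.g. "login" or "pay"
    op_attribute: str   # operation attribute, e.g. the page or object operated on


def build_behavior_sequence(operations, preset_categories):
    """Build a time-ordered behavior feature representation sequence."""
    # Extract the operation data that belongs to the preset categories.
    target_ops = [op for op in operations if op.op_type in preset_categories]
    # Order the operations by the time at which they occurred.
    target_ops.sort(key=lambda op: op.timestamp)
    # Stand-in representation: one token per operation behavior.
    return [f"{op.op_type}:{op.op_attribute}" for op in target_ops]
```

In practice the string tokens would be replaced by learned vector representations; the tokens are used here only so that the ordering and filtering steps are easy to inspect.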
In some optional implementations of this embodiment, the first generating subunit may include a first generating module (not shown in the figure) and a second generating module (not shown in the figure). The first generating module may be configured to generate one-hot codes corresponding to the operation behaviors according to the operation behaviors corresponding to the target operation data in the target operation data set. The second generating module may be configured to generate, according to the generated one-hot codes, behavior feature representations corresponding to the one-hot codes by using a pre-trained word vector model.
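A minimal sketch of the two modules, under stated assumptions, is given below. The vocabulary construction, the helper names, and the representation of the pre-trained word vector model as a plain embedding matrix (one row per operation behavior) are illustrative assumptions; a real word vector model would supply the matrix.

```python
def build_vocabulary(behaviors):
    """Assign each distinct operation behavior an index in order of first appearance."""
    vocab = {}
    for behavior in behaviors:
        if behavior not in vocab:
            vocab[behavior] = len(vocab)
    return vocab


def one_hot(index, size):
    """Return a one-hot code of the given size with a 1.0 at `index`."""
    return [1.0 if i == index else 0.0 for i in range(size)]


def embed(one_hot_code, embedding_matrix):
    """Multiply a one-hot row vector by a (vocabulary x dimension) matrix.

    Because the code has a single non-zero entry, the product selects the
    matching row of the matrix, i.e. the behavior feature representation.
    """
    dim = len(embedding_matrix[0])
    return [
        sum(one_hot_code[i] * embedding_matrix[i][d] for i in range(len(one_hot_code)))
        for d in range(dim)
    ]
```

The multiplication is written out explicitly to show why a one-hot code combined with a word vector model yields a dense representation; an implementation would normally index the row directly.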
In some optional implementations of this embodiment, the word vector model may be obtained through the following training steps: acquiring a sample user behavior feature representation sequence set; acquiring an initial word vector model; and training the initial word vector model by means of machine learning, taking the behavior feature representations belonging to the same sample user behavior feature representation sequence in the sample user behavior feature representation sequence set as a positive sample group, to obtain the word vector model.
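As one possible reading of the positive sample group, and purely as an assumption for illustration, the representations that occur in the same sample sequence may be paired with each other before being passed to a word vector training procedure. The function `positive_pairs` below is hypothetical and shows only the construction of the pairs, not the training itself.

```python
from itertools import combinations


def positive_pairs(sequences):
    """Pair up distinct behaviors that occur in the same sample sequence.

    Assumption: every pair of distinct behaviors within one sequence is
    treated as a positive sample; no context window is applied.
    """
    pairs = []
    for sequence in sequences:
        distinct = sorted(set(sequence))  # sorted for a deterministic order
        pairs.extend(combinations(distinct, 2))
    return pairs
```

A windowed variant, as used by common skip-gram style models, would restrict the pairs to behaviors that occur close together in time.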
In some optional implementations of this embodiment, the first generating unit 502 may be further configured to input the behavior feature representation sequence of the target user into a pre-trained classification model to obtain, as the metric value, a probability characterizing that the target user belongs to the preset category of users.
In some optional implementations of this embodiment, the first generating unit 502 may include: a second acquisition subunit (not shown), a determination subunit (not shown). The second obtaining subunit may be configured to obtain, as the target behavior feature information, behavior feature information corresponding to the user belonging to the preset category. The determining subunit may be configured to determine, as the metric value representing that the target user belongs to the preset category of users, a similarity metric between the behavior feature representation sequence of the target user and the target behavior feature information.
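The disclosure does not name a particular similarity measure. As a sketch under that caveat, the determining subunit may be illustrated with cosine similarity between a mean-pooled sequence vector and a target vector; both the pooling step and the choice of cosine similarity are assumptions made here for illustration.

```python
import math


def mean_vector(vectors):
    """Average a non-empty list of equal-length vectors component-wise."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


def cosine_similarity(u, v):
    """Return the cosine of the angle between u and v (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def similarity_metric(sequence_vectors, target_vector):
    """Metric value: similarity of the pooled sequence to the target behavior features."""
    return cosine_similarity(mean_vector(sequence_vectors), target_vector)
```

A value close to 1 would indicate that the target user's behavior closely resembles the behavior feature information of users in the preset category.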
In some optional implementations of this embodiment, the behavior feature information may include a behavior feature representation. The determining subunit may include an obtaining module (not shown in the figure), a first determining module (not shown in the figure), a generating module (not shown in the figure), and a second determining module (not shown in the figure). The obtaining module may be configured to obtain, as a reference number, the number of historical behavior feature representation sequences included in a preset historical behavior feature representation sequence set. The first determining module may be configured to determine, for a behavior feature representation in the behavior feature representation sequence of the target user, a first saliency measure corresponding to that behavior feature representation. The first saliency measure may be positively correlated with the reference number and negatively correlated with the number of historical behavior feature representation sequences in the historical behavior feature representation sequence set that include a historical behavior feature representation matching the behavior feature representation. The generating module may be configured to generate a user tag feature representation corresponding to the behavior feature representation sequence of the target user based on the determined first saliency measure. The second determining module may be configured to determine a similarity measure between the user tag feature representation and the target behavior feature information as the metric value characterizing that the target user belongs to the preset category of users.
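One function with the stated correlations is a smoothed logarithmic ratio similar to inverse document frequency. The sketch below uses that form as an assumption; the disclosure only requires positive correlation with the reference number and negative correlation with the number of matching historical sequences, and does not specify the formula.

```python
import math


def first_saliency(behavior, historical_sequences):
    """Assumed IDF-like first saliency measure for one behavior.

    reference_number: total number of historical sequences.
    matching: number of historical sequences that contain the behavior.
    The +1 terms are a smoothing assumption that avoids division by zero.
    """
    reference_number = len(historical_sequences)
    matching = sum(1 for sequence in historical_sequences if behavior in sequence)
    return math.log((reference_number + 1) / (matching + 1))
```

Under this form, a behavior that appears in every historical sequence receives a measure of zero, while a behavior that appears in few sequences receives a larger measure.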
In some optional implementations of this embodiment, the generating module may include a first determining submodule (not shown in the figure), a second determining submodule (not shown in the figure), and a selecting submodule (not shown in the figure). The first determining submodule may be configured to determine a frequency measure of how often each behavior feature representation appears in the behavior feature representation sequence of the target user. The second determining submodule may be configured to determine a second saliency measure corresponding to each behavior feature representation in the behavior feature representation sequence of the target user. The second saliency measure may be positively correlated with both the first saliency measure and the frequency measure. The selecting submodule may be configured to select the corresponding behavior feature representations as the user tag feature representation based on the determined second saliency measure.
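A simple quantity that is positively correlated with both the first saliency measure and the frequency measure is their product, which resembles a TF-IDF weighting. The following sketch assumes that product form, the smoothed logarithmic first saliency measure, and a top-k selection rule; none of these specifics are stated in the disclosure, and the name `user_tag_features` is hypothetical.

```python
import math
from collections import Counter


def user_tag_features(target_sequence, historical_sequences, top_k=2):
    """Select the behaviors with the highest assumed second saliency measure."""
    counts = Counter(target_sequence)
    total = len(target_sequence)
    reference_number = len(historical_sequences)
    scores = {}
    for behavior, count in counts.items():
        # Frequency measure: share of the target sequence taken by this behavior.
        frequency = count / total
        # First saliency measure (assumed smoothed IDF-like form).
        matching = sum(1 for seq in historical_sequences if behavior in seq)
        first = math.log((reference_number + 1) / (matching + 1))
        # Second saliency measure: product of the two (assumption).
        scores[behavior] = frequency * first
    # Highest score first; ties broken alphabetically for determinism.
    ranked = sorted(scores, key=lambda b: (-scores[b], b))
    return ranked[:top_k]
```

For example, with a history in which every user logs in but few request refunds, a target user who requests refunds repeatedly would have "refund" ranked ahead of "login".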
In some optional implementations of this embodiment, the apparatus 500 for generating prompt information may further include a second generating unit configured to generate a user behavior label corresponding to the behavior feature representation sequence of the target user based on the user tag feature representation.
In some optional implementations of this embodiment, the apparatus 500 for generating prompt information may further include a second obtaining unit (not shown in the figure), a clustering unit (not shown in the figure), a determining unit (not shown in the figure), and a third generating unit (not shown in the figure). The second obtaining unit may be configured to obtain an associated user behavior feature representation sequence set associated with the behavior feature representation sequence of the user. The associated user behavior feature representation sequence set may include at least one behavior feature representation sequence corresponding to a user belonging to the preset category of users. The clustering unit may be configured to cluster the associated user behavior feature representation sequences in the obtained set and generate a target number of clustering results. The determining unit may be configured to determine, as a target associated user behavior feature representation sequence, an associated user behavior feature representation sequence that belongs to the same clustering result as a behavior feature representation sequence corresponding to a user belonging to the preset category of users. The third generating unit may be configured to generate prompt information characterizing whether the user corresponding to the determined target associated user behavior feature representation sequence belongs to the preset category of users.
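The clustering algorithm is not specified in the disclosure. As a sketch under that caveat, the clustering unit and the determining unit may be illustrated with a small deterministic k-means over one vector per user. The use of k-means, the initialization from the first k points, the fixed iteration count, and the assumption that each sequence has already been reduced to a single vector are all illustrative choices.

```python
def kmeans(points, k, iterations=10):
    """Tiny k-means with deterministic initialization from the first k points."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest centroid by squared Euclidean distance.
        labels = [
            min(range(k), key=lambda c: sum((x - y) ** 2 for x, y in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def users_to_flag(user_vectors, known_users, k):
    """Return users that share a cluster with a known preset-category user."""
    names = sorted(user_vectors)
    labels = kmeans([user_vectors[name] for name in names], k)
    flagged = {label for name, label in zip(names, labels) if name in known_users}
    return [
        name for name, label in zip(names, labels)
        if label in flagged and name not in known_users
    ]
```

The users returned by `users_to_flag` correspond to the target associated sequences for which prompt information would then be generated.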
In some optional implementations of this embodiment, the apparatus 500 for generating prompt information may further include an adding unit configured to add the identification information of the target user to a preset category user information set in response to receiving information confirming that the target user indicated by the prompt information belongs to the preset category of users.
In some optional implementations of this embodiment, the apparatus 500 for generating prompt information may further include a third obtaining unit (not shown in the figure), a fourth generating unit (not shown in the figure), and a fifth generating unit (not shown in the figure). The third obtaining unit may be configured to obtain a behavior feature representation sequence set corresponding to the preset category user information set. The fourth generating unit may be configured to generate a user tag feature representation corresponding to each behavior feature representation sequence in the behavior feature representation sequence set. The fifth generating unit may be configured to generate key behavior feature information corresponding to the preset category of users.
The apparatus provided in the above embodiment of the present disclosure acquires, through the first acquisition unit 501, a behavior feature representation sequence of a target user. Wherein the behavior feature representations in the behavior feature representation sequence are generated based on operation data of the target user in a preset time period, and the operation data comprise data representing operation types and operation attributes. Then, the first generating unit 502 generates a metric value representing that the target user belongs to the preset category user according to the behavior feature representation sequence of the target user. Finally, the prompting unit 503 generates prompting information for characterizing that the target user belongs to the preset category user in response to the comparison of the determined metric value and the preset threshold value. Therefore, effective utilization of the operation data is realized, detailed information such as time sequence, operation behavior relevance and the like contained in the operation data is more comprehensively embodied, and a technical basis is provided for improving the recognition degree of abnormal users.
Referring now to fig. 6, a schematic diagram of an electronic device (e.g., server in fig. 1) 600 suitable for use in implementing embodiments of the present disclosure is shown. The terminal device in the embodiments of the present disclosure may include, but is not limited to, a mobile terminal such as a mobile phone, a notebook computer, etc., and a fixed terminal such as a digital TV, a desktop computer, etc. The server illustrated in fig. 6 is merely an example, and should not be construed as limiting the functionality and scope of use of the embodiments of the present disclosure in any way.
As shown in fig. 6, the electronic device 600 may include a processing means (e.g., a central processing unit, a graphics processor, etc.) 601, which may perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM) 602 or a program loaded from a storage means 608 into a Random Access Memory (RAM) 603. In the RAM603, various programs and data required for the operation of the electronic apparatus 600 are also stored. The processing device 601, the ROM 602, and the RAM603 are connected to each other through a bus 604. An input/output (I/O) interface 605 is also connected to bus 604.
In general, the following devices may be connected to the I/O interface 605: input devices 606 including, for example, a touch screen, touchpad, keyboard, mouse, camera, microphone, accelerometer, gyroscope, and the like; output devices 607 including, for example, a liquid crystal display (LCD), a speaker, a vibrator, and the like; storage devices 608 including, for example, magnetic tape, hard disk, and the like; and a communication device 609. The communication device 609 may allow the electronic device 600 to communicate with other devices wirelessly or by wire to exchange data. While fig. 6 shows an electronic device 600 having various devices, it is to be understood that not all of the illustrated devices are required to be implemented or provided. More or fewer devices may alternatively be implemented or provided. Each block shown in fig. 6 may represent one device or a plurality of devices as needed.
In particular, according to embodiments of the present disclosure, the processes described above with reference to flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method shown in the flowcharts. In such an embodiment, the computer program may be downloaded and installed from a network via communication means 609, or from storage means 608, or from ROM 602. The above-described functions defined in the methods of the embodiments of the present disclosure are performed when the computer program is executed by the processing means 601.
It should be noted that, the computer readable medium according to the embodiments of the present disclosure may be a computer readable signal medium or a computer readable storage medium, or any combination of the two. The computer readable storage medium can be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples of the computer-readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In an embodiment of the present disclosure, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. Whereas in embodiments of the present disclosure, the computer-readable signal medium may comprise a data signal propagated in baseband or as part of a carrier wave, with computer-readable program code embodied therein. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination of the foregoing. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. 
Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: electrical wires, fiber optic cables, RF (Radio Frequency), and the like, or any suitable combination thereof.
The computer readable medium may be contained in the server; or may exist alone without being assembled into the server. The computer readable medium carries one or more programs which, when executed by the server, cause the server to: acquiring a behavior feature representation sequence of a target user, wherein behavior feature representations in the behavior feature representation sequence are generated based on operation data of the target user in a preset time period, and the operation data comprise data representing operation types and operation attributes; generating a measurement value representing that the target user belongs to a preset class user according to the behavior characteristic representation sequence of the target user; and generating prompt information for representing that the target user belongs to the preset category user in response to the comparison of the determined metric value and the preset threshold value.
Computer program code for carrying out operations of embodiments of the present disclosure may be written in one or more programming languages, including object-oriented programming languages such as Java, Smalltalk, and C++, and conventional procedural programming languages such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units involved in the embodiments described in the present disclosure may be implemented by means of software, or may be implemented by means of hardware. The described units may also be provided in a processor, for example, described as: a processor comprises a first acquisition unit, a first generation unit and a prompt unit. The names of these units do not constitute a limitation of the unit itself in some cases, for example, the first acquisition unit may also be described as "a unit that acquires a behavior feature representation sequence of the target user, wherein the behavior feature representations in the behavior feature representation sequence are generated based on operation data of the target user for a preset period of time, the operation data including data characterizing the operation type and the operation attribute".
The foregoing description covers only the preferred embodiments of the present disclosure and the principles of the technology employed. It will be appreciated by those skilled in the art that the scope of the invention in the embodiments of the present disclosure is not limited to the specific combination of the above technical features, but also encompasses other technical solutions formed by any combination of the above technical features or their equivalents without departing from the spirit of the invention, for example, technical solutions formed by mutually substituting the above features with (but not limited to) features having similar functions disclosed in the embodiments of the present disclosure.

Claims (13)

1. A method for generating prompt information, comprising:
acquiring a behavior feature representation sequence of a target user, wherein behavior feature representations in the behavior feature representation sequence are generated based on operation data of the target user in a preset time period, and the operation data comprise data representing operation types and operation attributes;
generating a metric value representing that the target user belongs to a preset class user according to the behavior characteristic representation sequence of the target user, wherein the method comprises the following steps: acquiring behavior characteristic information corresponding to a user belonging to the preset category as target behavior characteristic information, wherein the behavior characteristic information comprises behavior characteristic representation; acquiring the number of the historical behavior feature representation sequences included in a preset historical behavior feature representation sequence set as a reference number; for the behavior feature representation in the behavior feature representation sequence of the target user, determining a first significance measure corresponding to the behavior feature representation, wherein the first significance measure is positively correlated with the reference number, and is negatively correlated with the number of historical behavior feature representation sequences comprising the historical behavior feature representation matched with the behavior feature representation in the historical behavior feature representation sequence set; generating a user tag feature representation corresponding to the behavior feature representation sequence of the target user based on the determined first significance measure; determining similarity measurement between the user tag characteristic representation and the target behavior characteristic information as a measurement value for representing that the target user belongs to a preset class user;
and generating prompt information characterizing that the target user belongs to the preset category of users in response to the result of comparing the metric value with a preset threshold.
2. The method of claim 1, wherein the obtaining a sequence of behavioral characteristic representations of the target user comprises:
acquiring operation data of the target user in a preset time period;
extracting operation data belonging to a preset category from the operation data to generate a target operation data set;
generating behavior characteristic representations corresponding to all operation behaviors based on operation data corresponding to the same operation behavior in the target operation data set;
and generating a behavior characteristic representation sequence of the target user according to the time sequence of the operation corresponding to the generated behavior characteristic representation.
3. The method of claim 2, wherein the generating behavior feature representations corresponding to the respective operation behaviors based on operation data corresponding to the same operation behavior in the target operation data set includes:
generating a one-hot code corresponding to each operation behavior according to the operation behavior corresponding to the target operation data in the target operation data set;
and generating, according to the generated one-hot codes, behavior feature representations corresponding to the one-hot codes by using a pre-trained word vector model.
4. A method according to claim 3, wherein the word vector model is trained by:
acquiring a sample user behavior characteristic representation sequence set;
acquiring an initial word vector model;
and training the initial word vector model by means of machine learning, taking the behavior feature representations belonging to the same sample user behavior feature representation sequence in the sample user behavior feature representation sequence set as a positive sample group, to obtain the word vector model.
5. The method of claim 1, wherein the generating, from the sequence of behavioral characteristic representations of the target user, a metric value characterizing the target user as belonging to a preset category of users comprises:
and inputting the behavior characteristic representation sequence of the target user into a pre-trained classification model to obtain the probability for representing the target user belonging to the preset class user as the measurement value.
6. The method of claim 1, wherein the generating a user tag feature representation corresponding to the sequence of behavioral feature representations of the target user based on the determined first saliency metric comprises:
determining a frequency measure of occurrence of each behavioral characteristic representation in the behavioral characteristic representation sequence of the target user;
determining a second saliency measure corresponding to the behavior feature representation in the behavior feature representation sequence of the target user, wherein the second saliency measure is positively correlated with the first saliency measure and is positively correlated with the frequency measure;
and selecting a corresponding behavior characteristic representation as the user tag characteristic representation according to the determined second significance measure.
7. The method of claim 1, wherein the method further comprises:
and generating a user behavior label corresponding to the behavior characteristic representation sequence of the target user based on the user label characteristic representation.
8. The method of claim 1, wherein the method further comprises:
acquiring an associated user behavior feature representation sequence set associated with the behavior feature representation sequence of the user, wherein the associated user behavior feature representation sequence set comprises at least one behavior feature representation sequence corresponding to a user belonging to the preset category of users;
clustering the associated user behavior feature representation sequences in the acquired associated user behavior feature representation sequence set to generate a target number of clustering results;
determining, as a target associated user behavior feature representation sequence, an associated user behavior feature representation sequence that belongs to the same clustering result as the behavior feature representation sequence corresponding to the user belonging to the preset category of users;
and generating prompt information characterizing whether the user corresponding to the determined target associated user behavior feature representation sequence belongs to the preset category of users.
9. The method according to one of claims 1-8, the method further comprising:
and in response to receiving information representing that the target user indicated by the prompt information belongs to a preset category user, adding the identification information of the target user into a preset category user information set.
10. The method of claim 9, the method further comprising:
acquiring a behavior characteristic representation sequence set corresponding to the preset category user information set;
generating a user tag feature representation corresponding to the behavior feature representation sequence in the behavior feature representation sequence set;
and generating key behavior characteristic information corresponding to the preset category user.
11. An apparatus for generating prompt information, comprising:
a first acquisition unit configured to acquire a behavior feature representation sequence of a target user, wherein behavior feature representations in the behavior feature representation sequence are generated based on operation data of the target user in a preset time period, the operation data including data characterizing an operation type and an operation attribute;
a first generating unit configured to generate a metric value characterizing that the target user belongs to a preset category of users according to the behavior feature representation sequence of the target user, wherein the generating comprises: acquiring behavior feature information corresponding to a user belonging to the preset category as target behavior feature information, wherein the behavior feature information comprises a behavior feature representation; acquiring the number of historical behavior feature representation sequences included in a preset historical behavior feature representation sequence set as a reference number; for a behavior feature representation in the behavior feature representation sequence of the target user, determining a first saliency measure corresponding to the behavior feature representation, wherein the first saliency measure is positively correlated with the reference number and is negatively correlated with the number of historical behavior feature representation sequences in the historical behavior feature representation sequence set that comprise a historical behavior feature representation matching the behavior feature representation; generating a user tag feature representation corresponding to the behavior feature representation sequence of the target user based on the determined first saliency measure; and determining a similarity measure between the user tag feature representation and the target behavior feature information as the metric value characterizing that the target user belongs to the preset category of users;
and a prompting unit configured to generate prompt information characterizing that the target user belongs to the preset category of users in response to the result of comparing the metric value with a preset threshold.
12. A server, comprising:
one or more processors;
a storage device having one or more programs stored thereon;
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of any of claims 1-10.
13. A computer readable medium having stored thereon a computer program, wherein the program when executed by a processor implements the method of any of claims 1-10.
CN202010893051.7A 2020-08-31 2020-08-31 Method, device, server and medium for generating prompt information Active CN113780318B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010893051.7A CN113780318B (en) 2020-08-31 2020-08-31 Method, device, server and medium for generating prompt information

Publications (2)

Publication Number Publication Date
CN113780318A (en) 2021-12-10
CN113780318B (en) 2024-04-16

Family

ID=78835242

Family Applications (1)

Application Number Status Publication Number Priority Date Filing Date Title
CN202010893051.7A Active CN113780318B (en) 2020-08-31 2020-08-31 Method, device, server and medium for generating prompt information

Country Status (1)

Country Link
CN (1) CN113780318B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114885183A (en) * 2022-04-21 2022-08-09 武汉斗鱼鱼乐网络科技有限公司 Method, device, medium and equipment for identifying gift package risk user

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109388548A (en) * 2018-09-29 2019-02-26 北京京东金融科技控股有限公司 Method and apparatus for generating information
CN109410036A (en) * 2018-10-09 2019-03-01 北京芯盾时代科技有限公司 A kind of fraud detection model training method and device and fraud detection method and device
WO2019052283A1 (en) * 2017-09-13 2019-03-21 乐蜜有限公司 Fraud prevention method, operation detection method and apparatus, and electronic device
CN110798440A (en) * 2019-08-13 2020-02-14 腾讯科技(深圳)有限公司 Abnormal user detection method, device and system and computer storage medium
CN111177725A (en) * 2019-12-31 2020-05-19 广州市百果园信息技术有限公司 Method, device, equipment and storage medium for detecting malicious click operation

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10635565B2 (en) * 2017-10-04 2020-04-28 Servicenow, Inc. Systems and methods for robust anomaly detection
US11336668B2 (en) * 2019-01-14 2022-05-17 Penta Security Systems Inc. Method and apparatus for detecting abnormal behavior of groupware user

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
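A new intrusion detection method based on clustering algorithms and sequence anomaly techniques (基于聚类算法与序列异常技术的入侵检测新方法); Liu Shaohai (刘绍海); Liu Qingkun (刘青昆); An Na (安娜); Gu Yueju (顾跃举); Computer Security (计算机安全) (08); full text *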

Also Published As

Publication number Publication date
CN113780318A (en) 2021-12-10

Similar Documents

Publication Publication Date Title
US10558984B2 (en) Method, apparatus and server for identifying risky user
CN109104620B (en) Short video recommendation method and device and readable medium
CN109492772B (en) Method and device for generating information
CN112801719A (en) User behavior prediction method, user behavior prediction device, storage medium, and apparatus
CN111915086A (en) Abnormal user prediction method and equipment
CN114090601B (en) Data screening method, device, equipment and storage medium
US20230281696A1 (en) Method and apparatus for detecting false transaction order
CN109582854B (en) Method and apparatus for generating information
CN113392920B (en) Method, apparatus, device, medium, and program product for generating cheating prediction model
CN112995414B (en) Behavior quality inspection method, device, equipment and storage medium based on voice call
CN110674300A (en) Method and apparatus for generating information
CN113780318B (en) Method, device, server and medium for generating prompt information
CN111787042B (en) Method and device for pushing information
US20220198487A1 (en) Method and device for processing user interaction information
CN116578925A (en) Behavior prediction method, device and storage medium based on feature images
CN114493850A (en) Artificial intelligence-based online notarization method, system and storage medium
CN111127057B (en) Multi-dimensional user portrait recovery method
CN111897951A (en) Method and apparatus for generating information
CN116911304B (en) Text recommendation method and device
CN116776160B (en) Data processing method and related device
CN117172632B (en) Enterprise abnormal behavior detection method, device, equipment and storage medium
CN114417944B (en) Recognition model training method and device, and user abnormal behavior recognition method and device
CN113761387A (en) Object label determination method and device, electronic equipment and storage medium
CN113779647A (en) Method and apparatus for generating information
CN113779980A (en) Method and device for recognizing text

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant