CN111709765A - User portrait scoring method and device and storage medium - Google Patents


Info

Publication number
CN111709765A
Authority
CN
China
Prior art keywords
behavior
behavior data
data sequence
target user
classifier
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010217512.9A
Other languages
Chinese (zh)
Inventor
雷璟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Electronic Science Research Institute of CTEC
Original Assignee
Electronic Science Research Institute of CTEC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Electronic Science Research Institute of CTEC filed Critical Electronic Science Research Institute of CTEC
Priority to CN202010217512.9A priority Critical patent/CN111709765A/en
Publication of CN111709765A publication Critical patent/CN111709765A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201 Market modelling; Market analysis; Collecting market data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2411 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201 Market modelling; Market analysis; Collecting market data
    • G06Q30/0203 Market surveys; Market polls
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0282 Rating or review of business operators or products

Abstract

The invention provides a user portrait scoring method, a user portrait scoring device and a storage medium, which are used for improving the accuracy and timeliness of user portrait evaluation. The user portrait scoring method comprises the following steps: acquiring a current behavior data sequence of a target user, wherein the current behavior data sequence comprises a plurality of behavior data; determining an abnormal score corresponding to the target user by utilizing a behavior classifier set obtained by pre-training based on the current behavior data sequence, wherein the behavior classifier set comprises a plurality of behavior classifiers, and each behavior classifier is obtained by using the historical single-class behavior data sequence of the target user through training; searching a corresponding first hidden state according to the observation state corresponding to the current behavior data sequence; and determining the portrait score corresponding to the target user according to the transition probability from the first hidden state to the second hidden state and the abnormal score.

Description

User portrait scoring method and device and storage medium
Technical Field
The invention relates to the technical field of artificial intelligence, in particular to a user portrait scoring method, a user portrait scoring device and a storage medium.
Background
A user portrait is a way of depicting users and connecting user requirements with product design. It aims to extract as complete and detailed a picture of the user as possible from massive user behavior data, so that the collected data has practical significance and value. As application fields expand, the concept of the user portrait is gradually changing. Generally, user portrait technology tags user information: it comprehensively mines content such as the personal information, social information, and behavior information of a user in order to classify different users. Its core is to represent and store the potential intentions and interests of the user, and to derive a readable and computable user model from the user's basic information, browsing information, behavior preferences, and the like.
In the field of network security, user profiling techniques are also used to detect and analyze potentially malicious behavior. A behavior baseline is established for normal operation by adopting a user portrait technology, and the change of a user behavior mode can be effectively detected by carrying out comparative analysis, so that the security risk of an enterprise is reduced to a certain extent.
From a technical standpoint, a behavior modeling scheme based on the user portrait is divided into three parts: data collection, data preprocessing, and comprehensive analysis. First, data of multiple categories is collected and output as standardized data in a uniform format. Then, a series of user tags is defined for the attribute types of the different tags, which preprocesses the collected data. Finally, tag values are generated from a series of attributes to form the portrait tags of the user. On the basis of the portrait tags, a corresponding model can be constructed using either rules or an algorithmic scoring approach.
The user can be scored by applying a scoring mechanism to the portrait data. Scoring approaches are divided into the template rule method and scoring algorithms. In the template rule method, the scoring standard is currently given by an engineer or by the platform that manages privileged accounts; from the abnormal score of each abnormal behavior that exceeds the baseline, the abnormal score of the user portrait is obtained, and high-risk privileged users and their high-risk behaviors are reported.
In a traditional portrait scoring mechanism based on the template rule method, engineers or the platform that manages privileged accounts give the scoring standard, a risk score is then obtained from the abnormal access operation type tags owned by a user, and finally the users with the highest scores and their high-risk behaviors are reported. However, the template rule method depends on manual definitions and cannot dynamically follow user behavior, which leads to false alarms and missed alarms for abnormal behaviors and reduces the accuracy of portrait evaluation.
Disclosure of Invention
The embodiment of the invention provides a user portrait scoring method, a user portrait scoring device and a storage medium, which are used for improving the accuracy and timeliness of user portrait evaluation.
In a first aspect, a user portrait scoring method is provided, including:
acquiring a current behavior data sequence of a target user, wherein the current behavior data sequence comprises a plurality of behavior data;
determining an abnormal score corresponding to the target user by utilizing a behavior classifier set obtained by pre-training based on the current behavior data sequence, wherein the behavior classifier set comprises a plurality of behavior classifiers, and each behavior classifier is obtained by using the historical single-class behavior data sequence of the target user through training;
searching a corresponding first hidden state according to the observation state corresponding to the current behavior data sequence;
and determining the portrait score corresponding to the target user according to the transition probability from the first hidden state to the second hidden state and the abnormal score.
In one embodiment, any behavior classifier in the behavior classifier set is trained using the target user historical single-class behavior data sequence according to the following procedure:
dividing the historical single-class behavior data sequence of the target user into a plurality of behavior blocks at fixed time intervals according to a time sequence, wherein each behavior block comprises behavior data in the fixed time interval;
and respectively training by using the behavior data in each behavior block to obtain a corresponding behavior classifier.
In an embodiment, the training by using the behavior data in each behavior block to obtain a corresponding behavior classifier specifically includes:
training a one-class support vector machine (OC-SVM) on the behavior data in each behavior block, respectively, to obtain the corresponding behavior classifier.
In an embodiment, determining, based on the current behavior data sequence, an abnormal score corresponding to the target user by using a behavior classifier set obtained through pre-training specifically includes:
respectively inputting each behavior data contained in the current behavior data sequence into each behavior classifier contained in the behavior classifier set;
and determining the average value of the scores output by each behavior classifier as the abnormal score corresponding to the behavior data.
In one embodiment, the portrait score corresponding to the target user is determined according to the transition probability from the first hidden state to the second hidden state and the anomaly score according to the following method:
[Formula image not reproduced in this text: S expressed in terms of P_12, m, and S_i]
wherein:
S represents the portrait score corresponding to the target user;
P_12 represents the transition probability from the first hidden state to the second hidden state;
m represents the number of behavior data contained in the current behavior data sequence;
S_i represents the abnormal score corresponding to the i-th behavior data contained in the current behavior data sequence.
In a second aspect, a user portrait scoring apparatus is provided, comprising:
an acquisition unit, configured to acquire a current behavior data sequence of a target user, wherein the current behavior data sequence comprises a plurality of behavior data;
a first determining unit, configured to determine, based on the current behavior data sequence, an abnormal score corresponding to the target user by using a behavior classifier set obtained through pre-training, where the behavior classifier set includes a plurality of behavior classifiers, and each behavior classifier is obtained through training using the historical single-class behavior data sequence of the target user;
the searching unit is used for searching a corresponding first hidden state according to the observation state corresponding to the current behavior data sequence;
and the second determining unit is used for determining the portrait score corresponding to the target user according to the transition probability from the first hidden state to the second hidden state and the abnormal score.
In an implementation manner, the apparatus for scoring a user portrait according to an embodiment of the present invention further includes:
the training unit is used for dividing the historical single-class behavior data sequence of the target user into a plurality of behavior blocks at fixed time intervals according to a time sequence, and each behavior block comprises behavior data in the fixed time interval; and respectively training by using the behavior data in each behavior block to obtain a corresponding behavior classifier.
In an embodiment, the training unit is specifically configured to train a one-class support vector machine (OC-SVM) on the behavior data in each behavior block to obtain the corresponding behavior classifier.
In an embodiment, the first determining unit is specifically configured to input each behavior data included in the current behavior data sequence into each behavior classifier included in the behavior classifier set respectively; and determining the average value of the scores output by each behavior classifier as the abnormal score corresponding to the behavior data.
In an embodiment, the second determining unit is specifically configured to determine the portrait score corresponding to the target user according to the following method:
[Formula image not reproduced in this text: S expressed in terms of P_12, m, and S_i]
wherein:
S represents the portrait score corresponding to the target user;
P_12 represents the transition probability from the first hidden state to the second hidden state;
m represents the number of behavior data contained in the current behavior data sequence;
S_i represents the abnormal score corresponding to the i-th behavior data contained in the current behavior data sequence.
In a third aspect, a computing device is provided, the computing device comprising: a memory, a processor and a computer program stored on the memory and executable on the processor, the computer program, when executed by the processor, implementing the steps of any of the methods described above.
In a fourth aspect, a computer storage medium is provided, on which a computer program is stored, which, when being executed by a processor, carries out the steps of any of the methods described above.
By adopting the technical scheme, the invention at least has the following advantages:
According to the user portrait scoring method, device, and storage medium, the behavior classifier set is obtained by training on the historical single-class behavior data sequences of the target user, so that changes in the target user's behavior pattern are learned over time, the behavior pattern is updated online, and the timeliness of user portrait evaluation is improved. In addition, the degree of deviation of the user's behavior is judged by combining historical behavior deviation with a single classification model and is used as the abnormal score; this uses the information in the user's normal historical behavior, facilitates comparison between current and historical behavior, and improves the accuracy of user portrait evaluation.
Drawings
FIG. 1 is a schematic diagram of a training process of a behavior classifier set according to an embodiment of the present invention;
FIG. 2 is a diagram illustrating the operation of an OC-SVM in accordance with an embodiment of the present invention;
FIG. 3 is a schematic diagram illustrating a user portrait scoring process according to an embodiment of the present invention;
FIG. 4a is a schematic diagram of the principle of using OC-SVM classification in combination with a hidden Markov model to determine a user portrait score in accordance with an embodiment of the present invention;
FIG. 4b is a schematic diagram of an implementation principle of a user-portrait-based behavior modeling technique according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of a user portrait scoring apparatus according to an embodiment of the present invention;
FIG. 6 is a schematic diagram of a computing device according to an embodiment of the invention.
Detailed Description
To further explain the technical means and effects of the present invention adopted to achieve the intended purpose, the present invention will be described in detail with reference to the accompanying drawings and preferred embodiments.
First, some terms related to the embodiments of the present invention are explained to facilitate understanding by those skilled in the art.
It should be noted that the terms "first", "second", and the like in the description and the claims of the embodiments of the present invention and in the drawings described above are used for distinguishing similar objects and not necessarily for describing a particular order or sequence. It will be appreciated that the data so used may be interchanged under appropriate circumstances such that the embodiments described herein may be practiced otherwise than as specifically illustrated or described herein.
Reference herein to "a plurality" or "a number" means two or more. "And/or" describes the association relationship of the associated objects and means that three relationships are possible; for example, "A and/or B" may mean: A exists alone, A and B exist simultaneously, or B exists alone. The character "/" generally indicates that the associated objects before and after it are in an "or" relationship.
The invention aims to adopt a scoring mechanism based on historical behavior deviation: the historical deviation is combined with a single classification model, and the degree of deviation of the user's behavior is judged and used as the abnormal score. To address the errors caused by relying on a single classification model, models are trained on a plurality of time blocks to build an ensemble (integrated) model; each model separately evaluates and predicts a new, possibly abnormal sequence, and the results are analyzed together to give the final scoring result, so as to depict the global behavior pattern of the user more accurately.
In order to improve the accuracy of user portrait anomaly scoring, in the embodiment of the invention, a behavior classifier set is obtained by training a target user historical single-class behavior data sequence, wherein the historical single-class behavior data sequence comprises a plurality of behavior data generated by historical access of the target user. As shown in fig. 1, it is a schematic diagram of a training process of a behavior classifier set, and includes the following steps:
S11: Dividing the historical single-class behavior data sequence of the target user into a plurality of behavior blocks at fixed time intervals in chronological order.
In specific implementation, different types of historical behavior data of the user are collected, for example, target user behavior data contained in server log data, user behavior data generated by a web front end, traffic data of the target user, and the like are collected. Dividing the collected behavior data of each type into different behavior blocks by taking a certain fixed time window as a unit according to a time sequence, wherein each behavior block comprises the behavior data in the fixed time interval. In order to fully describe the behavior pattern of the target user in a period of time, in a specific implementation, each behavior block may contain behavior data of a user working day and a user resting day.
S12: Training on the behavior data in each behavior block, respectively, to obtain a corresponding behavior classifier.
In one embodiment, a one-class support vector machine (OC-SVM) may be trained on the behavior data in each behavior block to obtain the corresponding behavior classifier. Training OC-SVM models on a plurality of time blocks and combining them into an ensemble (integrated) model addresses the errors caused by a single classification model, and each model can separately evaluate and predict a new, possibly abnormal sequence. The OC-SVM mainly characterizes the details of individual user behaviors and can effectively judge whether a single behavior is abnormal.
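Steps S11 and S12 can be illustrated with a minimal sketch, assuming scikit-learn's OneClassSVM as the OC-SVM implementation. The helper names, the synthetic two-dimensional behavior vectors, the 7-day block length, and the hyperparameters are all assumptions made for illustration; the patent does not specify them.

```python
from datetime import datetime, timedelta

import numpy as np
from sklearn.svm import OneClassSVM


def split_into_blocks(records, start, interval):
    """Group (timestamp, feature_vector) records into consecutive fixed-length time blocks."""
    blocks = {}
    for ts, vec in sorted(records, key=lambda r: r[0]):
        index = (ts - start) // interval  # which fixed interval the record falls in
        blocks.setdefault(index, []).append(vec)
    return [np.asarray(blocks[i]) for i in sorted(blocks)]


def train_block_classifiers(blocks, nu=0.1):
    """Fit one OC-SVM per behavior block (hyperparameters are illustrative assumptions)."""
    return [OneClassSVM(kernel="rbf", gamma="scale", nu=nu).fit(b) for b in blocks]


# Synthetic history (assumption): 14 days, 20 two-dimensional behavior vectors per day.
rng = np.random.default_rng(0)
start = datetime(2020, 1, 1)
records = [
    (start + timedelta(days=d, hours=int(h)), rng.normal(0.0, 1.0, size=2))
    for d in range(14)
    for h in rng.integers(0, 24, size=20)
]

# A 7-day block covers both working days and rest days.
blocks = split_into_blocks(records, start, interval=timedelta(days=7))
classifiers = train_block_classifiers(blocks)

typical = np.array([[0.0, 0.0]])
unusual = np.array([[10.0, 10.0]])
for clf in classifiers:
    # decision_function is larger for points that resemble the training block.
    print(clf.decision_function(typical)[0], clf.decision_function(unusual)[0])
```

Each classifier sees only the data of its own block, so a behavior that was common in one period and rare in another receives different scores from different classifiers.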
In a specific implementation, the classifiers trained on the n most recent data blocks are kept as a set M = {M1, M2, …, Mn}, which forms an OC-SVM cluster. When new data arrives, the average of the scores output by the classifiers in M is taken as the abnormal score of the new data, and the corresponding access operation type label is obtained at the same time. FIG. 2 is a schematic diagram of the OC-SVM operation. The arrows in the figure indicate the dependencies between variables. At any time, the value of an observation variable depends only on the corresponding state variable and is independent of the values of the other state variables and observation variables. Likewise, the state y_t at time t depends only on the state y_{t-1} at time t-1 and is independent of the remaining states.
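The idea of keeping only the n most recent classifiers and averaging their scores can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: `ClassifierWindow` and `make_stub_scorer` are hypothetical names, and the stub scorer is a simple stand-in for a trained OC-SVM so that the example stays self-contained.

```python
from collections import deque
from statistics import fmean


class ClassifierWindow:
    """Keeps the n most recent block classifiers and averages their scores."""

    def __init__(self, n):
        self._models = deque(maxlen=n)  # the oldest classifier is dropped automatically

    def add(self, scorer):
        self._models.append(scorer)

    def anomaly_score(self, behavior):
        if not self._models:
            raise ValueError("no classifiers have been trained yet")
        return fmean(scorer(behavior) for scorer in self._models)


def make_stub_scorer(center):
    """Hypothetical stand-in for a trained OC-SVM: score decays with distance from `center`."""
    return lambda x: 1.0 / (1.0 + abs(x - center))


window = ClassifierWindow(n=3)
for center in (0.0, 1.0, 2.0, 3.0):  # four blocks arrive; only the last three are kept
    window.add(make_stub_scorer(center))

print(window.anomaly_score(2.0))   # close to the recent blocks
print(window.anomaly_score(50.0))  # far from every block
```

In this sketch a higher score means the behavior looks more like the recent history, which matches the convention that frequent historical activities push the final score toward 1. Dropping the oldest classifier when a new one is added is one simple way to let the ensemble follow gradual changes in behavior.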
Based on the trained behavior classifier set, in the embodiment of the present invention, the user portrait score may be determined according to the following process, as shown in fig. 3, the method may include the following steps:
S31: Acquiring the current behavior data sequence of the target user.
Wherein, the current behavior data sequence comprises a plurality of behavior data.
S32: Determining, based on the current behavior data sequence, the abnormal score corresponding to the target user by using the pre-trained behavior classifier set.
Specifically, in this step, each behavior data included in the current behavior data sequence may be respectively input into each behavior classifier included in the behavior classifier set; and determining the average value of the scores output by each behavior classifier as the abnormal score corresponding to the behavior data.
S33: Searching for the corresponding first hidden state according to the observation state corresponding to the current behavior data sequence.
In this step, a hidden Markov model (HMM) is used to model the time sequence. The hidden Markov model is the dynamic Bayesian network with the simplest structure; it is a well-known directed graphical model and is mainly used for modeling time-series data. A hidden Markov model is a Markov chain whose states cannot be observed directly but can be inferred from a sequence of observation vectors: each observation vector is generated by a state according to the probability density distribution associated with that state, so that the observation sequence is generated by a state sequence with corresponding probability density distributions.
The correspondence between the observation state and the hidden state may be obtained by manual judgment or by machine learning, which is not limited in the embodiment of the present invention.
S34: Determining the portrait score corresponding to the target user according to the abnormal score and the transition probability from the first hidden state to the second hidden state.
In this step, the portrait score corresponding to the target user can be determined from the abnormal score and the transition probability from the first hidden state to the second hidden state according to the following formula:
[Formula image not reproduced in this text: S expressed in terms of P_12, m, and S_i]
wherein:
S represents the portrait score corresponding to the target user;
P_12 represents the transition probability from the first hidden state to the second hidden state;
m represents the number of behavior data contained in the current behavior data sequence;
S_i represents the abnormal score corresponding to the i-th behavior data contained in the current behavior data sequence.
The transition probability of the first hidden state to the second hidden state can be obtained by a machine learning method.
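One simple way to learn transition probabilities, shown here only as a sketch, is to count transitions in past state sequences and normalize the counts. This assumes the hidden states of the historical sequences are already known (for example, labeled by an analyst); the state names and the function `estimate_transitions` are hypothetical. When the states are not labeled, an HMM is typically fitted with the Baum-Welch algorithm instead.

```python
from collections import Counter, defaultdict


def estimate_transitions(state_sequences):
    """Maximum-likelihood transition probabilities from observed state sequences."""
    counts = defaultdict(Counter)
    for seq in state_sequences:
        for prev, nxt in zip(seq, seq[1:]):
            counts[prev][nxt] += 1
    return {
        prev: {nxt: c / sum(following.values()) for nxt, c in following.items()}
        for prev, following in counts.items()
    }


# Hypothetical hidden states (business activities) for three past sessions.
history = [
    ["login", "query", "export", "logout"],
    ["login", "query", "query", "logout"],
    ["login", "export", "logout"],
]
P = estimate_transitions(history)

print(P["login"])                     # distribution over the states that follow "login"
print(P["query"].get("export", 0.0))  # P_12 for y_1 = "query", y_2 = "export"
```

A transition never seen in the history gets probability 0 here; a real system would probably apply smoothing so that a single unseen transition does not dominate the final score.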
FIG. 4a is a schematic diagram of the principle of using OC-SVM classification in combination with a hidden Markov model to determine a user portrait score. The behavior sequence {a_11, a_12, …, a_1n} forms an observation state x_1, and x_1 corresponds to the hidden state y_1. P_12 denotes the transition probability from hidden state y_1 to y_2. When a new sequence {a_21, a_22, …, a_2m} arrives, P_12 can be calculated. For each behavior a_2i, the abnormal score S_i of that behavior can be obtained by using the previously trained OC-SVM classifier cluster. Finally, the portrait score of the current behavior data sequence is:
[Formula image not reproduced in this text: S expressed in terms of P_12, m, and S_i]
When the probability of transition from hidden state y_1 to y_2 is very high and the activities in the behavior sequence are ones the user has frequently performed in the past, the value of S approaches 1. Conversely, when the probability of transition from hidden state y_1 to y_2 is small, or when historically rare activities occur in the behavior sequence, the value of S approaches 0. Finally, whether the current behavior is abnormal is determined according to a set score threshold. When the behavior is judged abnormal, the system sends an alarm to the security operation and maintenance personnel; when it is judged normal, the current behavior data is stored in a full-text search engine and the historical user behavior pattern is updated.
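The threshold decision can be sketched as below. The scoring formula itself appears only as an image in the source, so the combination used here, P_12 multiplied by the mean of the per-behavior scores S_i, is purely an assumption chosen because it matches the limiting behavior described above. The per-behavior scores are also assumed to be normalized to [0, 1], and the function names, the threshold of 0.5, and the in-memory list standing in for the full-text search engine are hypothetical.

```python
from statistics import fmean


def portrait_score(p12, anomaly_scores):
    """ASSUMED combination: transition probability times the mean per-behavior score."""
    if not anomaly_scores:
        raise ValueError("the current behavior data sequence is empty")
    return p12 * fmean(anomaly_scores)


def handle_sequence(p12, anomaly_scores, threshold, history, behaviors):
    """Alert on abnormal sequences; otherwise append them to the history store."""
    score = portrait_score(p12, anomaly_scores)
    if score < threshold:
        return score, "alert"  # would notify security operation and maintenance personnel
    history.extend(behaviors)  # stand-in for writing to the full-text search engine
    return score, "stored"


history = []
print(handle_sequence(0.9, [0.95, 0.90, 0.85], 0.5, history, ["query", "query", "logout"]))
print(handle_sequence(0.1, [0.90, 0.20, 0.10], 0.5, history, ["export", "export", "delete"]))
```

Only sequences judged normal are added to the history, so the data later used to retrain classifiers is not polluted by behavior that triggered an alarm.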
The portrait scoring can be used in the abnormal behavior monitoring scene, malicious actions existing inside the system can be visually displayed by scoring the behaviors of the user, and meanwhile, reference is provided for predicting the future behaviors of the abnormal user.
The scheme of the invention adopts a scoring mechanism that combines historical behavior deviation with a single classification model to judge the degree of deviation of the user's behavior, which is used as the abnormal score. A hidden Markov model (HMM) is used to model the time sequence, and one-class support vector machines (OC-SVMs) trained on a plurality of time blocks are combined into an ensemble (integrated) model to address the errors caused by a single classification model; each model separately evaluates and predicts a new, possibly abnormal sequence, and the results are analyzed together to give the final scoring result.
Fig. 4b is a schematic diagram illustrating an implementation principle of a behavior modeling technique based on a user portrait according to an embodiment of the present invention.
The method provided by the embodiment of the invention adopts a scoring mechanism based on historical behavior deviation: the historical deviation is combined with a single classification model, and the degree of deviation of the user's behavior is judged and used as the abnormal score. In a specific implementation, the processed behavior category labels and behavior vectors are input into a behavior portrait processor, which calculates the abnormal score. The method considers both the details of daily operations and the business logic relationships, integrating point-level information about behavior details and line-level information about behavior processes into surface-level information about user behavior. The behavior details make full use of the information in the user's normal historical behavior, which makes it easy to compare current behavior with historical behavior. The behavior sequence reveals the internal logic with which the user carries out business and reflects the user's daily working habits. Combining the two provides data support for judging abnormal behavior. This process addresses the errors caused by a single classification model: models are trained on a plurality of time blocks to build an ensemble (integrated) model, each model separately evaluates and predicts a new, possibly abnormal sequence, and the results are analyzed together to give the final scoring result, so that the global behavior pattern of the user can be described accurately. In addition, by using an ensemble learning method in which recent base classifiers replace earlier ones, the concept drift caused by behavior changing over time can be mitigated to a certain extent. When user behavior is judged to be normal, it is promptly added to the historical behavior library so that the user portrait can be updated in time.
The user portrait scoring method provided by the embodiment of the invention adopts a scoring mechanism that combines historical deviation with a single classification model to judge the degree of deviation of user behavior. Experimental verification shows that using the OC-SVM as the base classifier effectively reduces the false alarms and missed alarms caused by overfitting in a single model, allows changes in the user's behavior pattern to be learned over time, realizes online updating of the behavior pattern, and improves the robustness and stability of the modeling. Using the hidden Markov model to extract the user's global behavior sequence reveals the business logic hidden behind the behavior and allows the transition probabilities of business states to be predicted, so that the user's global behavior pattern is better described.
Based on the same technical concept, an embodiment of the present invention further provides a user portrait scoring apparatus, as shown in fig. 5, including:
an obtaining unit 51, configured to obtain a current behavior data sequence of a target user, where the current behavior data sequence includes a plurality of behavior data;
a first determining unit 52, configured to determine, based on the current behavior data sequence, an abnormal score corresponding to the target user by using a behavior classifier set obtained through pre-training, where the behavior classifier set includes a plurality of behavior classifiers, and each behavior classifier is obtained through training using the historical single-class behavior data sequence of the target user;
the searching unit 53 is configured to search, according to the observation state corresponding to the current behavior data sequence, a first hidden state corresponding to the current behavior data sequence;
and a second determining unit 54, configured to determine the portrait score corresponding to the target user according to the transition probability from the first hidden state to the second hidden state and the abnormal score.
In an implementation manner, the apparatus for scoring a user portrait according to an embodiment of the present invention further includes:
the training unit is used for dividing the historical single-class behavior data sequence of the target user into a plurality of behavior blocks at fixed time intervals according to a time sequence, and each behavior block comprises behavior data in the fixed time interval; and respectively training by using the behavior data in each behavior block to obtain a corresponding behavior classifier.
In an embodiment, the training unit is specifically configured to train a one-class support vector machine (OC-SVM) on the behavior data in each behavior block to obtain the corresponding behavior classifier.
In an embodiment, the first determining unit is specifically configured to input each behavior data contained in the current behavior data sequence into each behavior classifier contained in the behavior classifier set, and to determine, for each behavior data, the average value of the scores output by the behavior classifiers as the abnormal score corresponding to that behavior data.
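The averaging step of the first determining unit can be sketched as follows. Each behavior classifier is modeled here as a hypothetical callable that maps one behavior data item (a feature vector) to a numeric score; how a trained OC-SVM produces that score is outside the scope of this sketch.

```python
from statistics import mean
from typing import Callable, List, Sequence

Features = Sequence[float]
Scorer = Callable[[Features], float]  # hypothetical classifier interface


def anomaly_scores(sequence: Sequence[Features],
                   classifiers: Sequence[Scorer]) -> List[float]:
    """For each behavior data item, average the scores of all behavior classifiers."""
    if not classifiers:
        raise ValueError("the behavior classifier set must not be empty")
    return [mean([classify(item) for classify in classifiers]) for item in sequence]
```

The result contains one abnormal score per behavior data item, in the same order as the input sequence.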
In an embodiment, the second determining unit is specifically configured to determine the portrait score corresponding to the target user according to the following method:
Figure BDA0002424954000000111
wherein:
S represents the portrait score corresponding to the target user;
P_12 represents the transition probability from the first hidden state to the second hidden state;
m represents the number of behavior data contained in the current behavior data sequence;
S_i represents the abnormal score corresponding to the i-th behavior data contained in the current behavior data sequence.
For convenience of description, the above parts are separately described as modules (or units) according to functional division. Of course, the functionality of the various modules (or units) may be implemented in the same or in multiple pieces of software or hardware in practicing the invention.
Having described the user portrait scoring method and apparatus of an exemplary embodiment of the present invention, a computing apparatus according to another exemplary embodiment of the present invention is next described.
As will be appreciated by one skilled in the art, aspects of the present invention may be embodied as a system, method or program product. Thus, various aspects of the invention may be embodied in the form of: an entirely hardware embodiment, an entirely software embodiment (including firmware, microcode, etc.) or an embodiment combining hardware and software aspects that may all generally be referred to herein as a "circuit," "module," or "system."
In some possible embodiments, a computing device according to the present invention may include at least one processor and at least one memory. The memory stores program code that, when executed by the processor, causes the processor to perform the steps of the user portrait scoring method according to the various exemplary embodiments of the present invention described above in this specification. For example, the processor may execute the steps shown in fig. 3: step S31, obtaining a current behavior data sequence of the target user; step S32, determining an abnormal score corresponding to the target user by using a pre-trained behavior classifier set based on the current behavior data sequence; step S33, searching for a corresponding first hidden state according to the observation state corresponding to the current behavior data sequence; and step S34, determining the portrait score corresponding to the target user according to the abnormal score and the transition probability from the first hidden state to the second hidden state.
The computing device 60 according to this embodiment of the invention is described below with reference to fig. 6. The computing device 60 shown in fig. 6 is only an example and should not impose any limitations on the functionality or scope of use of embodiments of the present invention.
As shown in fig. 6, the computing apparatus 60 is in the form of a general purpose computing device. Components of computing device 60 may include, but are not limited to: the at least one processor 61, the at least one memory 62, and a bus 63 connecting the various system components (including the memory 62 and the processor 61).
Bus 63 represents one or more of any of several types of bus structures, including a memory bus or memory controller, a peripheral bus, a processor, or a local bus using any of a variety of bus architectures.
The memory 62 may include readable media in the form of volatile memory, such as Random Access Memory (RAM) 621 and/or cache memory 622, and may further include Read Only Memory (ROM) 623.
The memory 62 may also include a program/utility 625 having a set (at least one) of program modules 624, such program modules 624 including, but not limited to: an operating system, one or more application programs, other program modules, and program data, each of which, or some combination thereof, may comprise an implementation of a network environment.
Computing device 60 may also communicate with one or more external devices 64 (e.g., keyboard, pointing device, etc.), with one or more devices that enable a user to interact with computing device 60, and/or with any devices (e.g., router, modem, etc.) that enable computing device 60 to communicate with one or more other computing devices. Such communication may be through an input/output (I/O) interface 65. Also, computing device 60 may communicate with one or more networks (e.g., a Local Area Network (LAN), a Wide Area Network (WAN), and/or a public network, such as the internet) through network adapter 66. As shown, network adapter 66 communicates with other modules for computing device 60 over bus 63. It should be understood that although not shown in the figures, other hardware and/or software modules may be used in conjunction with computing device 60, including but not limited to: microcode, device drivers, redundant processors, external disk drive arrays, RAID systems, tape drives, and data backup storage systems, among others.
In some possible embodiments, the various aspects of the user portrait scoring method provided by the present invention may also be implemented as a program product that includes program code. When the program product runs on a computer device, the program code causes the computer device to perform the steps of the user portrait scoring method according to the various exemplary embodiments of the present invention described above in this specification. For example, the computer device may execute the steps shown in fig. 3: step S31, obtaining a current behavior data sequence of a target user; step S32, determining an abnormal score corresponding to the target user by using a pre-trained behavior classifier set based on the current behavior data sequence; step S33, searching for a corresponding first hidden state according to the observation state corresponding to the current behavior data sequence; and step S34, determining the portrait score corresponding to the target user according to the abnormal score and the transition probability from the first hidden state to the second hidden state.
The program product may employ any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. A readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium include: an electrical connection having one or more wires, a portable disk, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
A program product for user portrait scoring in accordance with embodiments of the present invention may employ a portable compact disc read-only memory (CD-ROM), include program code, and be run on a computing device. However, the program product of the present invention is not limited in this regard; in this document, a readable storage medium may be any tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device.
A readable signal medium may include a propagated data signal with readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A readable signal medium may also be any readable medium that is not a readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied on a readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Program code for carrying out operations for aspects of the present invention may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, C++ or the like and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on the remote computing device or server. In the case of a remote computing device, the remote computing device may be connected to the user computing device over any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computing device (e.g., over the internet using an internet service provider).
While the present invention has been described with reference to particular embodiments, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the spirit and scope of the invention as defined by the appended claims.

Claims (10)

1. A user portrait scoring method, comprising:
acquiring a current behavior data sequence of a target user, wherein the current behavior data sequence comprises a plurality of behavior data;
determining an abnormal score corresponding to the target user by utilizing a behavior classifier set obtained by pre-training based on the current behavior data sequence, wherein the behavior classifier set comprises a plurality of behavior classifiers, and each behavior classifier is obtained by using the historical single-class behavior data sequence of the target user through training;
searching a corresponding first hidden state according to the observation state corresponding to the current behavior data sequence;
and determining the portrait score corresponding to the target user according to the transition probability from the first hidden state to the second hidden state and the abnormal score.
2. The method of claim 1, wherein any behavior classifier in the set of behavior classifiers is trained using the target user historical single-class behavior data sequence according to the following procedure:
dividing the historical single-class behavior data sequence of the target user into a plurality of behavior blocks at fixed time intervals according to a time sequence, wherein each behavior block comprises behavior data in the fixed time interval;
and respectively training by using the behavior data in each behavior block to obtain a corresponding behavior classifier.
3. The method according to claim 2, wherein the training is performed by using the behavior data in each behavior block to obtain a corresponding behavior classifier, which specifically comprises:
and respectively utilizing the behavior data in each behavior block, and using a single-class support vector machine OC-SVM to train to obtain a corresponding behavior classifier.
4. The method according to claim 3, wherein determining the abnormality score corresponding to the target user by using a pre-trained behavior classifier set based on the current behavior data sequence specifically includes:
respectively inputting each behavior data contained in the current behavior data sequence into each behavior classifier contained in the behavior classifier set;
and determining the average value of the scores output by each behavior classifier as the abnormal score corresponding to the behavior data.
5. The method according to any one of claims 1 to 4, wherein the portrait score corresponding to the target user is determined according to the transition probability from the first hidden state to the second hidden state and the anomaly score according to the following method:
Figure FDA0002424953990000021
wherein:
S represents the portrait score corresponding to the target user;
P_12 represents the transition probability from the first hidden state to the second hidden state;
m represents the number of behavior data contained in the current behavior data sequence;
S_i represents the abnormal score corresponding to the i-th behavior data contained in the current behavior data sequence.
6. A user portrait scoring device, comprising:
an acquisition unit, configured to acquire a current behavior data sequence of a target user, wherein the current behavior data sequence comprises a plurality of behavior data;
a first determining unit, configured to determine, based on the current behavior data sequence, an abnormal score corresponding to the target user by using a behavior classifier set obtained through pre-training, where the behavior classifier set includes a plurality of behavior classifiers, and each behavior classifier is obtained through training using the historical single-class behavior data sequence of the target user;
the searching unit is used for searching a corresponding first hidden state according to the observation state corresponding to the current behavior data sequence;
and the second determining unit is used for determining the portrait score corresponding to the target user according to the transition probability from the first hidden state to the second hidden state and the abnormal score.
7. The apparatus of claim 6, further comprising:
the training unit is used for dividing the historical single-class behavior data sequence of the target user into a plurality of behavior blocks at fixed time intervals according to a time sequence, and each behavior block comprises behavior data in the fixed time interval; and respectively training by using the behavior data in each behavior block to obtain a corresponding behavior classifier.
8. The apparatus of claim 7, wherein
the training unit is specifically configured to use the behavior data in each behavior block to train by using a single-class support vector machine OC-SVM to obtain a corresponding behavior classifier.
9. A computing device, the computing device comprising: memory, processor and computer program stored on the memory and executable on the processor, which computer program, when executed by the processor, carries out the steps of the method according to any one of claims 1 to 5.
10. A computer storage medium having stored thereon a computer program which, when executed by a processor, carries out the steps of the method according to any one of claims 1 to 5.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010217512.9A CN111709765A (en) 2020-03-25 2020-03-25 User portrait scoring method and device and storage medium

Publications (1)

Publication Number Publication Date
CN111709765A (en) 2020-09-25

Family

ID=72536253

Country Status (1)

Country Link
CN (1) CN111709765A (en)

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107402921A (en) * 2016-05-18 2017-11-28 阿里巴巴集团控股有限公司 Identify event-order serie data processing method, the apparatus and system of user behavior
CN106911668A (en) * 2017-01-10 2017-06-30 同济大学 A kind of identity identifying method and system based on personal behavior model
CN108305094A (en) * 2017-12-18 2018-07-20 北京三快在线科技有限公司 A kind of user's behavior prediction method and device, electronic equipment
CN108881194A (en) * 2018-06-07 2018-11-23 郑州信大先进技术研究院 Enterprises user anomaly detection method and device
CN109684543A (en) * 2018-12-14 2019-04-26 北京百度网讯科技有限公司 User's behavior prediction and information distribution method, device, server and storage medium
CN110852442A (en) * 2019-10-29 2020-02-28 支付宝(杭州)信息技术有限公司 Behavior identification and model training method and device

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112395331B (en) * 2020-11-17 2023-10-10 平安科技(深圳)有限公司 User portrait method, device, equipment and medium for credit card customer
CN112395331A (en) * 2020-11-17 2021-02-23 平安科技(深圳)有限公司 User portrayal method, apparatus, device and medium for credit card client
CN112488507A (en) * 2020-11-30 2021-03-12 广东电网有限责任公司 Expert classification portrait method and device based on clustering and storage medium
CN112651433A (en) * 2020-12-17 2021-04-13 广州锦行网络科技有限公司 Abnormal behavior analysis method for privileged account
CN112651433B (en) * 2020-12-17 2021-12-14 广州锦行网络科技有限公司 Abnormal behavior analysis method for privileged account
CN112995331B (en) * 2021-03-25 2022-11-22 绿盟科技集团股份有限公司 User behavior threat detection method and device and computing equipment
CN112995331A (en) * 2021-03-25 2021-06-18 绿盟科技集团股份有限公司 User behavior threat detection method and device and computing equipment
CN114153716B (en) * 2022-02-08 2022-05-06 中国电子科技集团公司第五十四研究所 Real-time portrait generation method for people and nobody objects under semantic information exchange network
CN114153716A (en) * 2022-02-08 2022-03-08 中国电子科技集团公司第五十四研究所 Real-time portrait generation method for people and nobody objects under semantic information exchange network
CN114500075A (en) * 2022-02-11 2022-05-13 中国电信股份有限公司 User abnormal behavior detection method and device, electronic equipment and storage medium
CN114500075B (en) * 2022-02-11 2023-11-07 中国电信股份有限公司 User abnormal behavior detection method and device, electronic equipment and storage medium
CN116150541A (en) * 2023-04-19 2023-05-23 中国信息通信研究院 Background system identification method, device, equipment and storage medium
CN116150541B (en) * 2023-04-19 2023-06-23 中国信息通信研究院 Background system identification method, device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20200925
