CN111178421A - Method, device, medium and electronic equipment for detecting user state - Google Patents

Info

Publication number
CN111178421A
Authority
CN
China
Prior art keywords
user
cluster
state
users
detected
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201911352487.9A
Other languages
Chinese (zh)
Other versions
CN111178421B (en)
Inventor
李嘉晨
郭凯
付东东
刘雷
刘洋
胡磊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beike Technology Co Ltd
Original Assignee
Beike Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beike Technology Co Ltd
Priority to CN201911352487.9A
Publication of CN111178421A
Application granted
Publication of CN111178421B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G06F18/232 Non-hierarchical techniques
    • G06F18/2321 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/211 Selection of the most significant subset of features

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

A method, apparatus, medium, and electronic device for detecting a user state are disclosed. The method comprises the following steps: acquiring service behavior characteristic information of a plurality of users to be detected according to service data of the users to be detected; clustering the service behavior characteristic information of each of the plurality of users to be detected to obtain at least one cluster; acquiring a cluster representative user of the cluster; determining the user state of the cluster representative user according to the service behavior characteristic information of the cluster representative user; and determining the user states of the users to be detected in the cluster in which the cluster representative user is located according to the user states of the cluster representative users. The technical scheme provided by the disclosure is beneficial to improving the accuracy of the user state detection result.

Description

Method, device, medium and electronic equipment for detecting user state
Technical Field
The present disclosure relates to network technologies, and in particular, to a method for detecting a user status, an apparatus for detecting a user status, a storage medium, and an electronic device.
Background
In some business fields, a business provider generally provides business services to users in the form of APP (application), website, and client. For a service provider, each user currently receiving its service is usually in a certain user state in a user life cycle.
Service providers often need to know the current user state of each user, so as to provide different services for different users.
How to accurately detect the user state where the user is currently located so as to provide proper service for the user is a technical problem of great concern.
Disclosure of Invention
The present disclosure is proposed to solve the above technical problems. The embodiment of the disclosure provides a method for detecting a user state, a device for detecting the user state, a storage medium and an electronic device.
According to an aspect of an embodiment of the present disclosure, there is provided a method of detecting a user status, the method including: acquiring service behavior characteristic information of a plurality of users to be detected according to service data of the users to be detected; clustering the service behavior characteristic information of each of the plurality of users to be detected to obtain at least one cluster; acquiring a cluster representative user of the cluster; determining the user state of the cluster representative user according to the service behavior characteristic information of the cluster representative user; and determining the user states of the users to be detected in the cluster in which the cluster representative user is located according to the user states of the cluster representative users.
In an embodiment of the present disclosure, the acquiring, according to service data of a plurality of users to be detected, service behavior feature information of the plurality of users to be detected includes: and acquiring at least one of service behavior characteristic information based on time invariant attributes, service behavior characteristic information based on service behavior times and/or service behavior resource consumption statistics and service behavior characteristic information based on time sequence variables of the plurality of users to be detected according to the service data of the plurality of users to be detected.
In another embodiment of the present disclosure, the acquiring a cluster representative user of the cluster includes: screening out, from the at least one cluster, clusters containing no more than a preset number of cluster nodes; and taking the user to be detected corresponding to the central node of each screened cluster as a cluster representative user.
In another embodiment of the present disclosure, the determining, according to the service behavior feature information of the cluster representative user, a user state of the cluster representative user includes: providing the service behavior characteristic information of the cluster representative user to a classifier for predicting the user state, as the input of the classifier; and determining the user state of the cluster representative user according to the classification prediction result output by the classifier.
In yet another embodiment of the present disclosure, the method further comprises: respectively acquiring the service data of the seed user in each user state; forming service behavior characteristic information of the seed user in each user state according to the service data of the seed user in each user state; generating a plurality of training samples according to the service behavior characteristic information of the seed user in each user state; training the classifier using the plurality of training samples; for any user state, the seed user of the user state is the user in the user state at a historical time.
In another embodiment of the present disclosure, before the step of obtaining the service data of the seed user in each user state, the method further includes: and determining users whose service data contain the state flag information or service data meeting the state condition in the service data of each user according to the preset state flag information or state condition corresponding to each user state, and taking the determined users as seed users of the corresponding user states.
In another embodiment of the present disclosure, the forming the service behavior feature information of the seed user in each user state according to the service data of the seed user in each user state includes: for any seed user in any user state, acquiring the service behavior characteristic information of the seed user according to the service data which is positioned before the time point of the state mark information in the service data of the seed user or the service data which is positioned before the time point meeting the state condition.
In another embodiment of the present disclosure, the generating a plurality of training samples according to the service behavior feature information of the seed user in each user state includes: for any state flag information or any state condition, setting a user state label for the service behavior feature information of the corresponding seed user according to the user state corresponding to the state flag information or the state condition, and generating a training sample.
In yet another embodiment of the present disclosure, the training the classifier using the plurality of training samples includes: respectively providing training samples corresponding to the user states to the classifier; adjusting model parameters of the classifier according to the difference between the classification prediction result output by the classifier aiming at each training sample and the user state label of the corresponding training sample; and the training samples corresponding to the user states provided for the classifier are the same in number.
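The sample preparation described in the preceding embodiments can be illustrated by a minimal sketch. It assumes that each seed user is a dictionary holding timestamped records, the time point of the state flag (`flag_time`), and the user state; the function name `build_balanced_samples`, the field names, and the `featurize` helper are hypothetical and do not come from the disclosure. Only records before the flag time point are used, each feature vector is labeled with the seed user's state, and every state contributes the same number of samples. Downsampling to the smallest state is one possible way to equalize the counts.

```python
import random

def build_balanced_samples(seed_users, featurize, rng=None):
    """Build (features, state_label) samples with an equal count per user state.

    seed_users: list of dicts with hypothetical keys
        'records'   -> list of (timestamp, value) tuples
        'flag_time' -> time point of the state flag or state condition
        'state'     -> user state the seed user was in
    featurize: callable turning a list of records into a feature tuple
    """
    rng = rng or random.Random(0)
    by_state = {}
    for user in seed_users:
        # Use only the service data located before the flag time point.
        history = [r for r in user["records"] if r[0] < user["flag_time"]]
        by_state.setdefault(user["state"], []).append(
            (featurize(history), user["state"])
        )
    # Equalize the number of samples per state (assumption: downsampling).
    n = min(len(samples) for samples in by_state.values())
    balanced = []
    for samples in by_state.values():
        balanced.extend(rng.sample(samples, n))
    return balanced

def count_and_sum(history):
    """Hypothetical featurizer: number of records and sum of their values."""
    return (len(history), sum(value for _, value in history))

seed_users = [
    {"records": [(1, 2.0), (5, 3.0), (9, 4.0)], "flag_time": 6, "state": "A"},
    {"records": [(2, 1.0)], "flag_time": 3, "state": "A"},
    {"records": [(1, 7.0), (4, 1.0)], "flag_time": 2, "state": "B"},
]
samples = build_balanced_samples(seed_users, count_and_sum)
```

The resulting `samples` list could then be passed to any classifier training routine; the disclosure does not fix a particular model.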
According to another aspect of the embodiments of the present disclosure, there is provided an apparatus for detecting a user status, the apparatus including: the information acquisition module is used for acquiring the service behavior characteristic information of each of a plurality of users to be detected according to the service data of the users to be detected; the cluster processing module is used for carrying out cluster processing on the service behavior characteristic information of each of the plurality of users to be detected to obtain at least one cluster; the acquisition representative user module is used for acquiring cluster representative users of the clusters; the first state determining module is used for determining the user state of the cluster representative user according to the service behavior characteristic information of the cluster representative user; and the second state determining module is used for determining the user states of the users to be detected in the cluster in which the cluster representative user is located according to the user states of the cluster representative users.
In an embodiment of the present disclosure, the information obtaining module is further configured to: and acquiring at least one of service behavior characteristic information based on time invariant attributes, service behavior characteristic information based on service behavior times and/or service behavior resource consumption statistics and service behavior characteristic information based on time sequence variables of the plurality of users to be detected according to the service data of the plurality of users to be detected.
In yet another embodiment of the present disclosure, the acquisition representative user module is further configured to: screen out, from the at least one cluster, clusters containing no more than a preset number of cluster nodes; and take the user to be detected corresponding to the central node of each screened cluster as a cluster representative user.
In yet another embodiment of the present disclosure, the first state determining module is further configured to: provide the service behavior characteristic information of the cluster representative user to a classifier for predicting the user state, as the input of the classifier; and determine the user state of the cluster representative user according to the classification prediction result output by the classifier.
In yet another embodiment of the present disclosure, the apparatus further includes: a training module to: respectively acquiring the service data of the seed user in each user state; forming service behavior characteristic information of the seed user in each user state according to the service data of the seed user in each user state; generating a plurality of training samples according to the service behavior characteristic information of the seed user in each user state; training the classifier using the plurality of training samples; for any user state, the seed user of the user state is the user in the user state at a historical time.
In yet another embodiment of the present disclosure, the apparatus further includes: and the seed user determining module is used for determining the users of which the service data of each user contains the state flag information or the service data meeting the state condition according to the preset state flag information or the state condition corresponding to each user state, and taking the determined users as seed users of the corresponding user states.
In yet another embodiment of the present disclosure, the training module is further configured to: for any seed user in any user state, acquiring the service behavior characteristic information of the seed user according to the service data which is positioned before the time point of the state mark information in the service data of the seed user or the service data which is positioned before the time point meeting the state condition.
In yet another embodiment of the present disclosure, the training module is further configured to: for any state flag information or any state condition, set a user state label for the service behavior feature information of the corresponding seed user according to the user state corresponding to the state flag information or the state condition, and generate a training sample.
In yet another embodiment of the present disclosure, the training module is further configured to: respectively providing training samples corresponding to the user states to the classifier; adjusting model parameters of the classifier according to the difference between the classification prediction result output by the classifier aiming at each training sample and the user state label of the corresponding training sample; and the training samples corresponding to the user states provided for the classifier are the same in number.
According to still another aspect of the embodiments of the present disclosure, there is provided a computer-readable storage medium storing a computer program for executing the above method of detecting a user status.
According to still another aspect of the embodiments of the present disclosure, there is provided an electronic apparatus including: a processor; a memory for storing the processor-executable instructions; the processor is used for reading the executable instruction from the memory and executing the instruction to realize the method for detecting the user state.
Based on the method and the device for detecting the user state provided by the embodiments of the present disclosure, clustering the service behavior feature information of each of the users to be detected gathers together users to be detected whose service behavior feature information differs only slightly (for example, by a small difference in a number of days). By using the user state of the cluster representative user of a cluster obtained by the clustering process as the user state of each user to be detected in that cluster, the present disclosure avoids the phenomenon that two users to be detected belonging to the same cluster are assigned different user states because of a slight difference in their service behavior feature information, so that the technical solution of the present disclosure is resistant to the influence of slight changes in the service behavior feature information on the user state. Therefore, the technical solution provided by the present disclosure is beneficial to improving the accuracy of the user state detection result.
The technical solution of the present disclosure is further described in detail by the accompanying drawings and examples.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments of the disclosure and together with the description, serve to explain the principles of the disclosure.
The present disclosure may be more clearly understood from the following detailed description, taken with reference to the accompanying drawings, in which:
FIG. 1 is a schematic diagram of one embodiment of a suitable scenario for use with the present disclosure;
FIG. 2 is a flow chart of one embodiment of a method of detecting a user state of the present disclosure;
FIG. 3 is a schematic diagram of one embodiment of a cluster obtained by the clustering process of the present disclosure;
FIG. 4 is a flow diagram of one embodiment of training a classifier of the present disclosure;
FIG. 5 is a schematic diagram illustrating an embodiment of an apparatus for detecting a user status according to the present disclosure;
FIG. 6 is a block diagram of an electronic device according to an exemplary embodiment of the present disclosure.
Detailed Description
Example embodiments according to the present disclosure will be described in detail below with reference to the accompanying drawings. It is to be understood that the described embodiments are merely a subset of the embodiments of the present disclosure and not all embodiments of the present disclosure, with the understanding that the present disclosure is not limited to the example embodiments described herein.
It should be noted that: the relative arrangement of the components and steps, the numerical expressions, and numerical values set forth in these embodiments do not limit the scope of the present disclosure unless specifically stated otherwise.
It will be understood by those of skill in the art that the terms "first," "second," and the like in the embodiments of the present disclosure are used merely to distinguish one element from another, and are not intended to imply any particular technical meaning or any necessary logical order between them.
It is also understood that in embodiments of the present disclosure, "a plurality" may refer to two or more than two and "at least one" may refer to one, two or more than two.
It is also to be understood that any reference to any component, data, or structure in the embodiments of the disclosure, may be generally understood as one or more, unless explicitly defined otherwise or stated otherwise.
In addition, the term "and/or" in the present disclosure merely describes an association relationship between associated objects and indicates that three relationships may exist. For example, A and/or B may mean: A exists alone, A and B exist simultaneously, or B exists alone. In addition, the character "/" in the present disclosure generally indicates that the former and latter associated objects are in an "or" relationship.
It should also be understood that the description of the various embodiments of the present disclosure emphasizes the differences between the various embodiments, and the same or similar parts may be referred to each other, so that the descriptions thereof are omitted for brevity.
Meanwhile, it should be understood that the sizes of the respective portions shown in the drawings are not drawn in an actual proportional relationship for the convenience of description.
The following description of at least one exemplary embodiment is merely illustrative in nature and is in no way intended to limit the disclosure, its application, or uses.
Techniques, methods, and apparatus known to those of ordinary skill in the relevant art may not be discussed in detail but are intended to be part of the specification where appropriate.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, further discussion thereof is not required in subsequent figures.
Embodiments of the present disclosure may be implemented in electronic devices such as terminal devices, computer systems, servers, etc., which are operational with numerous other general purpose or special purpose computing system environments or configurations. Examples of well known terminal devices, computing systems, environments, and/or configurations that may be suitable for use with an electronic device, such as a terminal device, computer system, or server, include, but are not limited to: personal computer systems, server computer systems, thin clients, thick clients, hand-held or laptop devices, microprocessor-based systems, set top boxes, programmable consumer electronics, network pcs, minicomputer systems, mainframe computer systems, distributed cloud computing environments that include any of the above, and the like.
Electronic devices such as terminal devices, computer systems, servers, etc. may be described in the general context of computer system-executable instructions, such as program modules, being executed by a computer system. Generally, program modules may include routines, programs, objects, components, logic, data structures, etc. that perform particular tasks or implement particular abstract data types. The computer system/server may be implemented in a distributed cloud computing environment. In a distributed cloud computing environment, tasks may be performed by remote processing devices that are linked through a communications network. In a distributed cloud computing environment, program modules may be located in both local and remote computer system storage media including memory storage devices.
Summary of the disclosure
In the process of implementing the present disclosure, the inventors found that, when detecting the user state, two different users with similar service behavior characteristic information may nevertheless be detected as being in different user states. That is, a small change in the service behavior characteristic information (for example, a small change in a number of days) may cause the detected user state to change. This phenomenon adversely affects the accuracy of user state detection.
In the process of detecting the user state, if the phenomenon of user state change caused by small change of the service behavior characteristic information can be avoided, the accuracy of the user state detection result is improved, and therefore the service provider is enabled to provide more appropriate service for the user.
Brief description of the drawings
FIG. 1 shows an example in which the technology for detecting the user state provided by the present disclosure is applied to the real estate field.
In FIG. 1, a house renting and selling service provider provides a house renting and selling service to each user through a server 102. Assume that an APP for house rental is installed in the smart mobile phone 100 of a user 101. When the user 101 has a house renting or selling demand, the user opens the APP in the smart mobile phone 100, and the server 102 of the house renting and selling service provider pushes the main page of the APP to the smart mobile phone 100. The server 102 can then push corresponding page information to the smart mobile phone 100 according to operations of the user 101 such as clicking or sliding on the screen. The server 102 of the service provider can obtain the service behavior feature information of the user 101 according to the historical operation log of the APP used by the user 101, and determine the current user state of the user 101 by using that service behavior feature information, so that a more accurate house renting and selling service is provided for the user 101 and houses meeting the needs of the user 101 can ultimately be recommended.
Exemplary method
FIG. 2 is a flowchart illustrating an embodiment of a method for detecting a user status according to the present disclosure. The method of the embodiment shown in FIG. 2 comprises the steps of: S200, S201, S202, S203, and S204. The following describes each step.
S200, acquiring the service behavior characteristic information of a plurality of users to be detected according to the service data of the users to be detected.
The user to be detected in the present disclosure may refer to a user on whom user state detection needs to be performed. The service data of the user to be detected in the present disclosure may include: information formed based on various actions performed by the user for a service provided by a service provider. The various actions performed by the user may include: an online action or an offline action. An online action may refer to an action (e.g., browsing, leaving a message, or logging in) formed based on a network access operation of the user. An offline action may refer to a physical action of the user (e.g., an on-site visit to a house, a telephone consultation, or an in-store consultation). The offline actions of the user can be entered into the corresponding system by the staff of the service provider (such as a real estate agent) so as to form the service data of the user.
In one example, the service data of a user to be detected may include: all records in the service provider's log that are relevant to the user to be detected. The service behavior feature information of the user to be detected in the present disclosure may refer to information for describing a behavior of the user to be detected, which is related to a service provided by a service provider.
The present disclosure may acquire, from the data warehouse of the service provider, the service data of users who performed corresponding actions within the most recent time period (such as the most recent 30 days); the acquired service data is the service data of the users to be detected. The present disclosure may also obtain the service data of the users to be detected in other ways. For example, the service data of each user to be detected may be obtained from the data warehouse of the service provider by retrieval (e.g., retrieval using an identifier of the user to be detected).
For any user to be detected, the present disclosure may obtain the service behavior characteristic information of the user to be detected by looking up the corresponding fields in the service data of the user to be detected and performing calculation, statistics, and summarization (such as date calculation, frequency statistics, and flow summarization) on the fields found. The specific content of the service behavior feature information of the user to be detected can be set according to actual service conditions.
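As a minimal sketch of this kind of field lookup and summarization, the following Python function turns one user's records into a small feature dictionary. The record layout (`day`, `action`, `duration_s`), the feature names, and the sample data are assumptions made for illustration; the disclosure leaves the concrete features to the actual service conditions.

```python
from datetime import date

def extract_features(records, today):
    """Summarize one user's service records into behavior features.

    records: list of dicts with hypothetical keys 'day' (datetime.date),
             'action' (str) and 'duration_s' (int).
    """
    if not records:
        return {"days_since_last_action": None,
                "action_count": 0,
                "total_duration_s": 0}
    last_day = max(r["day"] for r in records)
    return {
        # Date calculation: days elapsed since the most recent action.
        "days_since_last_action": (today - last_day).days,
        # Frequency statistics: number of recorded actions.
        "action_count": len(records),
        # Summarization: total of a numeric field across all records.
        "total_duration_s": sum(r["duration_s"] for r in records),
    }

records = [
    {"day": date(2019, 12, 1), "action": "browse", "duration_s": 120},
    {"day": date(2019, 12, 10), "action": "message", "duration_s": 30},
]
features = extract_features(records, today=date(2019, 12, 20))
```

Here `features` holds a 10-day gap since the last action, an action count of 2, and a total duration of 150 seconds.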
S201, clustering the service behavior characteristic information of a plurality of users to be detected to obtain at least one cluster.
The present disclosure may adopt a density-based clustering algorithm to cluster the service behavior characteristic information of a plurality of users to be detected to obtain at least one cluster. For example, the present disclosure may employ the DBSCAN (Density-Based Spatial Clustering of Applications with Noise) algorithm to cluster the service behavior feature information of a plurality of users to be detected, so as to obtain one or more clusters. The distances between all nodes in each cluster and the cluster center node are usually smaller than a preset distance threshold. For example, the present disclosure may perform the clustering operation according to preset hyperparameters of the clustering algorithm, such as the minimum number of nodes in a cluster and a distance threshold, so as to form clusters in which the distances between all nodes and the cluster center node are smaller than the preset distance threshold. The distance threshold among the hyperparameters can be set smaller, and the minimum number of nodes in a cluster can be set larger, so that each cluster obtained by clustering is a high-density cluster whose nodes are compactly distributed.
One cluster obtained by the clustering process of the present disclosure is shown in FIG. 3. The nodes within circle 300 in FIG. 3 form a cluster, and the black node 301 is the cluster center node.
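For illustration only, a compact pure-Python version of DBSCAN is sketched below. The parameter names `eps` and `min_pts` stand for the distance threshold and the minimum number of nodes in a cluster; these names, the Euclidean distance, and the sample points are assumptions, and a production system would more likely rely on an existing library implementation.

```python
import math

def dbscan(points, eps, min_pts):
    """Return one label per point: a cluster index >= 0, or -1 for noise."""
    UNVISITED, NOISE = None, -1
    labels = [UNVISITED] * len(points)

    def neighbors(i):
        # All points within eps of point i (the point itself included).
        return [j for j in range(len(points))
                if math.dist(points[i], points[j]) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE          # may later become a border point
            continue
        cluster += 1                   # point i is a core point: new cluster
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster    # border point joins the cluster
            if labels[j] is not UNVISITED:
                continue               # already assigned, do not expand
            labels[j] = cluster
            nbrs = neighbors(j)
            if len(nbrs) >= min_pts:   # j is also a core point: keep growing
                queue.extend(nbrs)
    return labels

points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10),
          (50, 50)]
labels = dbscan(points, eps=1.5, min_pts=3)
```

With these sample points the first four nodes form one cluster, the next three form a second cluster, and the isolated last node is treated as noise. Lowering `eps` or raising `min_pts` yields fewer but denser clusters, in line with the hyperparameter guidance above.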
S202, acquiring a cluster representative user of the cluster.
Each node in a cluster in the present disclosure corresponds to a user, and different nodes correspond to different users. The cluster representative user of a cluster must correspond to a node in the cluster, and should correspond to a node located as close to the center of the cluster as possible. For example, the present disclosure may take the user corresponding to the cluster center node of a cluster as the cluster representative user of the cluster. For another example, the present disclosure may take the user corresponding to the node closest to the cluster center node in a cluster as the cluster representative user of the cluster.
In the general case, one cluster has one cluster representative user. Of course, the present disclosure does not exclude the case where one cluster has two, three, or more cluster representative users. In addition, the present disclosure may record the correspondence among the user identifier of the cluster representative user, the cluster identifier of the cluster in which the cluster representative user is located, and the user identifiers of the users to be detected corresponding to the other nodes of that cluster, so that in subsequent steps the user state of each corresponding user to be detected can be set according to the user state of the corresponding cluster representative user.
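One possible way to pick a representative and record the correspondence is sketched below. It assumes that the cluster center is the arithmetic mean of the members' feature vectors and that the representative is the member nearest to it; the function names, the dictionary layout, and the sample users are hypothetical.

```python
import math

def pick_representative(cluster_points):
    """Return the user id whose features lie closest to the cluster centroid.

    cluster_points: dict mapping user id -> feature tuple.
    """
    n = len(cluster_points)
    dims = len(next(iter(cluster_points.values())))
    # Assumption: the cluster center is the mean of all member features.
    centroid = tuple(
        sum(p[d] for p in cluster_points.values()) / n for d in range(dims)
    )
    return min(cluster_points,
               key=lambda uid: math.dist(cluster_points[uid], centroid))

def build_cluster_index(user_features, labels):
    """Map cluster id -> representative user id and the other member ids.

    labels: dict mapping user id -> cluster id, with -1 meaning noise.
    """
    clusters = {}
    for uid, cid in labels.items():
        if cid >= 0:
            clusters.setdefault(cid, {})[uid] = user_features[uid]
    index = {}
    for cid, members in clusters.items():
        rep = pick_representative(members)
        index[cid] = {
            "representative": rep,
            "members": sorted(u for u in members if u != rep),
        }
    return index

user_features = {"u1": (0.0, 0.0), "u2": (2.0, 0.0),
                 "u3": (1.0, 0.1), "u4": (9.0, 9.0)}
labels = {"u1": 0, "u2": 0, "u3": 0, "u4": -1}
index = build_cluster_index(user_features, labels)
```

In this example `u3` sits nearest to the centroid of cluster 0 and becomes its representative, while `u1` and `u2` are recorded as the remaining members.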
S203, determining the user state of the cluster representative user according to the service behavior characteristic information of the cluster representative user.
The present disclosure may set the user life cycle according to actual service conditions, and the user life cycle may be a non-closed-loop user life cycle. A user life cycle comprises a plurality of user states, i.e., a plurality of user states form a user life cycle.
Each user state in the present disclosure has corresponding features. When the features of the service behavior feature information of a user are similar to the features corresponding to a certain user state, the user can be considered to be in that user state. The present disclosure may determine the user state of the cluster representative user by performing processing such as classification detection on the service behavior characteristic information of the cluster representative user.
In the case where a cluster has two, three, or more cluster representative users, the present disclosure may randomly select one cluster representative user from them and determine the user state by using the service behavior characteristic information of the randomly selected cluster representative user. Alternatively, the user state of each cluster representative user may be determined according to the service behavior characteristic information of that cluster representative user.
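The idea that a user is in the state whose features its own features most resemble can be sketched with a nearest-prototype rule. This is only a stand-in for the classification detection mentioned above, not the classifier of the disclosure; the state names, prototype vectors, and function name are hypothetical.

```python
import math

def classify_state(features, state_prototypes):
    """Return (state, distance) for the prototype nearest to the features.

    state_prototypes: dict mapping a user state name to a feature tuple
    that is assumed to be typical of that state.
    """
    state = min(state_prototypes,
                key=lambda s: math.dist(features, state_prototypes[s]))
    return state, math.dist(features, state_prototypes[state])

# Hypothetical user states with made-up prototype feature vectors.
prototypes = {
    "browsing": (1.0, 0.0),
    "negotiating": (5.0, 3.0),
    "closed": (9.0, 8.0),
}
state, distance = classify_state((4.0, 3.0), prototypes)
```

For the sample vector the nearest prototype is `negotiating`, at a distance of 1.0. A trained classifier could replace `classify_state` without changing the surrounding steps.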
S204, determining the user states of the users to be detected in the cluster in which the cluster representative user is located according to the user states of the cluster representative users.
In the present disclosure, a cluster representative user corresponds to only one cluster, and the users corresponding to all nodes in a cluster include cluster representative users and non-cluster-representative users, both of which belong to the users to be detected. All users corresponding to all nodes in a cluster in the present disclosure typically have one and the same user state.
In the case where a cluster has one cluster representative user, the present disclosure may use the user state of the cluster representative user in the cluster as the user state of all non-cluster-representative users in the cluster.
In the case where one cluster has two, three, or more cluster representative users, the present disclosure may compare the reliabilities of the user states of the plurality of cluster representative users, and use the user state with the highest reliability as the user state of all users to be detected corresponding to all nodes in the cluster where the cluster representative users are located. For example, assuming that one cluster has three cluster representative users, the present disclosure determines, according to the service behavior feature information of the three cluster representative users, that the user state of the first cluster representative user is user state 1 with a reliability of 0.7, that the user state of the second cluster representative user is user state 2 with a reliability of 0.65, and that the user state of the third cluster representative user is user state 3 with a reliability of 0.8; in this case, the present disclosure may use user state 3 as the user state of all users to be detected corresponding to all nodes in the cluster.
In the case where one cluster has two, three, or more cluster representative users, the present disclosure may also use the user state shared by the majority of the cluster representative users as the user state of all users to be detected corresponding to all nodes in the cluster where the cluster representative users are located. For example, assuming that one cluster has three cluster representative users, that the user state of two of them is user state 1, and that the user state of the third is user state 2, then, since user state 1 is the user state of the majority, the present disclosure may use user state 1 as the user state of all users to be detected corresponding to all nodes in the cluster.
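The two selection rules described above (highest reliability and majority) can be illustrated with a minimal sketch. The function names and the data layout (a list of state/reliability pairs) are assumptions made for illustration and are not part of the disclosure; the numeric values are taken from the examples in the text.

```python
from collections import Counter


def state_by_highest_reliability(predictions):
    """Pick the user state with the highest reliability.

    predictions: list of (user_state, reliability) pairs, one pair per
    cluster representative user (hypothetical layout).
    """
    state, _reliability = max(predictions, key=lambda pair: pair[1])
    return state


def state_by_majority(states):
    """Pick the user state shared by most cluster representative users."""
    return Counter(states).most_common(1)[0][0]


# Values from the examples in the text.
representatives = [
    ("user state 1", 0.7),
    ("user state 2", 0.65),
    ("user state 3", 0.8),
]
chosen_by_reliability = state_by_highest_reliability(representatives)
chosen_by_majority = state_by_majority(
    ["user state 1", "user state 1", "user state 2"]
)
```

How ties are broken is not specified in the text; in this sketch the first candidate encountered wins.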
In the case where the correspondence among the user identifier of the cluster representative user, the cluster identifier of the cluster where the cluster representative user is located, and the user identifiers of the users to be detected corresponding to the other nodes of that cluster has been recorded, the present disclosure may obtain the user identifiers of the users to be detected corresponding to the other nodes in the corresponding cluster according to the user identifier of the cluster representative user and the cluster identifier, and set the user state for each of these users to be detected. Meanwhile, the present disclosure may record the correspondence among the user identifier of the cluster representative user, the cluster identifier of the cluster where the cluster representative user is located, the user identifiers of the users to be detected corresponding to the other nodes of that cluster, and the user state.
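The recorded correspondence and the subsequent state assignment can be sketched as a simple lookup. The record layout (`representative`, `members`) and the function name are hypothetical and chosen only for illustration.

```python
def propagate_states(cluster_records, representative_states):
    """Assign each cluster's representative state to every user in the cluster.

    cluster_records: {cluster_id: {"representative": user_id,
                                   "members": [user_id, ...]}}
        (hypothetical layout of the recorded correspondence)
    representative_states: {user_id: user_state} for representative users.
    Returns {user_id: user_state} for all users to be detected.
    """
    user_states = {}
    for record in cluster_records.values():
        representative = record["representative"]
        state = representative_states[representative]
        user_states[representative] = state
        for user_id in record["members"]:
            user_states[user_id] = state
    return user_states


records = {
    "cluster-1": {"representative": "u1", "members": ["u2", "u3"]},
    "cluster-2": {"representative": "u7", "members": ["u8"]},
}
assigned = propagate_states(
    records, {"u1": "silent period", "u7": "online active period"}
)
```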
By clustering the service behavior feature information of the users to be detected, the present disclosure can gather together the users to be detected whose service behavior feature information differs only slightly (for example, by a slight difference in a number of days). By using the user state of the cluster representative user of a cluster obtained by the clustering processing as the user state of each user to be detected in that cluster, the present disclosure can avoid the phenomenon that two users to be detected belonging to the same cluster are given different user states because of a slight difference in their service behavior feature information, so that the technical solution of the present disclosure is resistant to the influence of slight changes in service behavior feature information on user states. Therefore, the technical solution provided by the present disclosure helps improve the accuracy of the user state detection result.
In an optional example, the present disclosure may obtain one or more of the service behavior feature information based on the time-invariant attribute, the service behavior feature information based on the service behavior frequency and/or the service behavior resource consumption statistic, and the service behavior feature information based on the time-series variable of the user to be detected by performing processing such as calculation, statistical summarization (e.g., date calculation, frequency statistics, traffic summarization, etc.) on corresponding fields in the service data of the user to be detected. The content specifically included in the service behavior feature information of the user to be detected may be set according to an actual service situation, and the above is only an example.
Optionally, the service behavior feature information of the user to be detected based on time-invariant attributes may include: basic information about the user-side software and hardware on which the user to be detected, within a time period, performs the various operations that generate service data, where this basic software and hardware information does not change over time within the time period. For example, the service behavior feature information based on time-invariant attributes may include: the type of terminal device used by the user to be detected for network access (such as the model of a smartphone), the source channel of the application used by the user to be detected (such as the download source of the user's APP), and the like.
Optionally, the service behavior feature information of the user to be detected based on service behavior frequency and/or service behavior resource consumption statistics may refer to: the results obtained by separately performing statistical processing, according to the actual requirements of the specific service, on a plurality of fields related to service behavior actions in the service data of the user to be detected. For example, for the real estate field, the service behavior feature information of the user to be detected based on service behavior frequency and/or service behavior resource consumption statistics may include: the PV (Page View) count of the user, the page browsing duration of the user, the number of house viewings by the user in the last N days, the number of delegations by the user in the last N days, the number of business opportunities generated by the user in the last N days, and the like.
Optionally, the service behavior feature information of the user to be detected based on time-series variables may refer to: information reflecting, along the time axis, when the user to be detected performed the corresponding actions. For example, for the real estate field, the service behavior feature information of the user to be detected based on time-series variables may include: the time of the user's last network/on-site access, the time of the user's first network/on-site access, the time of the user's first house viewing, and the like.
By processing the service data of the user to be detected accordingly to obtain the service behavior feature information based on time-invariant attributes, the service behavior feature information based on service behavior frequency and/or service behavior resource consumption statistics, and the service behavior feature information based on time-series variables, the present disclosure can reflect the service behavior characteristics of the user to be detected from multiple angles, which helps improve the accuracy of the clustering processing.
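As a minimal sketch of how the three kinds of service behavior feature information might be derived from raw service data, the following example computes a few features per user. All field names (`device_model`, `app_channel`, `events`) and event types are hypothetical; a real data warehouse schema would differ.

```python
from datetime import date, timedelta


def build_features(record, today, n_days=30):
    """Derive illustrative features from one user's service data.

    record: {"device_model": str, "app_channel": str,
             "events": [(date, event_type), ...]}  (hypothetical schema)
    """
    events = record["events"]
    window_start = today - timedelta(days=n_days)
    recent = [(day, kind) for day, kind in events if day >= window_start]
    visit_days = [day for day, kind in events if kind == "page_view"]
    return {
        # Time-invariant attributes.
        "device_model": record["device_model"],
        "app_channel": record["app_channel"],
        # Frequency statistics.
        "pv_count": len(visit_days),
        "viewings_last_n_days": sum(
            1 for _day, kind in recent if kind == "house_viewing"
        ),
        # Time-series variables, expressed as day offsets from today.
        "days_since_last_visit": (
            (today - max(visit_days)).days if visit_days else None
        ),
        "days_since_first_visit": (
            (today - min(visit_days)).days if visit_days else None
        ),
    }


sample_record = {
    "device_model": "phone-model-x",
    "app_channel": "app-store-a",
    "events": [
        (date(2019, 10, 1), "house_viewing"),
        (date(2019, 12, 1), "page_view"),
        (date(2019, 12, 20), "page_view"),
        (date(2019, 12, 21), "house_viewing"),
    ],
}
features = build_features(sample_record, today=date(2019, 12, 25))
```

Expressing the time-series variables as day offsets is one possible encoding chosen here so that the values can be fed directly to a clustering step; the text does not prescribe an encoding.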
In an alternative example, the present disclosure may screen all clusters obtained by the clustering processing, and select cluster representative users only from the clusters that meet the screening requirement. The screening requirement in the present disclosure may be a requirement set on the number of cluster nodes contained in a cluster.
Optionally, the present disclosure may screen out, from all clusters obtained by the clustering processing, the clusters whose number of cluster nodes does not exceed a predetermined number (e.g., 100); the screened clusters may be referred to as high-density small clusters. Then, the user to be detected corresponding to the center node of a screened high-density small cluster may be used as the cluster representative user. Of course, the present disclosure may also use the users to be detected corresponding to the center node and to the n (for example, n is 1 or 2) nodes closest to the center node in the screened high-density small cluster as cluster representative users. In addition, the present disclosure may use the users to be detected corresponding to the n (for example, n is 1 or 2) nodes closest to the center node in the screened high-density small cluster as cluster representative users. For clusters containing more than the predetermined number of cluster nodes, the present disclosure may skip the operation of selecting a cluster representative user.
By limiting the number of cluster nodes contained in a cluster, the present disclosure makes the service behavior feature information of all users to be detected belonging to the same cluster differ only slightly, and avoids the phenomenon that, when a cluster contains too many cluster nodes, the differences in service behavior feature information among its users to be detected are no longer slight, which helps improve the accuracy of the user state detection result.
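The screening and selection steps above can be sketched as follows. This sketch assumes that the "center node" is the node nearest to the arithmetic mean of the cluster's feature vectors, which the text does not define; the function names and data layout are likewise hypothetical.

```python
import math


def centroid(vectors):
    """Arithmetic mean of equally sized feature vectors."""
    dims = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dims)]


def select_representatives(clusters, max_nodes=100, n_extra=0):
    """Select cluster representative users from small clusters only.

    clusters: {cluster_id: {user_id: feature_vector}}  (hypothetical layout)
    max_nodes: predetermined number of cluster nodes (100 in the text).
    n_extra: how many nodes nearest to the center node are also selected.
    """
    representatives = {}
    for cluster_id, members in clusters.items():
        if len(members) > max_nodes:
            continue  # clusters above the limit get no representative
        center = centroid(list(members.values()))
        ranked = sorted(
            members, key=lambda user_id: math.dist(members[user_id], center)
        )
        representatives[cluster_id] = ranked[: 1 + n_extra]
    return representatives


example_clusters = {
    "small": {"u1": [0.0, 0.0], "u2": [1.0, 0.0], "u3": [2.0, 0.0]},
    "large": {f"v{i}": [float(i), float(i)] for i in range(5)},
}
selected = select_representatives(example_clusters, max_nodes=3)
```

With `max_nodes=3`, the five-node cluster is skipped and only the three-node cluster receives a representative.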
In an optional example, the present disclosure may use a preset classifier to determine the user state of a cluster representative user. Specifically, the present disclosure may provide the service behavior feature information of the cluster representative user as input to a classifier for predicting user states, so that the classifier performs state classification prediction processing on the service behavior feature information of the cluster representative user and outputs a classification prediction result; the present disclosure may then determine the user state of the cluster representative user according to the classification prediction result output by the classifier.
Optionally, the classifier of the present disclosure may be a multi-class classifier based on machine learning. For example, the classifier may be an XGBoost-based multi-class classifier. The types of user states that the classifier of the present disclosure can predict are generally related to the training of the classifier, for which reference may be made to the following description with respect to fig. 4.
Optionally, the classification prediction result output by the classifier of the present disclosure may be the reliability (i.e., confidence) for each of all user states. For example, assuming that the preset user states comprise three user states, namely user state 1, user state 2, and user state 3, the classifier outputs three reliabilities for the service behavior feature information of an input cluster representative user, where the first reliability represents the probability that the cluster representative user is in user state 1, the second reliability represents the probability that the cluster representative user is in user state 2, and the third reliability represents the probability that the cluster representative user is in user state 3. The main goal of training the classifier in the present disclosure is to optimize the reliabilities output by the classifier.
By performing user state prediction processing on the cluster representative user through the classifier, the present disclosure can conveniently and accurately obtain the user state of the cluster representative user.
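The step from the classifier's per-state output to a single user state can be sketched as below. The classifier is represented by a stand-in function returning fixed values; it is not the XGBoost-based classifier mentioned in the text, and the state list and function names are assumptions for illustration.

```python
USER_STATES = ["user state 1", "user state 2", "user state 3"]


def predict_user_state(classifier, features):
    """Return the most reliable user state and its reliability.

    classifier: any callable mapping a feature dict to one reliability
    per entry of USER_STATES (hypothetical interface).
    """
    reliabilities = classifier(features)
    if len(reliabilities) != len(USER_STATES):
        raise ValueError("classifier must output one reliability per state")
    best = max(range(len(USER_STATES)), key=lambda i: reliabilities[i])
    return USER_STATES[best], reliabilities[best]


def stub_classifier(features):
    """Stand-in for a trained multi-class classifier (fixed output)."""
    return [0.1, 0.2, 0.7]


state, reliability = predict_user_state(stub_classifier, {"pv_count": 12})
```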
In an alternative example, one example of training the classifier according to the present disclosure is illustrated in fig. 4.
In fig. 4, S400, obtaining service data of seed users in each user state.
Optionally, in the case where the service data of a user indicates that the user was once in a user state, the present disclosure may use the user as a seed user of that user state. That is, for any user state, a seed user of that user state is a user who was in that user state at a historical time.
Optionally, the present disclosure may determine the seed users in each user state by using preset state flag information or state conditions corresponding to each user state. For example, for any user, the present disclosure may determine whether the service data of the user includes state flag information, or whether related information in the service data of the user satisfies a state condition; if the service data of the user includes the state flag information of a user state, or related information in the service data of the user satisfies a state condition, the present disclosure may use the user as a seed user of the user state corresponding to that state flag information or state condition. The state flag information may be a specific value of a field in the service data. The state condition may be that the actual value of a corresponding field (e.g., a time field) in the service data satisfies a corresponding condition (e.g., a time condition). The state flag information and the state conditions should be set according to the actual service situation, which is not limited by the present disclosure.
Optionally, for the real estate field, it is assumed that all user states in the present disclosure include the following seven user states: an online lead-in period, a silent period, an online active period, an online maturity period, an offline lead-in period, an offline active period, and an offline maturity period. If the service data of a user includes information indicating that the user accessed the house renting and selling service for the first time through an APP or another tool within a preset time range (for example, 180 days), the user can be used as a seed user of the online lead-in period. If the service data of a user indicates that the time (for example, the number of days) from the user's last access to the house renting and selling service through an APP or another tool to the current time exceeds a preset time interval (for example, 14 days), the user can be used as a seed user of the silent period. If the service data of a user indicates that the user performed a service behavior (e.g., accessing the house renting and selling service through a tool such as an APP, or visiting on site) within a predetermined time range (e.g., the previous week) before the current time, the user can be used as a seed user of the online active period. A user may jump from the online lead-in period to the online active period. If the service data of a user includes information indicating that an online delegation behavior occurred for the user for the first time (e.g., the user's online behavior makes a real estate broker the user's exclusive service provider), the user can become a seed user of the online maturity period. A user may jump from the online active period to the online maturity period. If the service data of a user includes information indicating that an offline delegation behavior occurred for the user for the first time (e.g., a real estate broker registers the user as his or her own client), the user can become a seed user of the offline lead-in period. A user may jump from the online active period to the offline lead-in period.
If the service data of a user includes information indicating that an on-site house viewing behavior occurred for the user for the first time, the user can become a seed user of the offline active period. A user may jump from the online maturity period or the offline lead-in period to the offline active period. If the service data of a user includes information indicating a house renting or selling transaction behavior, the user can become a seed user of the offline maturity period. A user may jump from the offline active period to the offline maturity period. In addition, a user may jump from the silent period to the online active period or the offline lead-in period. The foregoing seven user states, the seed users in the seven user states, and the jumps among the seven user states are only examples; the user states, the seed users in each user state, and the jumps among the user states may be set according to the actual service situation, which is not limited by the present disclosure.
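The seed-user rules above amount to checking state conditions against fields of a user's service data. The following minimal sketch covers four of the seven example states; the field names are hypothetical, and the thresholds (180 days, 14 days) are the example values from the text.

```python
from datetime import date


def seed_states(record, today, lead_in_days=180, silent_days=14):
    """Return the user states for which this user can serve as a seed user.

    record: dict with optional date fields (hypothetical names):
        first_online_access, last_online_access,
        first_house_viewing, first_transaction
    """
    states = []
    first_access = record.get("first_online_access")
    last_access = record.get("last_online_access")
    if first_access and (today - first_access).days <= lead_in_days:
        states.append("online lead-in period")
    if last_access and (today - last_access).days > silent_days:
        states.append("silent period")
    if record.get("first_house_viewing"):
        states.append("offline active period")
    if record.get("first_transaction"):
        states.append("offline maturity period")
    return states


user_record = {
    "first_online_access": date(2019, 9, 1),
    "last_online_access": date(2019, 12, 1),
}
matched = seed_states(user_record, today=date(2019, 12, 25))
```

A single user may match several conditions, which is consistent with a user having been in several user states over time.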
Optionally, the present disclosure may obtain the service data of a plurality of users from a data warehouse of a service provider, and determine, by using the state flag information or state condition of each user state, whether each user can become a seed user of the corresponding user state, so as to obtain the service data of the various seed users. By setting seed users for the user states, the present disclosure helps form training samples quickly and accurately.
In addition, it should be noted that the service data acquisition time windows of the seed users in the respective user states are usually different. For example, within acquisition time windows of the same duration, some user states have many seed users while others (such as the offline maturity period) have few, which results in fewer training samples for some user states; if the classifier is sensitive to imbalance in the number of samples across categories, the training effect of the classifier is greatly affected. By using different service data acquisition time windows for the seed users in different user states, for example a longer window for the offline maturity period than for the other user states, the present disclosure helps avoid having too few training samples for some user states and thus helps improve the training effect of the classifier.
S401, according to the service data of the seed user in each user state, forming service behavior characteristic information of the seed user in each user state.
For any seed user in any user state, the present disclosure may obtain the service behavior feature information of the seed user according to the service data that, within the seed user's service data, precedes the time point of the state flag information. The present disclosure may also obtain the service behavior feature information of the seed user according to the service data that precedes the time point at which the state condition is satisfied. In the case where the service data of a user indicates that the user was in a plurality of user states in succession, the present disclosure may form the service behavior feature information of the seed user of the later user state from the service data between the time points of the two user states.
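The selection of service data by time point can be sketched as a filter over timestamped events. Integer day indices stand in for real timestamps, and the choice of which boundary is inclusive is an assumption of this sketch, not something the text specifies.

```python
def events_for_state(events, state_times, state):
    """Select the service data used to build features for one seed state.

    events: list of (timestamp, payload) tuples.
    state_times: {user_state: timestamp at which the state flag appears
                  or the state condition is met}  (hypothetical layout)
    Only events before the state's time point are kept. If the user was
    in an earlier state, only events from that earlier time point onward
    are kept.
    """
    end = state_times[state]
    earlier = [t for t in state_times.values() if t < end]
    start = max(earlier) if earlier else None
    return [
        event
        for event in events
        if event[0] < end and (start is None or event[0] >= start)
    ]


timeline = [(1, "page_view"), (3, "page_view"), (5, "delegation"), (8, "viewing")]
times = {"state A": 4, "state B": 7}
for_state_a = events_for_state(timeline, times, "state A")
for_state_b = events_for_state(timeline, times, "state B")
```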
Optionally, for any seed user, the present disclosure may obtain the service behavior feature information of the seed user by searching a corresponding field in the service data of the seed user, and performing processing such as calculation and statistics summary (for example, date calculation, number statistics, flow summary) on the searched corresponding field. The content specifically included in the service behavior feature information of the seed user can be set according to the actual service condition. For example, the present disclosure may obtain the service behavior feature information of the seed user based on the time-invariant attribute, the service behavior feature information based on the service behavior frequency and/or service behavior resource consumption statistics, and the service behavior feature information based on the time sequence variable by performing calculation, statistical summarization (such as date calculation, frequency statistics, traffic summarization, etc.) and other processing on corresponding fields in the service data of the seed user.
Optionally, the service behavior feature information of the seed user based on time-invariant attributes may include: basic information about the user-side software and hardware on which the seed user, within a time period, performs the various operations that generate service data, where this basic software and hardware information does not change over time within the time period. For example, the service behavior feature information based on time-invariant attributes may include: the type of terminal device used by the seed user for network access (such as the model of a smartphone), the source channel of the application used by the seed user (such as the download source of the seed user's APP), and the like.
Optionally, the service behavior feature information of the seed user based on service behavior frequency and/or service behavior resource consumption statistics may refer to: the results obtained by separately performing statistical processing, according to the actual requirements of the specific service, on a plurality of fields related to service behavior actions in the service data of the seed user. For example, for the real estate field, the service behavior feature information of the seed user based on service behavior frequency and/or service behavior resource consumption statistics may include: the PV count of the seed user, the page browsing duration of the seed user, the number of house viewings by the seed user in the last N days, the number of delegations by the seed user in the last N days, the number of business opportunities generated by the seed user in the last N days, and the like.
Optionally, the service behavior feature information of the seed user based on time-series variables may refer to: information reflecting, along the time axis, when the seed user performed the corresponding actions. For example, for the real estate field, the service behavior feature information of the seed user based on time-series variables may include: the time of the seed user's last network/on-site access, the time of the seed user's first house viewing, and the like.
S402, generating a plurality of training samples according to the service behavior characteristic information of the seed user in each user state.
Optionally, in the present disclosure, the service behavior feature information of each seed user corresponds to one piece of state flag information or to one state condition, each piece of state flag information corresponds to a user state, and each state condition corresponds to a user state. The present disclosure may determine the user state corresponding to the service behavior feature information of a seed user according to the state flag information or state condition corresponding to that service behavior feature information.
S403, training the classifier by using the plurality of training samples.
Optionally, in the present disclosure, a plurality of training samples corresponding to each user state may be provided to the classifier; the classifier performs classification prediction processing on each input training sample and outputs a classification prediction result for each input training sample, that is, the probability (which may be regarded as a confidence) that the training sample belongs to each user state. The present disclosure may calculate a loss by using a corresponding loss function according to the difference between the classification prediction result output by the classifier for each training sample and the user state label of the corresponding training sample, and adjust the model parameters of the classifier by using the calculated loss.
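The text does not name the loss function. As an illustration only, the following sketch uses the multi-class cross-entropy loss, a common choice when a classifier outputs per-class probabilities; it is an assumption and not necessarily the loss used in the disclosure.

```python
import math


def cross_entropy_loss(predicted, labels):
    """Mean cross-entropy over a batch of training samples.

    predicted: list of probability lists, one list per training sample
        (the probability of each user state).
    labels: list of integer indices of the labeled user state.
    """
    total = 0.0
    for probabilities, label in zip(predicted, labels):
        # Penalize a low probability assigned to the labeled user state.
        total += -math.log(probabilities[label])
    return total / len(labels)


batch_predictions = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]]
batch_labels = [0, 1]
loss = cross_entropy_loss(batch_predictions, batch_labels)
```

The loss shrinks toward zero as the probability assigned to the labeled user state approaches 1.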
Optionally, the number of training samples corresponding to each user state provided to the classifier is the same. For example, in the case where the user states in the present disclosure are the seven user states exemplified above, the total number of training samples that the present disclosure provides to the classifier is 7 × M (M is an integer greater than 0, for example, M is 30,000), and the number of training samples of each user state provided to the classifier is M.
Of course, the numbers of training samples that the present disclosure provides to the classifier for the respective user states may be approximately the same rather than exactly the same. For example, the difference between the numbers of training samples of any two user states does not exceed a predetermined difference. For another example, the ratio of the numbers of training samples of any two user states is not greater than a predetermined ratio.
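Equalizing the per-state sample counts can be sketched as random downsampling to M samples per user state, together with a check for the "approximately the same" variant. Random downsampling is one possible way to reach equal counts and is an assumption here; the function names are hypothetical.

```python
import random


def balance_samples(samples_by_state, m, seed=0):
    """Draw exactly m training samples for every user state.

    samples_by_state: {user_state: [sample, ...]}  (hypothetical layout)
    """
    rng = random.Random(seed)
    balanced = {}
    for state, samples in samples_by_state.items():
        if len(samples) < m:
            raise ValueError(f"not enough samples for {state}")
        balanced[state] = rng.sample(samples, m)
    return balanced


def is_roughly_balanced(counts, max_difference):
    """Check that no two per-state counts differ by more than max_difference."""
    return max(counts) - min(counts) <= max_difference


pool = {
    "silent period": list(range(50)),
    "offline maturity period": list(range(12)),
}
balanced_pool = balance_samples(pool, m=10)
```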
In the case where the classifier is sensitive to imbalance in the number of training samples across categories, the present disclosure can improve the training effect of the classifier by making the numbers of training samples corresponding to the respective user states provided to the classifier exactly or approximately the same.
Optionally, the present disclosure may divide the training samples into a training set and a test set, train the classifier by using the training samples in the training set, and evaluate the classification effect of the classifier by using the training samples in the test set. When the evaluation result does not meet the requirement, the present disclosure may continue to train the classifier by using the training samples in the training set.
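The split-train-evaluate cycle can be sketched as below. The split fraction, the accuracy requirement, and the callables `train_step` and `evaluate` are placeholders assumed for illustration; the text does not specify them.

```python
import random


def split_samples(samples, test_fraction=0.2, seed=0):
    """Shuffle the samples and split them into a training and a test set."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    test_size = int(len(shuffled) * test_fraction)
    return shuffled[test_size:], shuffled[:test_size]


def train_until_acceptable(train_step, evaluate, required, max_rounds=10):
    """Repeat training rounds until the test-set score meets the requirement.

    train_step: callable running one round of training on the training set.
    evaluate: callable returning the current score on the test set.
    Returns the number of rounds used, or None if the limit was reached.
    """
    for round_index in range(1, max_rounds + 1):
        train_step()
        if evaluate() >= required:
            return round_index
    return None


train_set, test_set = split_samples(range(100))

# Stand-in scores simulating a classifier that improves each round.
scores = iter([0.5, 0.7, 0.9])
rounds_used = train_until_acceptable(
    train_step=lambda: None, evaluate=lambda: next(scores), required=0.85
)
```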
It should be noted that the present disclosure has learned through practical experiments that more training samples are not always better for training the classifier: once the number of training samples exceeds a certain number, the additional training samples do not enable the classifier to learn new knowledge, and therefore the number of training samples should be kept within a certain number (e.g., 30,000). Furthermore, the present disclosure may perform clustering processing on the training samples to obtain a plurality of clusters and, by performing multi-dimensional analysis on the training samples in each cluster, remove some training samples that are inconsistent with common understanding and correct the user state labels of some training samples, so as to eliminate the influence of improper training samples on the training result of the classifier. In addition, the users corresponding to the training samples of the present disclosure may also be used as users to be detected; that is, the training samples may also be used as the service behavior feature information of users to be detected.
Exemplary devices
Fig. 5 is a schematic structural diagram of an embodiment of an apparatus for detecting a user status according to the present disclosure. The apparatus of this embodiment may be used to implement the method embodiments of the present disclosure described above.
As shown in fig. 5, the apparatus of the present embodiment may include: an information obtaining module 500, a clustering module 501, a representative user obtaining module 502, a first state determining module 503, and a second state determining module 504. Optionally, the apparatus of this embodiment may further include: a training module 505 and a seed user determining module 506.
The information obtaining module 500 is configured to obtain the service behavior feature information of each of a plurality of users to be detected according to the service data of the plurality of users to be detected. For example, the information obtaining module 500 may obtain, according to the service data of the plurality of users to be detected, at least one of the service behavior feature information based on time-invariant attributes, the service behavior feature information based on service behavior frequency and/or service behavior resource consumption statistics, and the service behavior feature information based on time-series variables of the plurality of users to be detected.
The clustering module 501 is configured to perform clustering processing on the service behavior feature information of each of the plurality of users to be detected obtained by the information obtaining module 500, to obtain at least one cluster.
The representative user obtaining module 502 is configured to obtain the cluster representative user of a cluster obtained by the clustering module 501. For example, the representative user obtaining module 502 may screen out, from all clusters obtained by the clustering module 501, the clusters whose number of cluster nodes does not exceed a predetermined number; the representative user obtaining module 502 may use the user to be detected corresponding to the center node of a screened cluster as the cluster representative user.
The first state determining module 503 is configured to determine the user state of the cluster representative user according to the service behavior feature information of the cluster representative user obtained by the representative user obtaining module 502. For example, the first state determining module 503 may provide the service behavior feature information of the cluster representative user as input to a classifier for predicting user states; the first state determining module 503 may determine the user state of the cluster representative user according to the classification prediction result output by the classifier.
The second state determining module 504 is configured to determine, according to the user state of the cluster representative user determined by the first state determining module 503, the user state of each user to be detected in the cluster where the cluster representative user is located.
The training module 505 is configured to obtain service data of the seed user in each user state, respectively, and form service behavior feature information of the seed user in each user state according to the service data of the seed user in each user state, the training module 505 generates a plurality of training samples according to the service behavior feature information of the seed user in each user state, and the training module 505 trains the classifier by using the plurality of training samples. For any user state, the seed user of the user state is the user in the user state at a historical time.
Optionally, for any seed user in any user state, the training module 505 may obtain the service behavior feature information of the seed user according to the service data located before the time point of the state flag information in the service data of the seed user or the service data located before the time point satisfying the state condition.
Optionally, for any state flag information or any state condition, the training module 505 may set a user state label for the service behavior feature information of the corresponding seed user according to the user state corresponding to that state flag information or state condition, thereby generating a training sample.
Optionally, the training module 505 may provide the training samples corresponding to the user states to the classifier, and adjust the model parameters of the classifier according to the difference between the classification prediction result output by the classifier for each training sample and the user state label of the corresponding training sample. The training module 505 provides the same number of training samples corresponding to each user state to the classifier.
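One possible way to supply the same number of training samples for each user state is to downsample every state to the size of the smallest one, as sketched below. The disclosure does not say how the equal counts are obtained, so the downsampling strategy, the function `balance_samples`, and the state labels are assumptions.

```python
import random


def balance_samples(samples, seed=0):
    """Downsample so that every state label has the same number of samples.

    samples: list of (feature vector, state label) pairs.
    """
    by_label = {}
    for features, label in samples:
        by_label.setdefault(label, []).append((features, label))
    if not by_label:
        return []
    per_label = min(len(group) for group in by_label.values())
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    balanced = []
    for group in by_label.values():
        balanced.extend(rng.sample(group, per_label))
    rng.shuffle(balanced)  # avoid feeding the classifier one state at a time
    return balanced


# Hypothetical samples: 5 for one state, 2 for another.
samples = [([i], "active") for i in range(5)] + [([i], "lost") for i in range(2)]
balanced = balance_samples(samples)
counts = {}
for _, label in balanced:
    counts[label] = counts.get(label, 0) + 1
```

Oversampling the smaller states would be an alternative that discards no data; the choice depends on how many seed users are available.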
The seed user determining module 506 is configured to determine, from the service data of each user and according to the preset state flag information or state condition corresponding to each user state, the users whose service data includes the state flag information or satisfies the state condition. The seed user determining module 506 takes each user so determined as a seed user of the corresponding user state.
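The seed user determination can be sketched by expressing each state's flag information or condition as a predicate over a user's service data. The state names, the `sign_contract` event type, and the "no events" condition below are purely hypothetical examples of a flag and a condition.

```python
def find_seed_users(service_data, state_rules):
    """Return state -> sorted list of seed user ids.

    service_data: user id -> list of event dicts.
    state_rules: state -> predicate over a user's events (flag check or condition).
    """
    seeds = {state: [] for state in state_rules}
    for user_id, events in service_data.items():
        for state, rule in state_rules.items():
            if rule(events):
                seeds[state].append(user_id)
    return {state: sorted(users) for state, users in seeds.items()}


# Hypothetical rules: a flag event for one state, a condition for another.
state_rules = {
    "signed": lambda events: any(e["type"] == "sign_contract" for e in events),
    "inactive": lambda events: len(events) == 0,
}
service_data = {
    "u1": [{"type": "browse"}, {"type": "sign_contract"}],
    "u2": [],
    "u3": [{"type": "browse"}],
}
seeds = find_seed_users(service_data, state_rules)
```

Users that match no rule, such as `u3` here, are simply not used as seed users.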
Exemplary electronic device
An electronic device according to an embodiment of the present disclosure is described below with reference to FIG. 6. FIG. 6 shows a block diagram of an electronic device in accordance with an embodiment of the disclosure. As shown in FIG. 6, the electronic device 61 includes one or more processors 611 and a memory 612.
The processor 611 may be a Central Processing Unit (CPU) or other form of processing unit having the capability to detect user status and/or instruction execution capability, and may control other components in the electronic device 61 to perform desired functions.
The memory 612 may include one or more computer program products, which may include various forms of computer-readable storage media, such as volatile memory and/or non-volatile memory. The volatile memory may include, for example, Random Access Memory (RAM) and/or cache memory. The non-volatile memory may include, for example, Read Only Memory (ROM), a hard disk, flash memory, and the like. One or more computer program instructions may be stored on the computer-readable storage medium and executed by the processor 611 to implement the methods of detecting user state of the various embodiments of the present disclosure described above and/or other desired functionality. Various contents such as an input signal, a signal component, a noise component, etc. may also be stored in the computer-readable storage medium.
In one example, the electronic device 61 may further include: an input device 613, an output device 614, etc., which are interconnected by a bus system and/or other form of connection mechanism (not shown). The input device 613 may also include, for example, a keyboard, a mouse, and the like. The output device 614 can output various information to the outside. The output devices 614 may include, for example, a display, speakers, printer, and communication network and remote output devices connected thereto, among others.
Of course, for simplicity, only some of the components of the electronic device 61 relevant to the present disclosure are shown in FIG. 6, omitting components such as buses, input/output interfaces, and the like. In addition, the electronic device 61 may include any other suitable components, depending on the particular application.
Exemplary computer program product and computer-readable storage Medium
In addition to the above-described methods and apparatus, embodiments of the present disclosure may also be a computer program product comprising computer program instructions that, when executed by a processor, cause the processor to perform the steps in the method of detecting a user state according to various embodiments of the present disclosure described in the "exemplary methods" section above of this specification.
The program code of the computer program product for carrying out operations of embodiments of the present disclosure may be written in any combination of one or more programming languages, including object-oriented programming languages such as Java and C++, and conventional procedural programming languages such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's computing device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on the remote computing device or server.
Furthermore, embodiments of the present disclosure may also be a computer-readable storage medium having stored thereon computer program instructions that, when executed by a processor, cause the processor to perform steps in a method of detecting a user state according to various embodiments of the present disclosure described in the "exemplary methods" section above in this specification.
The computer-readable storage medium may take any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. A readable storage medium may include, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium may include: an electrical connection having one or more wires, a portable disk, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
The foregoing describes the general principles of the present disclosure in conjunction with specific embodiments. It is noted, however, that the advantages, effects, etc. mentioned in the present disclosure are merely examples and are not limiting, and they should not be considered essential to the various embodiments of the present disclosure. Furthermore, the foregoing disclosure of specific details is for the purpose of illustration and description and is not intended to be limiting, since the disclosure is not intended to be limited to the specific details so described.
In the present specification, the embodiments are described in a progressive manner. Each embodiment focuses on its differences from the other embodiments, and for the same or similar parts the embodiments may be referred to one another. Since the system embodiment basically corresponds to the method embodiment, its description is relatively brief, and for the relevant points reference may be made to the corresponding description of the method embodiment.
The block diagrams of devices, apparatuses, and systems referred to in this disclosure are given only as illustrative examples and are not intended to require or imply that the connections, arrangements, and configurations must be made in the manner shown in the block diagrams. These devices, apparatuses, and systems may be connected, arranged, and configured in any manner, as will be appreciated by those skilled in the art. Words such as "including," "comprising," "having," and the like are open-ended words that mean "including, but not limited to," and are used interchangeably therewith. The words "or" and "and" as used herein mean, and are used interchangeably with, the phrase "and/or," unless the context clearly dictates otherwise. The phrase "such as" is used herein to mean, and is used interchangeably with, the phrase "such as but not limited to".
The methods and apparatus of the present disclosure may be implemented in a number of ways. For example, the methods and apparatus of the present disclosure may be implemented by software, hardware, firmware, or any combination of software, hardware, and firmware. The above-described order for the steps of the method is for illustration only, and the steps of the method of the present disclosure are not limited to the order specifically described above unless specifically stated otherwise. Further, in some embodiments, the present disclosure may also be embodied as programs recorded in a recording medium, the programs including machine-readable instructions for implementing the methods according to the present disclosure. Thus, the present disclosure also covers a recording medium storing a program for executing the method according to the present disclosure.
It is also noted that in the devices, apparatuses, and methods of the present disclosure, each component or step can be decomposed and/or recombined. These decompositions and/or recombinations are to be considered equivalents of the present disclosure.
The previous description of the disclosed aspects is provided to enable any person skilled in the art to make or use the present disclosure. Various modifications to these aspects, and the like, will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other aspects without departing from the scope of the disclosure. Thus, the present disclosure is not intended to be limited to the aspects shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
The foregoing description has been presented for purposes of illustration and description. Furthermore, the description is not intended to limit embodiments of the disclosure to the form disclosed herein. While a number of example aspects and embodiments have been discussed above, those of skill in the art will recognize certain variations, modifications, alterations, additions and sub-combinations thereof.

Claims (10)

1. A method of detecting a user status, comprising:
acquiring service behavior characteristic information of a plurality of users to be detected according to service data of the users to be detected;
clustering the service behavior characteristic information of each of the plurality of users to be detected to obtain at least one cluster;
acquiring a cluster representative user of the cluster;
determining the user state of the cluster representative user according to the service behavior characteristic information of the cluster representative user;
and determining the user states of the users to be detected in the cluster in which the cluster representative user is located according to the user states of the cluster representative users.
2. The method according to claim 1, wherein the acquiring the service behavior feature information of the users to be detected according to the service data of the users to be detected comprises:
and acquiring at least one of service behavior characteristic information based on time invariant attributes, service behavior characteristic information based on service behavior times and/or service behavior resource consumption statistics and service behavior characteristic information based on time sequence variables of the plurality of users to be detected according to the service data of the plurality of users to be detected.
3. The method of claim 1 or 2, wherein the acquiring a cluster representative user of the cluster comprises:
screening out clusters containing no more than a preset number of cluster nodes from the at least one cluster;
and taking the user to be detected corresponding to the central node of the screened cluster as a cluster representative user.
4. The method according to any one of claims 1 to 3, wherein the determining the user state of the cluster representative user according to the service behavior feature information of the cluster representative user comprises:
the service behavior characteristic information of the cluster representative user is used as the input of a classifier for predicting the user state and is provided for the classifier;
and determining the user state of the cluster representative user according to the classification prediction result output by the classifier.
5. The method of claim 4, wherein the method further comprises:
respectively acquiring the service data of the seed user in each user state;
forming service behavior characteristic information of the seed user in each user state according to the service data of the seed user in each user state;
generating a plurality of training samples according to the service behavior characteristic information of the seed user in each user state;
training the classifier using the plurality of training samples;
for any user state, the seed user of the user state is the user in the user state at a historical time.
6. The method according to claim 5, wherein before the step of separately obtaining the service data of the seed user in each user state, the method further comprises:
and determining, from the service data of each user and according to the preset state flag information or state condition corresponding to each user state, the users whose service data contains the state flag information or meets the state condition, and taking the determined users as seed users of the corresponding user states.
7. The method according to claim 6, wherein the forming the service behavior feature information of the seed user in each user state according to the service data of the seed user in each user state comprises:
for any seed user in any user state, acquiring the service behavior characteristic information of the seed user according to the service data which is positioned before the time point of the state mark information in the service data of the seed user or the service data which is positioned before the time point meeting the state condition.
8. An apparatus for detecting a user state, wherein the apparatus comprises:
the information acquisition module is used for acquiring the service behavior characteristic information of each of a plurality of users to be detected according to the service data of the users to be detected;
the cluster processing module is used for carrying out cluster processing on the service behavior characteristic information of each of the plurality of users to be detected to obtain at least one cluster;
the representative user acquiring module is used for acquiring a cluster representative user of the cluster;
the first state determining module is used for determining the user state of the cluster representative user according to the service behavior characteristic information of the cluster representative user;
and the second state determining module is used for determining the user states of the users to be detected in the cluster in which the cluster representative user is located according to the user states of the cluster representative users.
9. A computer-readable storage medium, the storage medium storing a computer program for performing the method of any of the preceding claims 1-7.
10. An electronic device, the electronic device comprising:
a processor;
a memory for storing the processor-executable instructions;
the processor is configured to read the executable instructions from the memory and execute the instructions to implement the method of any one of claims 1-7.
CN201911352487.9A 2019-12-25 2019-12-25 Method, device, medium and electronic equipment for detecting user state Active CN111178421B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911352487.9A CN111178421B (en) 2019-12-25 2019-12-25 Method, device, medium and electronic equipment for detecting user state

Publications (2)

Publication Number Publication Date
CN111178421A (en) 2020-05-19
CN111178421B (en) 2023-10-20

Family

ID=70655666

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911352487.9A Active CN111178421B (en) 2019-12-25 2019-12-25 Method, device, medium and electronic equipment for detecting user state

Country Status (1)

Country Link
CN (1) CN111178421B (en)

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0893894A2 (en) * 1997-07-24 1999-01-27 AT&T Corp. A method for designing sonet ring networks suitable for local access
US6061335A (en) * 1997-07-24 2000-05-09 At&T Corp Method for designing SONET ring networks suitable for local access
CN102087576A (en) * 2009-12-04 2011-06-08 索尼公司 Display control method, image user interface, information processing apparatus and information processing method
CN103927309A (en) * 2013-01-14 2014-07-16 阿里巴巴集团控股有限公司 Method and device for marking information labels for business objects
CN106603324A (en) * 2015-10-20 2017-04-26 富士通株式会社 Training set acquisition device and training set acquisition method
CN106529711A (en) * 2016-11-02 2017-03-22 东软集团股份有限公司 Method and apparatus for predicting user behavior
CN106455056A (en) * 2016-11-14 2017-02-22 百度在线网络技术(北京)有限公司 Positioning method and device
CN108710894A (en) * 2018-04-17 2018-10-26 中国科学院软件研究所 A kind of Active Learning mask method and device based on cluster representative point

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115134665A (en) * 2021-03-22 2022-09-30 中国电信股份有限公司 Data processing method and device based on set top box, storage medium and electronic equipment
CN115134665B (en) * 2021-03-22 2024-03-01 中国电信股份有限公司 Data processing method and device based on set top box, storage medium and electronic equipment
CN113610175A (en) * 2021-08-16 2021-11-05 上海冰鉴信息科技有限公司 Service strategy generation method and device and computer readable storage medium

Also Published As

Publication number Publication date
CN111178421B (en) 2023-10-20

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant