CN108304432B - Information push processing method, information push processing device and storage medium - Google Patents

Info

Publication number
CN108304432B
Authority
CN
China
Prior art keywords
terminal
user
classification
information
behavior data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201710647371.2A
Other languages
Chinese (zh)
Other versions
CN108304432A (en
Inventor
张洋平
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Yayue Technology Co ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd filed Critical Tencent Technology Shenzhen Co Ltd
Priority to CN201710647371.2A priority Critical patent/CN108304432B/en
Publication of CN108304432A publication Critical patent/CN108304432A/en
Application granted granted Critical
Publication of CN108304432B publication Critical patent/CN108304432B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation

Abstract

The invention discloses an information push processing method, which comprises the following steps: acquiring behavior data formed by detecting the operation of a user in a terminal; performing clustering processing according to the behavior data belonging to the terminal, and selecting operation characteristics of different dimensions of the terminal from clustering results; predicting the characteristics obtained by the terminal clustering according to a classification model to obtain a classification result of the current user in the terminal; judging the replacement of the user in the terminal according to the classification result of the current user in the terminal; and according to the replacement of the user in the terminal, the information push adapting to the corresponding classification result of the replaced user is realized. The invention also discloses an information push processing device and a storage medium.

Description

Information push processing method, information push processing device and storage medium
Technical Field
The present invention relates to communications technologies, and in particular, to an information push processing method, an information push processing apparatus, and a storage medium.
Background
With the development of the internet, particularly the mobile internet, the internet is becoming an increasingly important way to obtain information. To improve the efficiency with which a user obtains information, the related art provides information push technology, which generally calculates the preference of the user and sends information conforming to that preference to the user.
In the related art, different terminals (such as a smart phone and a tablet computer) are identified, the preference of the user is calculated, and information conforming to the preference is sent to the terminal in various ways, which saves the user the operation of searching for information and improves the efficiency of acquiring information.
However, the related art pushes information per terminal on the assumption that the user of the terminal never switches, and calculates a single user preference accordingly. In the increasingly common scenarios where one terminal is used by multiple users, such a scheme obviously cannot push information accurately.
Disclosure of Invention
The embodiment of the invention provides an information push processing method, an information push processing device and a storage medium; the method and the device can be suitable for realizing accurate information pushing in use scenes of different users of the terminal.
The technical scheme of the embodiment of the invention is realized as follows:
the embodiment of the invention provides an information push processing method, which comprises the following steps:
acquiring behavior data formed by detecting the operation of a user in a terminal;
performing clustering processing according to the behavior data belonging to the terminal, and selecting operation characteristics of different dimensions of the terminal from clustering results;
predicting the characteristics obtained by the terminal clustering according to a classification model to obtain a classification result of the current user in the terminal;
judging the replacement of the user in the terminal according to the classification result of the current user in the terminal;
and pushing, according to the replacement of the user in the terminal, information adapted to the classification result of the replacing user.
An embodiment of the present invention further provides an information push processing apparatus, including:
the data acquisition unit is used for acquiring behavior data formed by detecting the operation of a user in the terminal;
the characteristic determining unit is used for carrying out clustering processing according to the behavior data belonging to the terminal and selecting the operation characteristics of different dimensions operated in the terminal from the clustering result;
the classification unit is used for predicting the characteristics obtained by the terminal clustering according to a classification model to obtain the classification result of the current user in the terminal;
the judging unit is used for judging the replacement of the user in the terminal according to the classification result of the current user in the terminal;
and the pushing unit is used for pushing, according to the replacement of the user in the terminal, information adapted to the classification result of the replacing user.
In the above scheme, the data acquisition unit is specifically configured to: acquire, when it is detected that the terminal satisfies a timing condition, behavior data collected before the timing condition is satisfied; or
acquire, when a potential event representing user replacement is generated in the terminal, behavior data collected before the potential event is generated; or
acquire, when the terminal is in a specific information push scene, behavior data formed by detecting operations that conform to the features of that information push scene.
In the foregoing solution, the judging unit is specifically configured to judge that the user in the terminal is replaced when the distance between the classification of the current user in the terminal and the historical classification of the user in the terminal exceeds a distance threshold;
or to judge that the user in the terminal is replaced when the classification of the current user in the terminal is the same twice in succession and differs from the historical classification of the user in the terminal.
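The two judging conditions above can be sketched as follows. This is a minimal illustration and not the claimed implementation: the text does not define how the distance between classifications is measured, so the sketch assumes each classification is represented by a numeric vector compared with the Euclidean distance, and the function names are hypothetical.

```python
from typing import List, Sequence


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two classification vectors (an assumed metric)."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def replaced_by_distance(current: Sequence[float],
                         historical: Sequence[float],
                         threshold: float) -> bool:
    """First condition: the current classification is farther from the
    historical classification than the distance threshold."""
    return euclidean(current, historical) > threshold


def replaced_by_repetition(recent: List[str], historical: str) -> bool:
    """Second condition: the two most recent classifications are identical
    and differ from the historical classification."""
    if len(recent) < 2:
        return False
    last, previous = recent[-1], recent[-2]
    return last == previous and last != historical
```

Requiring two identical consecutive results before declaring a replacement trades some latency for robustness against a single misclassification.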
In the above scheme, the apparatus further comprises:
and the identification unit is used for identifying the terminal to which the acquired behavior data belongs according to the hardware identifier carried by the acquired behavior data when the behavior data is acquired from the plurality of terminals.
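As an illustration of the identification unit, behavior data arriving from several terminals can be grouped by the hardware identifier each record carries. The sketch below assumes each record is a dictionary with a hypothetical "hardware_id" key; the actual record format is not specified in the text.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def group_by_terminal(records: Iterable[dict]) -> Dict[str, List[dict]]:
    """Group behavior records by the hardware identifier they carry.

    The "hardware_id" key is a hypothetical field name used for illustration.
    """
    grouped: Dict[str, List[dict]] = defaultdict(list)
    for record in records:
        grouped[record["hardware_id"]].append(record)
    return dict(grouped)
```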
In the above solution, the pushing unit is specifically configured to, when the user in the terminal is replaced,
query the server, according to the classification result of the replacing user, for information whose targeting conditions match that classification result;
and push the queried information to the terminal.
In the above solution, the pushing unit is specifically configured to, when the user in the terminal is replaced,
query the terminal, according to the classification result of the replacing user, for information whose targeting conditions match that classification result, and present the information.
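The query performed by the pushing unit can be pictured as a filter over candidate information by its orientation (targeting) conditions. The following is a minimal sketch under the assumption that each item lists the user classifications it is directed at; the `Info` type and its fields are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass
class Info:
    title: str
    targets: Set[str]  # classifications this item is directed at (hypothetical field)


def query_matching(items: List[Info], classification: str) -> List[Info]:
    """Return the items whose targeting conditions include the given classification."""
    return [item for item in items if classification in item.targets]
```

The same filter applies whether the candidate information is stored in the server and pushed to the terminal, or stored in the terminal and presented locally.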
The embodiment of the invention also provides a storage medium, wherein an executable program is stored on the storage medium, and when the executable program is executed by a processor, the information push processing method provided by the embodiment of the invention is realized.
An embodiment of the present invention further provides an information push processing apparatus, including a processor and a memory, wherein the memory is used for storing an executable program capable of running on the processor, and the processor is used for implementing the information push processing method provided by the embodiment of the invention when running the executable program.
The embodiment of the invention has the following beneficial effects:
1) operation characteristics are extracted from the recorded behavior data of user operations and used to predict the user classification; because operation characteristics intuitively reflect how users of different classifications differ in their operations, users can be classified accurately based on the classification model;
2) information pushing adapts to the replacement of the user in the same terminal, which overcomes the defect in the related art of mechanically pushing information on the assumption that the terminal user does not change; the scheme suits both scenarios in which the terminal is used by a single user and scenarios in which it is used by multiple users, and significantly improves the precision of information pushing in the same terminal.
Drawings
Fig. 1-1 is an alternative architecture diagram of information pushing provided by an embodiment of the present invention;
fig. 1-2 is a schematic diagram of another alternative architecture for information pushing provided by an embodiment of the present invention;
fig. 1-3 is a schematic diagram of yet another alternative architecture for information pushing provided by an embodiment of the present invention;
fig. 2 is a schematic diagram of an alternative software/hardware structure of an information push processing apparatus according to an embodiment of the present invention;
fig. 3-1 is a first schematic view of an alternative processing flow of an information push processing method according to an embodiment of the present invention;
FIG. 3-2 is an alternative diagram of clustering samples of behavioral data provided by an embodiment of the invention;
FIG. 4 is a schematic diagram of a process flow for training a decision tree model according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of a process flow for calculating information gain according to an embodiment of the present invention;
FIG. 6 is a schematic diagram of a decision tree provided by an embodiment of the present invention;
fig. 7 is a schematic diagram of an alternative processing flow of the information push processing method according to the embodiment of the present invention;
fig. 8 is a schematic view of an alternative processing flow of an information push processing method according to an embodiment of the present invention;
fig. 9 is a fourth schematic view of an alternative processing flow of an information push processing method according to an embodiment of the present invention;
FIG. 10 is an alternative architectural diagram for advertisement delivery provided by embodiments of the present invention;
fig. 11 is a schematic diagram of an alternative architecture for news pushing according to an embodiment of the present invention;
FIG. 12 is a schematic diagram of a configuration of an information pushing apparatus according to an embodiment of the present invention;
fig. 13 is a schematic diagram illustrating information pushing performed by the related art for the same terminal;
fig. 14 is a schematic diagram illustrating information pushing performed by the same terminal according to an embodiment of the present invention.
Detailed Description
The present invention will be described in further detail below with reference to the accompanying drawings and examples. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
Before further detailed description of the present invention, terms and expressions referred to in the embodiments of the present invention are described, and the terms and expressions referred to in the embodiments of the present invention are applicable to the following explanations.
1) Decision tree: a Decision Tree (Decision Tree) is a Tree structure (which may be a binary Tree or a non-binary Tree). Each non-leaf node represents a test on a feature, each branch represents the output of the feature over a range of values, and each leaf node stores a category. The process of using the decision tree to make a decision is to start from the root node, test the corresponding characteristics in the items to be classified, select an output branch according to the values of the characteristics until the leaf nodes are reached, and take the categories stored by the leaf nodes as decision results.
2) ID3: an algorithm for constructing a decision tree. Its core idea is to measure feature selection by information gain and, at each node, to split on the feature that yields the maximum information gain.
3) Information entropy: a measure of the amount of information. Herein, entropy refers to the amount of information calculated in a specific way when a classification condition is constructed from a one-dimensional feature in the classification model.
4) Information gain: the difference value of the information entropies of the two classification systems is the information gain, and the information entropy is the quantization index of the information quantity.
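The terms above can be made concrete with a short sketch that computes Shannon entropy and information gain and picks the feature that ID3 would split on first. The feature names and user labels in the usage are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter
from typing import Dict, Hashable, List, Sequence


def entropy(labels: Sequence[Hashable]) -> float:
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(feature_values: Sequence[Hashable],
                     labels: Sequence[Hashable]) -> float:
    """Entropy of the labels minus the weighted entropy after splitting on the feature."""
    total = len(labels)
    groups: Dict[Hashable, List[Hashable]] = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def best_feature(rows: List[dict], labels: Sequence[Hashable],
                 features: Sequence[str]) -> str:
    """Feature with the maximum information gain, as ID3 selects for a split."""
    return max(features,
               key=lambda f: information_gain([row[f] for row in rows], labels))
```

For four samples labelled "a", "a", "b", "b", a feature that separates the two labels perfectly has a gain of 1 bit, while a feature distributed identically across both labels has a gain of 0.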
It can be understood that the information push processing method, the information push processing device and the storage medium provided by the embodiment of the invention are applicable to scenes of pushing various types of information, such as news pushing and advertisement pushing, but are not limited to these scenes or to scenes of pushing information in the internet.
With respect to implementation of the information push processing device, in an alternative embodiment of the present invention, a manner is provided in which the information push processing device can be implemented on both the terminal side and the server side, and corresponding storage media can be provided on both the terminal side and the server side to correspondingly complete processing on both the terminal side and the server side, which will be described below in connection with implementation of the information push processing device on both the terminal side and the server side.
As an example, an information push processing apparatus is implemented at a terminal side and a server side, fig. 1-1 shows an optional architecture diagram of information push provided by an embodiment of the present invention, a client for implementing a specific application purpose, such as a news client, a video client, a social network client, and the like, is installed in a terminal, and the client collects behavior data of a user operating in the terminal and reports the behavior data to the server; and the server predicts the classification result of the current user of the client according to the classification model and the behavior data, and acquires information conforming to the classification result from the database and pushes the information to the client.
As another example, the information push processing apparatus is implemented on a terminal side and a server side, and fig. 1-2 show another optional architecture diagram of information push provided in the embodiment of the present invention, where a client for implementing a specific application purpose, such as a news client, a video client, a social network client, and the like, is installed in a terminal, and the client collects behavior data of a user operating in the terminal, predicts a classification result of a current user of the client according to a classification model and the behavior data, and reports the classification result to the server through a network; and the server acquires information conforming to the classification result from the database according to the classification result and pushes the information to the client.
As another example, the information push processing apparatus is implemented on a terminal side, and fig. 1-3 shows yet another optional architecture diagram of information push provided by the embodiment of the present invention, where a client for implementing a specific application purpose, such as a news client, a video client, or a social network client, is installed in a terminal; the client collects behavior data of a user operating in the terminal, predicts a classification result of the current user of the client according to a classification model and the behavior data, obtains information conforming to the classification result from a database, and presents the information on the client.
As for the software/hardware structure of the information push processing device 10, referring to fig. 2, the following is included: a hardware layer, a driver layer, an operating system layer, and an application layer. However, it should be understood by those skilled in the art that the information push processing device 10 may have more components than those shown in fig. 2 according to the implementation requirement, or omit some components according to the implementation requirement.
The hardware layer of the information push processing device 10 includes a processor 161, an input/output interface 163, a memory 164 and a network interface 162, and the components can communicate via a system bus connection.
The processor 161 may be implemented by a Central Processing Unit (CPU), a Microprocessor (MCU), an Application Specific Integrated Circuit (ASIC), or a Field-Programmable Gate Array (FPGA).
The input/output interface 163 may be implemented using input/output devices such as a display screen, touch screen, speakers, or the like.
The memory 164 may be implemented by a nonvolatile storage medium such as a flash memory, a hard disk, or an optical disc, or by a volatile storage medium such as a Double Data Rate (DDR) dynamic cache, in which executable instructions for executing the information push processing method are stored.
The network interface 162 provides the processor 161 with the ability to access external data, such as a remotely located memory 164, based on the Transmission Control Protocol (TCP)/User Datagram Protocol (UDP).
The driver layer includes middleware 165 for the operating system 166 to recognize and communicate with the components of the hardware layer, such as may be a collection of drivers for the components of the hardware layer.
The operating system 166 is used to provide a graphical interface for users (such as advertisers and advertiser system operation and maintenance personnel), and the operating system 166 supports the users to control the devices via the graphical interface, and the embodiment of the present invention does not limit the software environments of the devices, such as the types and versions of the operating systems, and may be, for example, a Linux operating system, a UNIX operating system, and the like.
The application layer includes an application program 167 for implementing the information push processing method provided in the embodiment of the present invention, and may also include other application programs 168.
So far, the information push processing apparatus (server or terminal) according to the embodiment of the present invention has been described in terms of its functions, and the description is continued on the basis of the network architecture shown in fig. 1-1, the functional structure diagram of the information push processing apparatus shown in fig. 1-2, and the software/hardware structure of the information push processing apparatus shown in fig. 2.
Next, a scheme of information push processing provided by the embodiment of the present invention is described with reference to an architecture diagram of information push shown in fig. 1-1, and fig. 3-1 shows a first optional flow diagram of an information push processing method provided by the embodiment of the present invention, which will be described according to each step.
Step S101, the terminal detects the operation of the user to form behavior data.
In an alternative embodiment, examples of behavioral data are: operations, times, environments, etc.; the operation refers to, for example, an operation type, such as using a shortcut key, using a mouse gesture, and the like; the time includes: a start time and an end time of the operation; an environment refers to an environment formed by operations, such as a native functional interface of a client or operating system.
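One possible shape for a single behavior data record, covering the operation, time and environment fields described above, is sketched below. The class and field names are hypothetical; the text does not prescribe a storage format.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class BehaviorRecord:
    terminal_id: str    # hardware identifier of the terminal (hypothetical field)
    operation: str      # operation type, e.g. "shortcut_key" or "mouse_gesture"
    start_time: float   # start time of the operation, in seconds
    end_time: float     # end time of the operation, in seconds
    environment: str    # where the operation occurred, e.g. a client or OS interface

    @property
    def duration(self) -> float:
        """Length of the operation in seconds."""
        return self.end_time - self.start_time
```

`asdict(record)` yields a plain dictionary that could be serialized for reporting.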
In the embodiment of the invention, the behavior data formed by detecting user operations can have different dimensions: it may be behavior data formed from global user operations received in the operating system of the terminal, or behavior data formed from user operations detected in a particular type of client. When the terminal records the behavior data of the user from the global dimension, the behavior data includes both behavior data formed by the user operating the native function interface of the operating system and behavior data recorded for the user's operations in clients. Obtaining operation characteristics of more dimensions in this way ensures the effectiveness of the behavior data within a detection period and improves the accuracy of determining the user category from the operation characteristics, which in turn improves the accuracy of information pushing. When the terminal records the behavior data of the user from a specific type of client, the classification of the user using that client can be judged from the user's operations on it, and information is pushed according to that classification, which makes information pushing more targeted.
In an optional embodiment, the behavior data formed by detecting the user operation in the terminal may also be a scheme that an information push scenario in which the terminal is located is detected first, a feature associated with the detected information push scenario is determined, and then behavior data conforming to the associated feature is detected. Here, the information push scenario in which the terminal is located may be determined according to the category of the client that the terminal operates, and accordingly, the information classification requirements corresponding to different information push scenarios are also different.
For example, when a news client (such as Tencent News) in the terminal is in an operating state, the information push scene is a news push scene. The information classifications corresponding to the news push scene may then include sports news, entertainment news, financial news, and military news; correspondingly, the features corresponding to these classifications are sports, entertainment, finance, and military affairs. The terminal therefore detects the user's operations in the news client on content with these themes and forms behavior data for sports, entertainment, finance, and military affairs respectively. The terminal then detects, among the collected behavior data, the behavior data that conforms to the associated features; for example, it may do so in a storage area predetermined by the terminal for storing behavior data.
For another example, when a music client (such as QQ Music) in the terminal is in an operating state, the information push scene in which the terminal is located is a music push scene. The information classifications corresponding to the music push scene may then include classical music and popular music; correspondingly, the features corresponding to these classifications are classical music and popular music. The terminal detects, among the collected behavior data, the behavior data that conforms to the associated features; for example, it may do so in a storage area predetermined by the terminal for storing behavior data.
Taking an advertisement push scene as an example, for any client installed on the terminal (such as Taobao, Tencent News, QQ Music, or Meituan Waimai), classification can be made from the perspective of the products involved in the advertisements pushed during operation, such as fast-moving consumer goods, electronic products, clothes, and food; correspondingly, the features corresponding to these classifications are fast-moving consumer goods, electronic products, clothes, and food. The terminal detects, among the collected behavior data, the behavior data that conforms to the associated features; for example, it may do so in a storage area predetermined by the terminal for storing behavior data.
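The scene-specific detection described in the examples above can be sketched as a lookup from the information push scene to its associated features, followed by a filter over the collected behavior data. The scene names, feature names and the "topic" key are hypothetical stand-ins for whatever the terminal actually records.

```python
from typing import Dict, List, Set

# Features associated with each information push scene (illustrative values
# taken from the examples in the text; the key names are assumptions).
SCENE_FEATURES: Dict[str, Set[str]] = {
    "news": {"sports", "entertainment", "finance", "military"},
    "music": {"classical", "popular"},
    "advertisement": {"fmcg", "electronics", "clothes", "food"},
}


def filter_by_scene(records: List[dict], scene: str) -> List[dict]:
    """Keep only the behavior records whose topic is a feature of the given scene."""
    features = SCENE_FEATURES.get(scene, set())
    return [record for record in records if record.get("topic") in features]
```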
And step S102, the terminal records the detected behavior data and reports the behavior data to the server when the reporting condition is met.
In an optional embodiment of the present invention, the terminal may report the behavior data to the server under two schemes. In the first scheme, the terminal reports the behavior data it has collected to the server when it detects that a timing condition is satisfied. Here, the timing condition is the period, such as 10 seconds or 1 minute, at which the terminal reports behavior data to the server, and is set flexibly according to actual needs. When the terminal reports periodically according to the timing condition, each report contains only the behavior data collected since the previous report.
Here, the timing condition may be a globally uniform value, to ensure the timeliness with which the server obtains the behavior data. Alternatively, the timing condition may differ according to the push priority (degree of importance) of the terminal; for example, the period of an Apple terminal is shorter than that of an Android terminal, and the period of a smart phone is shorter than that of a tablet computer. In this way, the server is ensured to obtain the behavior data of terminals of key push types promptly.
In the second scheme, the terminal reports the behavior data to the server when a potential event representing user replacement is generated. Potential events representing user replacement may include switching of application programs, switching of the login user of the operating system, and the like; switching of application programs means that the application program run by the terminal changes from one to another. In this case, the behavior data reported to the server is the behavior data collected between two potential events representing user replacement. The terminal thus reports the corresponding behavior data to the server flexibly as its operating state changes, which ensures the timeliness with which the server obtains the behavior data and allows the server to push, in time, information adapted to the classification result of the replacing user.
As can be seen from the two schemes above by which the server obtains the behavior data of the terminal, the behavior data is reported to the server by the terminal; that is, the server can obtain the behavior data of the terminal through the reporting mechanism of the client in the terminal.
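The two reporting schemes can be sketched with a small buffer on the terminal side that is flushed either when the reporting period elapses or when a potential event representing user replacement occurs. The text presents the schemes as alternatives; combining them in one hypothetical `Reporter` class is only a convenience of this sketch, and the `send` callable stands in for the actual upload to the server.

```python
from typing import Callable, List


class Reporter:
    """Buffers behavior data and reports only what was collected since the last report."""

    def __init__(self, send: Callable[[List[dict]], None], period: float) -> None:
        self.send = send            # stand-in for the upload to the server
        self.period = period        # timing condition, in seconds
        self.buffer: List[dict] = []
        self.last_report = 0.0

    def record(self, event: dict, now: float) -> None:
        """Store one behavior record; report if the timing condition is met."""
        self.buffer.append(event)
        if now - self.last_report >= self.period:
            self.flush(now)

    def on_potential_event(self, now: float) -> None:
        """Report on an application switch or a login-user switch."""
        self.flush(now)

    def flush(self, now: float) -> None:
        """Send the buffered records, then start a fresh buffer."""
        if self.buffer:
            self.send(self.buffer)
        self.buffer = []
        self.last_report = now
```

Setting `period` to `float("inf")` disables the timing condition, so that only potential events trigger a report.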
Taking a scenario that a server pushes an advertisement to a terminal user as an example, a Software Development Kit (SDK) for accessing an advertisement background server can be integrated on a client, and the client can complete the collection and report of behavior data of the user by the client by running the SDK. Therefore, all clients that integrate the SDK for accessing the ad backend server can push ads.
And step S103, the server carries out clustering processing according to the behavior data belonging to the terminal and extracts operation characteristics of different dimensions from the behavior data according to different clusters.
In the optional embodiment of the invention, clustering is performed into different groups according to the closeness degree of the connection among the samples of the behavior data, and characteristics are selected from each clustering group as operation characteristics.
Fig. 3-2 is an alternative schematic diagram illustrating clustering performed on samples of behavior data according to an embodiment of the present invention. Taking hierarchical clustering as an example, samples that are close to one another (e.g., by Euclidean distance or absolute distance) are aggregated into different groups according to the distribution of the samples of behavior data in the sample space, so that 1) the distance between different groups is maximized; and 2) the distance between samples within each group is minimized. In this way, the samples within the same group are as homogeneous as possible, while the samples in different groups are as heterogeneous as possible.
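A minimal sketch of hierarchical (agglomerative) clustering over behavior data samples follows. It assumes the samples have already been encoded as numeric vectors, uses the Euclidean distance with single linkage, and stops merging once the closest pair of groups is farther apart than a hypothetical `max_distance`; none of these choices is fixed by the text.

```python
import math
from typing import List, Sequence

Point = Sequence[float]


def single_linkage(a: List[Point], b: List[Point]) -> float:
    """Distance between two groups: the smallest distance between their samples."""
    return min(math.dist(p, q) for p in a for q in b)


def agglomerate(points: List[Point], max_distance: float) -> List[List[Point]]:
    """Merge the two closest groups repeatedly until none are within max_distance."""
    clusters: List[List[Point]] = [[p] for p in points]
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = single_linkage(clusters[i], clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        if d > max_distance:
            break  # remaining groups are sufficiently far apart
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters
```

This brute-force version recomputes all pairwise distances on every merge and is intended only to show the principle; a library implementation would be preferable for large sample sets.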
For each group of the cluster, extracting a feature subset from all features of the grouped samples, then evaluating the feature subset by using an evaluation criterion (the criterion for evaluating the quality of the feature subset is different under different application scenes, for example, the feature subset can be judged according to whether the distribution of the features is uniform or not), comparing the evaluation result with a stop criterion, stopping if the stop criterion is met, otherwise, continuing to generate the next group of feature subsets, and continuing to select the features.
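The generate, evaluate and stop loop described above can be sketched as follows. The sketch enumerates candidate subsets from small to large, which is only one possible generation strategy; the `evaluate` callable and the `stop_score` threshold are hypothetical placeholders for the scenario-specific evaluation criterion and stop criterion.

```python
from itertools import combinations
from typing import Callable, Sequence, Tuple


def select_features(features: Sequence[str],
                    evaluate: Callable[[Tuple[str, ...]], float],
                    stop_score: float) -> Tuple[Tuple[str, ...], float]:
    """Generate feature subsets, evaluate each, and stop once the stop criterion is met.

    If no subset reaches stop_score, the best-scoring subset seen is returned.
    """
    best_subset: Tuple[str, ...] = ()
    best_score = float("-inf")
    for size in range(1, len(features) + 1):
        for subset in combinations(features, size):
            score = evaluate(subset)
            if score > best_score:
                best_subset, best_score = subset, score
            if score >= stop_score:
                return subset, score
    return best_subset, best_score
```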
Taking behavior data as an example obtained by detecting user operation in a browser, clustering according to the behavior data and selecting the following operation characteristics in different groups of the clusters: shortcut keys, mouse gestures, keyboard input, multiple windows, browser setting, web browsing, interest web classification and the like.
And step S104, the server predicts the operation characteristics obtained by the terminal clustering according to the classification model to obtain the classification result of the current user in the terminal.
In an optional embodiment of the present invention, the server needs to train a classification model for user classification in advance, and after the operation features are input into the classification model, the classification result corresponding to the input operation features can be obtained according to a processing logic built in the classification model. The pre-established classification model can be constructed based on sample data acquired before a timing condition when the terminal meets the timing condition; or the method can be constructed based on sample data which is formed when a potential event representing user replacement is generated in the terminal and is collected before the potential event is generated; the method can also be constructed based on sample data formed when the terminal is in a specific information pushing scene. Correspondingly, when the classification model is used for predicting the acquired behavior data of the current user, the classification model corresponding to the type of the behavior data of the current user can be used for predicting.
For example, when the obtained behavior data of the current user was formed while the terminal was in a specific information pushing scene, the classification model trained on sample data formed in that same information pushing scene is used for prediction. Predicting separately for each information pushing scene makes full use of the differences between the operations of different users in the same scene, so that different users can be accurately classified within that scene.
In an optional embodiment of the present invention, when the classification model uses a classification tree model, the server performs classification judgment on the obtained operation features of the corresponding dimensions sequentially according to the priority descending order of the operation features of each dimension included in the classification tree model and the classification conditions corresponding to the operation features of each dimension until the classification conditions of one feature in the priority descending order are met, and determines the classification corresponding to the met features as the classification result of the current user in the terminal.
A server needs to train a decision tree model for user classification in advance; in an embodiment, the flow chart of the training decision tree model, as shown in fig. 4, includes the following steps:
step S1041, obtaining candidate features with different dimensions of each user sample in the user sample set.
Here, the user sample set is a set of behavior data of users. For example, the candidate features of different dimensions include: whether to use a shortcut key, whether to use a mouse gesture, and whether to modify the browser configuration.
Step S1042, calculating corresponding information gains when classifying users in the user sample set by the candidate features of each dimension.
In the following, users in the user sample set are classified by the candidate features of each dimension based on decision tree theory, using the ID3 algorithm.
The implementation process is as follows. First, the behavior habits of a plurality of users are recorded; the users differ in occupation, age, sex, hobbies and the like, so that comprehensive and accurate behavior data is collected. The behavior habits of using the browser include whether to use a shortcut key, whether to use a mouse gesture, whether to modify browser settings, text input speed, whether to open multiple windows, average browsing duration of web pages, interested web page classification, and the like.
Second, decision tree modeling is performed using the ID3 algorithm. The principle of decision tree modeling is introduced first. If a random variable X takes the values {x1, x2, x3, ..., xn} with respective probabilities {p1, p2, p3, ..., pn}, then the information entropy of X is defined as

$$H(X) = -\sum_{i=1}^{n} p_i \log p_i$$

where the base of the logarithm only sets the unit in which information is measured. That is, the more a variable may change, the greater the amount of information it carries. For the classification system, the class C is the variable; its values are C1, C2, C3, ..., Cn, and the probabilities of occurrence of the classes are P(C1), P(C2), P(C3), ..., P(Cn), where n is the total number of categories. The entropy of the classification system can then be expressed as

$$H(C) = -\sum_{i=1}^{n} P(C_i) \log P(C_i)$$
Based on the above definition of the information entropy, the information gain is briefly described below. Information gain is defined per feature: for a feature t, the amount of information of the system is computed both with the feature and without it, and the difference between the two is the amount of information that the feature brings to the decision tree, that is, the information gain.
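In symbols, the description above can be sketched as follows. The notation is introduced here only for illustration: H(C) is the entropy of the classification system, v ranges over the values of feature t, and H(C | t = v) is the entropy of the classes among the samples for which t takes the value v.

$$IG(t) = H(C) - \sum_{v} P(t = v)\, H(C \mid t = v)$$

The subtracted sum is the entropy that remains after the samples are partitioned by the feature, weighted by the share of samples in each partition.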
It should be noted that, here, the ID3 algorithm is only used as an example for model training to construct a decision tree model, and other algorithms for constructing a decision tree model and other training models for identifying an end user are all applicable to the embodiment of the present invention.
In a specific embodiment, the process of calculating the information gain obtained when users in the user sample set are classified by the candidate features of each dimension is shown in fig. 5 and includes the following steps:
step S10421, calculating initial information entropy based on prior classification result in user sample set.
Step S10422, constructing different classification conditions according to the operation characteristics of any dimension, and calculating the reference information entropy of the decision tree model when the sample users in the sample set are classified under the different classification conditions.
Step S10423, calculating the difference between the reference information entropy and the initial information entropy as the information gain of the feature of the corresponding dimension.
For example, the obtained features and the corresponding user classifications are shown in table 1 below:
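Table 1 (an image in the original publication, not reproduced here): features of five sample users and the corresponding user classifications.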
Table 1 contains five pieces of user data in total, and the user classification result includes two classes: three class A users and two non-class A users. The proportion of class A users among all users is 3/5, and that of non-class A users is 2/5. According to the formula H(X), the information entropy of the current system is -(2/5 × log(2/5) + 3/5 × log(3/5)) = 0.1591 + 0.1331 = 0.2922, where the numeric values are evaluated with base-10 logarithms.
Taking whether the shortcut key is used as the criterion for dividing the user classification as an example, three users use the shortcut key. Two of them are class A users, so the proportion of class A users among the users using the shortcut key is 2/3; one is a non-class A user, so the proportion of non-class A users among the users using the shortcut key is 1/3. Applying the above formula again gives the information entropy -(1/3 × log(1/3) + 2/3 × log(2/3)) = 0.2764.
Of the two users who do not use the shortcut key, one is a class A user and one is a non-class A user, so class A users and non-class A users each account for 1/2 of the users who do not use the shortcut key. Applying the above formula again gives the information entropy -(1/2 × log(1/2) + 1/2 × log(1/2)) = 0.301.
For the feature of whether the shortcut key is used, the proportion of users using the shortcut key is 3/5, with a corresponding information entropy of 0.2764; the proportion of users not using the shortcut key is 2/5, with a corresponding information entropy of 0.301. Therefore, the information entropy of the system after classification by this feature is 3/5 × 0.2764 + 2/5 × 0.301 = 0.28624. The information gain of whether the shortcut key is used equals the initial system information entropy minus the system information entropy after classification by the shortcut key feature, i.e., 0.2922 - 0.28624 = 0.00596. This value indicates how much change classifying users by shortcut key usage brings to the information amount of the whole system; the larger the value, the better the feature is for user classification and the higher the accuracy. The information gains of whether mouse gestures are used and whether the browser configuration is modified are calculated in the same way.
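The shortcut key calculation can be checked with a short script. This is a sketch only: the function name `entropy` is hypothetical, the class counts are the ones stated in the example, and base-10 logarithms are assumed because they match the numeric values in the text.

```python
import math

def entropy(counts, base=10):
    """Entropy of a class distribution given per-class sample counts."""
    total = sum(counts)
    return -sum(c / total * math.log(c / total, base) for c in counts if c)

# Counts from the example, as (class A, non-class A).
all_users = (3, 2)
uses_shortcut = (2, 1)
no_shortcut = (1, 1)

initial = entropy(all_users)
# Entropy after splitting, weighted by the share of users in each branch.
conditional = (sum(uses_shortcut) / 5 * entropy(uses_shortcut)
               + sum(no_shortcut) / 5 * entropy(no_shortcut))
gain = initial - conditional
print(round(initial, 4), round(conditional, 4), round(gain, 4))
```

Evaluated without intermediate rounding, the gain comes out at about 0.006; small differences in the last digits relative to a hand calculation come from rounding the intermediate entropies.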
Step S1043, selecting a predetermined number of candidate features with the highest information gain as features corresponding to the corresponding dimensions, and forming a descending order of priority of the operation features of the corresponding dimensions according to the descending order of the information gain.
In one embodiment, based on the above calculation, the server determines that the information gain corresponding to whether the browser configuration is modified > the information gain corresponding to whether the shortcut key is used > the information gain corresponding to whether mouse gestures are used; therefore, the descending order of priority of the operation features of the corresponding dimensions is: whether to modify the browser configuration, whether to use shortcut keys, and whether to use mouse gestures.
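Step S1043 amounts to a sort followed by a cut-off, which can be sketched as below. The gain values are hypothetical and chosen only so that the resulting order matches the one stated above; the feature names and the function name are likewise illustrative.

```python
def priority_order(gains, keep):
    """Return the `keep` features with the highest information gain, highest first."""
    ranked = sorted(gains.items(), key=lambda item: item[1], reverse=True)
    return [name for name, _ in ranked[:keep]]

# Hypothetical information gains per candidate feature.
gains = {
    "uses_shortcut_keys": 0.006,
    "uses_mouse_gestures": 0.002,
    "modifies_browser_config": 0.020,
    "opens_multiple_windows": 0.001,
}
order = priority_order(gains, keep=3)
```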
When the decision tree model is built according to the magnitude of the information gain, the larger the information gain of a feature, the more decisive that feature is for the classification decision. The feature with the largest information gain is determined as the root node, and the remaining features are then arranged downward in turn in descending order of information gain, forming the decision tree shown in fig. 6. The decision tree is a model, formed from sample data, for predicting future data; therefore, the more sample data is collected, the more corresponding features there are and the larger the resulting decision tree is. Correspondingly, the prediction result of a larger decision tree is more accurate.
When classifying users based on the decision tree, firstly, detecting whether the users have the characteristics corresponding to the root nodes of the decision tree, namely whether the users modify the browser configuration, if the users modify the browser configuration, determining that the users are class A users, and ending the process. If the user does not modify the browser configuration, whether the user uses the shortcut key is further detected, if the user uses the shortcut key, the user is determined to be a class A user, and the process is ended. If the user does not use the shortcut key, whether the user uses the mouse gesture is further detected, if the user uses the mouse gesture, the user is determined to be the class A user, and the process is ended. And if the user does not use the mouse gesture, determining that the user is a non-class A user.
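The cascade described above can be sketched as a loop over the features in descending priority. The feature names and class labels below are hypothetical; the logic mirrors the described tree, in which the first feature that is present classifies the user as class A.

```python
PRIORITY = ["modifies_browser_config", "uses_shortcut_keys", "uses_mouse_gestures"]

def classify_user(features, priority=PRIORITY):
    """Check the features in descending priority; the first one present decides."""
    for name in priority:
        if features.get(name, False):
            return "A"
    return "non-A"

print(classify_user({"uses_shortcut_keys": True}))
print(classify_user({"uses_mouse_gestures": False}))
```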
And step S105, the server judges the replacement of the user in the terminal according to the classification result of the current user in the terminal.
In an optional embodiment of the invention, when the server determines that the distance between the classification of the current user in the terminal and the historical classification of the user in the terminal exceeds a distance threshold, it judges that the user in the terminal has been replaced. The distance between classifications refers to the distance in a user classification table arranged according to the similarity of user categories; the larger the distance, the larger the difference between the categories. The historical classification of the user may be the most recent classification of the user. Therefore, in a specific embodiment, the server determines the category of the current user in the terminal, and judges that the terminal user has been replaced when the distance from the most recent category of the terminal user exceeds the distance threshold.
In another optional embodiment of the present invention, when the server determines that the classification of the current user is the same for two consecutive times and that this classification differs from the historical classification of the user in the terminal, it judges that the user in the terminal has been replaced; here, the historical classification of the user in the terminal refers to the most recent user classification.
In the embodiment of the invention, when the user in the terminal is judged to be replaced according to the classification result of the current user in the terminal, a fault-tolerant mechanism is adopted, and the accuracy of classifying the user is improved.
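The two replacement criteria can be sketched as two small predicates. This is an illustration under assumptions: the classification table is modeled as a list ordered by category similarity, the distance is taken as the difference of positions in that list, and the category names are hypothetical.

```python
def replaced_by_distance(current, previous, category_order, threshold):
    """Criterion 1: replaced when the two categories are more than `threshold`
    positions apart in a table ordered by category similarity."""
    distance = abs(category_order.index(current) - category_order.index(previous))
    return distance > threshold

def replaced_by_repetition(recent, previous):
    """Criterion 2: replaced when the last two predictions agree with each other
    but differ from the most recent historical category."""
    if len(recent) < 2:
        return False
    return recent[-1] == recent[-2] and recent[-1] != previous

# Hypothetical classification table, ordered by similarity.
table = ["student", "office_worker", "engineer", "retiree"]
print(replaced_by_distance("retiree", "student", table, threshold=1))
print(replaced_by_repetition(["student", "engineer"], "student"))
```

Requiring two identical consecutive predictions means that a single misclassification does not trigger a replacement, which is one way to read the fault tolerance mentioned above.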
And step S106, according to the replacement of the user in the terminal, the server pushes information adapted to the classification result of the replacement user.
In an optional embodiment of the present invention, when the server determines that the user in the terminal has been replaced, the server queries, according to the classification result of the replacement user, the information in the server whose orientation condition meets the corresponding classification result, and pushes the queried information to the terminal, so that the pushed information is presented on the display interface of the terminal.
Fig. 7 is a schematic diagram of an optional processing flow of the information push processing method according to the embodiment of the present invention; the method of the embodiment of the present invention is similar to the method shown in fig. 3, except that before step S102, the method further includes:
and step S107, when the server obtains the behavior data from the plurality of terminals, identifying the terminal to which the obtained behavior data belongs according to the hardware identifier carried by the obtained behavior data.
For example, when the server obtains behavior data from a plurality of terminals, the terminal to which the obtained behavior data belongs is identified according to a hardware identifier, such as a GUID, carried by the obtained behavior data.
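Attributing reported data to terminals can be sketched as grouping by the carried identifier. The record layout (a `guid` field and a `behavior` field) is an assumption made for illustration.

```python
from collections import defaultdict

def group_by_terminal(reports):
    """Group reported behavior records by the hardware identifier they carry."""
    by_terminal = defaultdict(list)
    for report in reports:
        by_terminal[report["guid"]].append(report["behavior"])
    return dict(by_terminal)

# Hypothetical reports received from two terminals.
reports = [
    {"guid": "guid-1", "behavior": "shortcut_key"},
    {"guid": "guid-2", "behavior": "mouse_gesture"},
    {"guid": "guid-1", "behavior": "multi_window"},
]
grouped = group_by_terminal(reports)
```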
Correspondingly, when step S106 is executed, the server pushes the information adapted to the corresponding classification result of the replacement user to the corresponding terminal.
The third flow diagram of the information push processing method provided by the embodiment of the present invention, as shown in fig. 8, is applied to a terminal, and includes the following steps:
in step S201, the terminal obtains behavior data formed by detecting user operation by itself.
The behavior data formed by the user operation according to the embodiment of the present invention may be behavior data formed by the user's operation of the terminal itself, such as data on whether a shortcut key is used, whether a mouse gesture is used, the keyboard-based character input speed, and the like; it may also be operation data of the user for a client installed on the terminal, such as whether multiple windows are opened, whether browser settings are modified, the average browsing duration of web pages, interest web page classification, and the like.
In the optional embodiment of the invention, the first method is that the terminal records the behavior data acquired by the terminal when detecting that the terminal meets the timing condition; here, the timing condition is a period, such as 10 seconds, 1 minute, etc., for reporting the behavior data to the server by the terminal, which is flexibly set according to actual needs; when the terminal periodically reports the behavior data according to the set timing condition, the behavior data recorded each time only comprises the behavior data collected by the terminal after the last recording operation closest to the current recording operation.
Here, the type of the timing condition may be a globally uniform value to ensure timeliness of the recorded behavior data. The types of the timing conditions can be distinguished according to the priority (importance degree) of the push terminal, for example, the timing of the apple terminal is smaller than that of the android terminal, and the timing of the smart phone terminal is smaller than that of the tablet computer; therefore, the server can be ensured to obtain the behavior data of the key push type terminal immediately.
In an optional embodiment of the invention, the second method is that the terminal records the behavior data when a potential event representing user replacement is generated. Potential events representing user replacement may include: switching of application programs, switching of the operating system login user, and the like; switching of application programs means that the application program run by the terminal changes from one to another. In this case, the recorded behavior data includes the behavior data collected between two potential events representing user replacement. By recording the corresponding behavior data whenever such a potential event is generated, the terminal can obtain in time the behavior data of a user who may have changed.
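The timing condition and the potential-event trigger can be sketched with one small recorder. The class name, the event names, and the use of plain numbers as timestamps are hypothetical; each record contains only the data collected since the previous record, as described above.

```python
class BehaviorRecorder:
    """Buffers behavior data and emits a record on a timer or on a potential event."""

    POTENTIAL_EVENTS = {"app_switch", "os_login_switch"}

    def __init__(self, period_seconds):
        self.period = period_seconds
        self.buffer = []  # data collected since the last record
        self.last_record_time = 0.0

    def collect(self, item):
        self.buffer.append(item)

    def _record(self, now):
        record, self.buffer = self.buffer, []
        self.last_record_time = now
        return record

    def on_tick(self, now):
        """Timing condition: record once a full period has elapsed."""
        if now - self.last_record_time >= self.period:
            return self._record(now)
        return None

    def on_event(self, event, now):
        """Potential user-replacement event: record immediately."""
        if event in self.POTENTIAL_EVENTS:
            return self._record(now)
        return None

recorder = BehaviorRecorder(period_seconds=10)
recorder.collect("shortcut_key")
first = recorder.on_tick(now=10)
recorder.collect("mouse_gesture")
second = recorder.on_event("app_switch", now=12)
```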
In an optional embodiment, the step of the terminal obtaining the behavior data formed by detecting the user operation may further use a third method: the terminal first detects the information pushing scenario in which it is located, determines the features associated with the detected information pushing scenario, and then detects the behavior data conforming to the associated features in the terminal. Here, the terminal may determine the information pushing scenario in which it is located according to the category of the client running on the terminal; accordingly, different information pushing scenarios have different information classification requirements.
For example, when a news client (such as Tencent News) in the terminal is in an operating state, the information pushing scenario is a news pushing scenario. The information classification corresponding to the news pushing scenario may include: sports news, entertainment news, financial news, and military news; correspondingly, the features corresponding to the information classification are sports, entertainment, finance, and military affairs. The terminal therefore detects the user's operations on the news client for topics such as sports, entertainment, finance, and military affairs, and forms behavior data for each of these topics based on the detected operations. The terminal then detects, among the collected behavior data, the behavior data conforming to the associated features; for example, such behavior data may be detected in a storage area predetermined by the terminal for storing behavior data.
For another example, when a music client (such as QQ Music) in the terminal is in an operating state, the information pushing scenario in which the terminal is located is a music pushing scenario. The information classification corresponding to the music pushing scenario may include: classical music and popular music; correspondingly, the features corresponding to the information classification are classical music and popular music. The terminal detects, among the collected behavior data, the behavior data conforming to the associated features; for example, such behavior data may be detected in a storage area predetermined by the terminal for storing behavior data.
Taking an application scenario of pushing advertisements as an example, any client installed on the terminal (such as Taobao, Tencent News, QQ Music, Meituan Takeaway and the like) can, during operation, be classified from the perspective of the products related to the pushed advertisements, such as fast-moving consumer goods, electronic products, clothes, and food; correspondingly, the features corresponding to the classification are fast-moving consumer goods, electronic products, clothes, and food. The terminal detects, among the collected behavior data, the behavior data conforming to the associated features; for example, such behavior data may be detected in a storage area predetermined by the terminal for storing behavior data.
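The scenario-based selection can be sketched as a lookup followed by a filter. The mapping keys, the `topic` field, and the stored records are hypothetical; the feature lists follow the news and music examples above.

```python
# Features associated with each information pushing scenario (from the examples above).
SCENE_FEATURES = {
    "news": {"sports", "entertainment", "finance", "military"},
    "music": {"classical", "popular"},
}

def behavior_for_scene(client_category, stored_behavior):
    """Keep only the stored behavior records whose topic is associated with the scenario."""
    features = SCENE_FEATURES.get(client_category, set())
    return [b for b in stored_behavior if b["topic"] in features]

# Hypothetical contents of the storage area for behavior data.
stored = [
    {"topic": "sports", "action": "read_article"},
    {"topic": "classical", "action": "play_track"},
    {"topic": "finance", "action": "read_article"},
]
news_behavior = behavior_for_scene("news", stored)
```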
And step S202, the terminal carries out clustering processing according to the behavior data of the terminal to obtain the operation characteristics of different dimensions of the operation in the terminal.
In the optional embodiment of the present invention, clustering is performed into different groups according to the closeness of the connection between the samples of the behavior data, and a feature is selected as an operation feature in each group of the clusters.
And step S203, the terminal predicts the characteristics obtained by the terminal clustering according to the classification model to obtain the classification result of the current user in the terminal.
In an optional embodiment of the present invention, the operation performed by the terminal in this step is the same as the operation performed by the server in step S104 in the foregoing embodiment, and only the replacement of the main body is performed, which is not described herein again.
And step S204, the terminal judges the replacement of the user according to the classification result of the current user.
In an optional embodiment of the present invention, the terminal determines that the distance between the current user classification and the historical user classification in the terminal exceeds a distance threshold, and determines that the user is replaced. The classified distance refers to the distance in a user classification table formed by arranging according to the similarity of user categories, and the larger the user classified distance is, the larger the category difference of the user is; the historical classification of the user may be the most recent classification of the user. Therefore, in one embodiment, the terminal determines the current user category, and determines that the terminal user is replaced when the distance from the last category of the terminal user exceeds a distance threshold.
In another optional embodiment of the present invention, the terminal determines that the current user classification is the same for two consecutive times, and determines that the user in the terminal is replaced when the same user classification in two consecutive times is different from the historical classification of the user in the terminal; here, the history category of the user in the terminal refers to the latest user category.
And step S205, the terminal obtains the information adapting to the corresponding classification result according to the replacement of the user.
In the optional embodiment of the invention, when the terminal determines that the user is replaced, the terminal queries the information of which the orientation condition accords with the corresponding classification result according to the classification result of the replacement user, and presents the queried information on the display interface of the terminal.
Here, the specific implementation process of the terminal querying the information whose orientation condition meets the corresponding classification result may be: the terminal retrieves corresponding information locally or sends a request to the background server to acquire the corresponding information.
For example, when the information whose orientation condition meets the corresponding classification result is classical music, the terminal can search the music files it stores for music with classical characteristics; when the information whose orientation condition meets the corresponding classification result is current-affairs news, the terminal sends a request to the background server to obtain the latest current-affairs news.
Fig. 9 is a schematic diagram illustrating an alternative flow chart of an information push processing method according to an embodiment of the present invention, and will be described according to various steps.
Steps S301 to S304 are identical to the operations performed in steps S201 to S204.
In step S305, the terminal sends the classification result of the current user to the server.
And step S306, the server acquires the information adapting to the corresponding classification result according to the replacement of the user.
For example, the server queries the information of which the orientation condition of the server meets the corresponding classification result according to the classification result of the replacement user, and pushes the queried information to the terminal, so that the pushed information is presented on a display interface of the terminal.
The following describes the flow of the information push processing method according to an embodiment of the present invention, taking advertisement push as an example. Fig. 10 is a schematic diagram of an optional network architecture of the information push processing method according to an embodiment of the present invention, which involves a terminal equipped with a client, and a server. The terminal includes a smart phone, a tablet computer, a vehicle-mounted terminal, a fixed terminal (desktop), and the like; in an embodiment of the present invention, the terminal may be any one or more of the terminal 21, the terminal 22, the terminal 23, and the terminal 24 shown in fig. 1-1, and the server includes at least any one of the server 11 to the server 1n. In the embodiment of the invention, the advertiser uploads an advertisement to the server and sets the delivery condition corresponding to the advertisement. The client of the terminal integrates the SDK; by running the SDK, the client collects the behavior data of its user and reports the collected data to the advertisement background server. The advertisement background server performs clustering processing on the behavior data of the user and predicts the classification result of the user. The advertisement background server then pulls the advertisements that match the classification result and sorts them according to a ranking strategy such as bid ranking. Finally, the delivery end of the advertisement background server delivers the advertisements in the sorted queue according to their priority, that is, pushes them to the client for presentation in priority order.
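The pull-and-rank step on the advertisement background server can be sketched as below. The record fields (`delivery_condition`, `bid`) and the sample advertisements are hypothetical, and bid ranking is used because it is the ranking strategy named above.

```python
def rank_ads(ads, classification):
    """Pull the ads whose delivery condition matches the classification result
    and order them by bid, highest first."""
    matching = [ad for ad in ads if classification in ad["delivery_condition"]]
    return sorted(matching, key=lambda ad: ad["bid"], reverse=True)

# Hypothetical advertisements uploaded by advertisers.
ads = [
    {"id": "ad-1", "delivery_condition": {"A"}, "bid": 1.5},
    {"id": "ad-2", "delivery_condition": {"non-A"}, "bid": 9.0},
    {"id": "ad-3", "delivery_condition": {"A", "non-A"}, "bid": 2.0},
]
queue = [ad["id"] for ad in rank_ads(ads, "A")]
```

The resulting queue is then pushed to the client in order, so the highest bid among the matching advertisements is presented first.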
The following describes a flow of an information push processing method according to an embodiment of the present invention, taking news push as an example. The client collects behavior data of a user operating in the terminal and reports the behavior data to the server; and the server predicts the classification result of the current user of the client according to the classification model and the behavior data. The server captures news matched with the classification result of the current user from a plurality of network platforms; the server sorts the captured news according to the strategies of timeliness, click rate and the like; and pushing the news in the sequencing queue to the client side for presentation according to the sequencing priority.
Based on the foregoing description, it is understood that the configuration of the information push processing apparatus 100 of the application 167 that realizes the information push processing function of fig. 2 is as shown in fig. 12, and the functions of the respective units will be described below.
A data acquisition unit 101 configured to acquire behavior data formed by detecting an operation of a user in a terminal;
the characteristic determining unit 102 is configured to perform clustering processing according to the behavior data belonging to the terminal to obtain operation characteristics of different dimensions of operations in the terminal;
the classification unit 103 is used for predicting the characteristics obtained by terminal clustering according to a classification model to obtain a classification result of a current user in the terminal;
a judging unit 104, configured to judge, according to a classification result of a current user in the terminal, a change of the user in the terminal;
and the pushing unit 105 is used for pushing information adapting to the corresponding classification result of the user to be replaced according to the replacement of the user in the terminal.
In a specific embodiment, the data obtaining unit 101 is specifically configured to obtain behavior data acquired before the timing condition is met when the terminal detects that the terminal meets the timing condition, or obtain behavior data acquired before the potential event is generated when the terminal generates a potential event representing user replacement.
In a specific embodiment, the data obtaining unit 101 is specifically configured to detect an information push scenario in which the terminal is located, and determine a feature associated with the corresponding push scenario; behavior data formed by detecting operations conforming to the associated characteristics in the corresponding terminal is obtained.
In a specific embodiment, the classification unit 103 is specifically configured to perform classification judgment on the obtained operation features of the corresponding dimensions sequentially according to the descending order of the priority of the feature of each dimension included in the classification tree model and the classification conditions corresponding to the operation features of each dimension until the obtained operation features meet the classification condition of a feature in the descending order of the priority, and determine the classification corresponding to the met feature as the classification result of the current user in the terminal.
In one embodiment, the information push processing apparatus further includes: a sorting unit 106, configured to obtain candidate features with different dimensions of each user sample in the user sample set; calculating corresponding information gain when classifying users in the user sample set by the candidate characteristics of each dimension;
selecting a preset number of candidate features with the highest information gain as features corresponding to the corresponding dimensionality, and forming a priority descending order of the operation features of the corresponding dimensionality according to the descending order of the information gain.
In a specific embodiment, the sorting unit 106 is specifically configured to calculate an initial information entropy based on a priori classification result in the user sample set; construct different classification conditions according to the operation characteristics of any dimension, and calculate the corresponding reference information entropy when classifying sample users in the sample set according to the different classification conditions; and calculate the difference between the reference information entropy and the initial information entropy as the information gain of the feature of the corresponding dimension.
In one embodiment, the information push processing apparatus further includes an identifying unit 107, configured to identify, when behavior data is obtained from a plurality of terminals, the terminal to which the obtained behavior data belongs according to the hardware identifier carried by that behavior data.
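As a small illustration of attributing behavior data to terminals, the records can be grouped by the hardware identifier they carry. The record layout and the `hardware_id` key are hypothetical; the source does not specify the format of the identifier or of the behavior data.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def group_by_terminal(records: Iterable[dict]) -> Dict[str, List[dict]]:
    """Group behavior records by the hardware identifier each record carries."""
    grouped: Dict[str, List[dict]] = defaultdict(list)
    for record in records:
        # "hardware_id" is an assumed field name for the carried identifier.
        grouped[record["hardware_id"]].append(record)
    return dict(grouped)
```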
In a specific embodiment, the pushing unit 105 is specifically configured to, when the user in the terminal is replaced, query the server, according to the classification result of the replacement user, for information whose orientation condition matches that classification result, and push the queried information to the terminal.
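The query step can be sketched as a filter over a catalog of pushable items. This assumes, purely for illustration, that each item's orientation condition is expressed as the set of user classes it targets; the `PushItem` type and its fields are hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List


@dataclass(frozen=True)
class PushItem:
    title: str
    # Assumed representation of the orientation condition: the set of
    # user classes this item is intended for.
    target_classes: FrozenSet[str]


def query_items(catalog: Iterable[PushItem], user_class: str) -> List[PushItem]:
    """Return the items whose orientation condition matches the user's class."""
    return [item for item in catalog if user_class in item.target_classes]
```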
In an embodiment, the pushing unit 105 is specifically configured to, when the user in the terminal is replaced, query the terminal, according to the classification result of the replacement user, for information whose orientation condition matches that classification result, and present the queried information.
In a specific embodiment, the determining unit 104 is specifically configured to determine that the user in the terminal has been replaced when the distance between the classification of the current user in the terminal and the historical classification of the user in the terminal exceeds a distance threshold; or to determine that the user in the terminal has been replaced when the classification of the current user in the terminal is the same twice in succession and differs from the historical classification of the user in the terminal.
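Both replacement criteria can be sketched in a few lines. The source does not define how the distance between two classifications is measured, so this sketch assumes each class has a centroid in feature space and uses Euclidean distance; the `CENTROIDS` values are hypothetical.

```python
import math
from typing import Dict, Sequence, Tuple

# Hypothetical class centroids in a two-dimensional feature space.
CENTROIDS: Dict[str, Tuple[float, float]] = {
    "class_A": (1.0, 0.0),
    "class_B": (0.0, 1.0),
    "class_C": (0.9, 0.1),
}


def replaced_by_distance(current: str, historical: str, threshold: float) -> bool:
    """Criterion 1: the class distance exceeds the distance threshold."""
    return math.dist(CENTROIDS[current], CENTROIDS[historical]) > threshold


def replaced_by_repetition(recent: Sequence[str], historical: str) -> bool:
    """Criterion 2: the last two results agree and differ from the historical class.

    `recent` holds classification results in chronological order, oldest first.
    """
    return len(recent) >= 2 and recent[-1] == recent[-2] and recent[-1] != historical
```

Requiring two consecutive identical results in the second criterion guards against a single noisy classification being treated as a change of user.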
It should be noted that the division into program modules described for the information push processing apparatus in the above embodiment is only an example. In practical applications, the processing may be assigned to different program modules as needed; that is, the internal structure of the apparatus may be divided into different program modules to complete all or part of the processing described above. In addition, the information push processing apparatus and the information push processing method provided by the above embodiments belong to the same concept; the specific implementation process is detailed in the method embodiments and is not repeated here.
In an exemplary embodiment, the embodiment of the present invention further provides a computer-readable storage medium, for example a memory storing an executable program, where the executable program can be executed by a processor of an information push processing device to perform the foregoing method steps. The computer-readable storage medium may be a memory such as FRAM, ROM, PROM, EPROM, EEPROM, flash memory, magnetic surface memory, an optical disk, or a CD-ROM; or it may be any of various devices that include one or any combination of the above memories, such as mobile phones, computers, tablet devices, personal digital assistants, and servers.
An embodiment of the present invention further provides an information push processing apparatus, including: a processor and a memory for storing an executable program capable of running on the processor,
wherein, when the processor is used for running the executable program, the processor executes:
acquiring behavior data formed by detecting operations of a user in a terminal;
performing clustering on the behavior data belonging to the terminal to obtain operation features of different dimensions of the operations in the terminal;
predicting, with a classification model, on the features obtained by clustering for the terminal to obtain a classification result of the current user in the terminal;
determining whether the user in the terminal has been replaced according to the classification result of the current user in the terminal;
and, according to the replacement of the user in the terminal, implementing information push adapted to the classification result of the replacement user.
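The clustering step in the list above is not tied to a particular algorithm in this passage. As one possible reading, the sketch below clusters the raw values of a single dimension (for example, daily session lengths) with a tiny one-dimensional k-means and takes the mean of the largest cluster as that dimension's operation feature. The choice of k-means, the evenly spaced initial centroids, and the "largest cluster" rule are all assumptions, and the input is assumed to be non-empty.

```python
from typing import List, Sequence


def kmeans_1d(values: Sequence[float], k: int = 2, iterations: int = 20) -> List[List[float]]:
    """Tiny 1-D k-means; initial centroids are evenly spaced between min and max."""
    lo, hi = min(values), max(values)
    centroids = [lo + (hi - lo) * i / max(k - 1, 1) for i in range(k)]
    clusters: List[List[float]] = []
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in values:
            # Assign each value to its nearest centroid.
            nearest = min(range(k), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Recompute centroids; an empty cluster keeps its previous centroid.
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return clusters


def dominant_feature(values: Sequence[float]) -> float:
    """Mean of the largest cluster, used here as the feature for one dimension."""
    largest = max(kmeans_1d(values), key=len)
    return sum(largest) / len(largest)
```

With this rule an isolated outlier, such as one unusually short session, lands in its own small cluster and does not distort the feature value.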
The processor is further configured to, when executing the executable program, perform:
acquiring, when the terminal detects that a timing condition is met, behavior data formed in the terminal and collected before the timing condition was met; or,
acquiring, when a potential event indicating user replacement occurs in the terminal, behavior data formed in the terminal and collected before the potential event occurred.
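The two acquisition triggers can be sketched as a buffer that hands over its contents either when a timer interval has elapsed or when a potential replacement event is signalled. The class name, the fixed-interval reading of "timing condition", and the explicit `now` parameter (used so the behavior can be checked without waiting) are assumptions; what counts as a potential event, for instance a screen unlock, is left to the caller.

```python
import time
from typing import List, Optional


class BehaviorBuffer:
    """Collects behavior records and releases them on a timer or on a potential event."""

    def __init__(self, interval_seconds: float, now: Optional[float] = None) -> None:
        self.interval = interval_seconds
        self.last_flush = time.monotonic() if now is None else now
        self.records: List[dict] = []

    def record(self, event: dict) -> None:
        self.records.append(event)

    def poll(self, now: Optional[float] = None,
             potential_event: bool = False) -> Optional[List[dict]]:
        """Return the data collected so far if either trigger fires, else None."""
        now = time.monotonic() if now is None else now
        if potential_event or now - self.last_flush >= self.interval:
            collected, self.records = self.records, []
            self.last_flush = now
            return collected
        return None
```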
The processor is further configured to, when executing the executable program, perform:
detecting the information push scenario in which the terminal is located, and determining the features associated with that push scenario;
and obtaining behavior data formed by detecting operations in the terminal that conform to the associated features.
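Scenario-dependent collection can be sketched as a lookup from the detected scenario to the operations of interest, followed by a filter. The scenario names, operation names, and record layout below are hypothetical examples.

```python
from typing import Dict, Iterable, List, Set

# Hypothetical mapping from a push scenario to the operations associated with it.
SCENARIO_FEATURES: Dict[str, Set[str]] = {
    "video_app": {"play", "search", "favorite"},
    "shopping_app": {"browse", "add_to_cart", "purchase"},
}


def filter_by_scenario(records: Iterable[dict], scenario: str) -> List[dict]:
    """Keep only the records whose operation is associated with the scenario."""
    wanted = SCENARIO_FEATURES.get(scenario, set())
    return [r for r in records if r["operation"] in wanted]
```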
The processor is further configured to, when executing the executable program, perform:
in descending order of the priority of the features of each dimension included in the classification tree model, performing classification judgment on the obtained operation features of the corresponding dimensions in turn, according to the classification condition corresponding to the operation features of each dimension, until an obtained operation feature satisfies the classification condition of a feature in the descending priority order;
and determining the classification corresponding to the satisfied feature as the classification result of the current user in the terminal.
The processor is further configured to, when executing the executable program, perform:
obtaining candidate characteristics with different dimensionalities of each user sample in a user sample set;
calculating corresponding information gain when classifying users in the user sample set by the candidate characteristics of each dimension;
selecting a preset number of candidate features with the highest information gain as the features of the corresponding dimensions, and arranging the operation features of the corresponding dimensions in descending order of information gain to form the descending priority order.
The processor is further configured to, when executing the executable program, perform:
calculating initial information entropy based on prior classification results in a user sample set;
constructing different classification conditions according to the operation characteristics of any dimension, and calculating corresponding reference information entropy when classifying sample users in a sample set according to the different classification conditions;
and calculating the difference between the reference information entropy and the initial information entropy as the information gain of the feature of the corresponding dimension.
The processor is further configured to, when executing the executable program, perform:
when the behavior data is obtained from a plurality of terminals, identifying the terminal to which the obtained behavior data belongs according to the hardware identifier carried by the obtained behavior data.
The processor is further configured to, when executing the executable program, perform:
when the user in the terminal is replaced, querying the server, according to the classification result of the replacement user, for information whose orientation condition matches that classification result; and pushing the queried information to the terminal.
The processor is further configured to, when executing the executable program, perform:
when the user in the terminal is replaced, querying the terminal, according to the classification result of the replacement user, for information whose orientation condition matches that classification result, and presenting the queried information.
The processor is further configured to, when executing the executable program, perform:
when the distance between the classification of the current user in the terminal and the historical classification of the user in the terminal exceeds a distance threshold, determining that the user in the terminal has been replaced;
or, when the classification of the current user in the terminal is the same twice in succession and differs from the historical classification of the user in the terminal, determining that the user in the terminal has been replaced.
An embodiment of the present invention further provides a storage medium, where an executable program is stored on the storage medium, and when the executable program is executed by a processor, the executable program performs:
acquiring behavior data formed by detecting operations of a user in a terminal;
performing clustering on the behavior data belonging to the terminal to obtain operation features of different dimensions of the operations in the terminal;
predicting, with a classification model, on the features obtained by clustering for the terminal to obtain a classification result of the current user in the terminal;
determining whether the user in the terminal has been replaced according to the classification result of the current user in the terminal;
and, according to the replacement of the user in the terminal, implementing information push adapted to the classification result of the replacement user.
The executable program, when executed by the processor, further performs:
acquiring, when the terminal detects that a timing condition is met, behavior data formed in the terminal and collected before the timing condition was met; or acquiring, when a potential event indicating user replacement occurs in the terminal, behavior data formed in the terminal and collected before the potential event occurred.
The executable program, when executed by the processor, further performs:
detecting the information push scenario in which the terminal is located, and determining the features associated with that push scenario;
and obtaining behavior data formed by detecting operations in the terminal that conform to the associated features.
The executable program, when executed by the processor, further performs:
in descending order of the priority of the features of each dimension included in the classification tree model, performing classification judgment on the obtained operation features of the corresponding dimensions in turn, according to the classification condition corresponding to the operation features of each dimension, until an obtained operation feature satisfies the classification condition of a feature in the descending priority order;
and determining the classification corresponding to the satisfied feature as the classification result of the current user in the terminal.
The executable program, when executed by the processor, further performs:
obtaining candidate characteristics with different dimensionalities of each user sample in a user sample set;
calculating corresponding information gain when classifying users in the user sample set by the candidate characteristics of each dimension;
selecting a preset number of candidate features with the highest information gain as the features of the corresponding dimensions, and arranging the operation features of the corresponding dimensions in descending order of information gain to form the descending priority order.
The executable program, when executed by the processor, further performs:
calculating initial information entropy based on prior classification results in a user sample set;
constructing different classification conditions according to the operation characteristics of any dimension, and calculating corresponding reference information entropy when classifying sample users in a sample set according to the different classification conditions;
and calculating the difference between the reference information entropy and the initial information entropy as the information gain of the feature of the corresponding dimension.
The executable program, when executed by the processor, further performs:
when the behavior data is obtained from a plurality of terminals, identifying the terminal to which the obtained behavior data belongs according to the hardware identifier carried by the obtained behavior data.
The executable program, when executed by the processor, further performs:
when a user in the terminal is replaced,
inquiring information of which the orientation conditions accord with corresponding classification results in the server according to the classification results of the replacement users;
and pushing the inquired information to the terminal.
The executable program, when executed by the processor, further performs:
when a user in the terminal is replaced,
querying the terminal, according to the classification result of the replacement user, for information whose orientation condition matches that classification result, and presenting the queried information.
Based on the above description of the embodiments, fig. 13 shows a schematic diagram of pushing information to a single terminal in the related art. Users of classes A, B, and C, each with different usage characteristics, use the same terminal, and the related art cannot clearly distinguish which class of user is currently using it. The terminal or the server can therefore only present/push, for that terminal, information corresponding to the usage characteristics of all three classes of users. In doing so, mistaken presentation/pushing or excessive presentation/pushing is unavoidable. For example, if a class A user is using the terminal, presenting/pushing information about electronic products is a case of mistaken presentation/pushing; and if, to avoid mistaken presentation/pushing, information for each of the three usage characteristics is presented/pushed, that is a case of excessive presentation/pushing. As a result, the promotion effect of the information and the network resource utilization are reduced, and the operating cost increases.
With the information push processing method of the embodiment of the present invention, replacement of the terminal user can be determined from the user's behavior data, and information push adapted to the classification result of the replacement user is implemented accordingly. As shown in fig. 14, for the same terminal, the category of the current user can be clearly distinguished, so that information adapted to that category can be pushed accurately.
In summary, the embodiments of the present invention have the following technical effects:
1) Operation features are extracted from the behavior data formed by user operations, the user classification is predicted from operation features of different dimensions, and information is pushed adaptively according to that classification. Information can thus be pushed in a targeted manner based on the user classification, which improves push accuracy and network resource utilization.
2) By recording the behavior data of the user from the global dimension, the operation characteristics of the user with more dimensions can be obtained, the effectiveness of the behavior data in a detection period can be guaranteed, the accuracy of determining the category of the user can be improved, and the accuracy of information push is further improved.
3) By recording the user's behavior data from a specific type of client, the classification of the user using that type of client is determined, and information is pushed to that type of client according to the user classification, which improves the directionality of information push.
4) When the terminal meets the timing condition or generates a potential event representing user replacement, the behavior data of the user can be acquired, the timeliness of the behavior data can be ensured, and the timeliness of information pushing is further improved.
The above description is only for the specific embodiments of the present invention, but the scope of the present invention is not limited thereto, and any person skilled in the art can easily conceive of the changes or substitutions within the technical scope of the present invention, and all the changes or substitutions should be covered within the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the appended claims.

Claims (15)

1. An information push processing method, comprising:
acquiring behavior data formed by detecting the operation of a user in a terminal;
performing clustering processing according to the behavior data belonging to the terminal, and selecting operation characteristics of different dimensions of the terminal from clustering results;
predicting the selected features according to a classification model to obtain a classification result of the current user in the terminal;
judging the replacement of the user in the terminal according to the classification result of the current user in the terminal;
and according to the replacement of the user in the terminal, the information push adapting to the corresponding classification result of the replaced user is realized.
2. The method of claim 1, wherein the obtaining behavior data formed by detecting the user's operation in the terminal comprises:
obtaining behavioral data in at least one of the following ways:
acquiring behavior data which is formed and acquired before the timing condition is met when the terminal is detected to meet the timing condition;
acquiring behavior data which is formed when a potential event representing user replacement is generated in the terminal and is acquired before the potential event is generated;
and detecting behavior data formed by operations according with the characteristics of the corresponding information push scene when the terminal is in a specific information push scene.
3. The method of claim 1, wherein the predicting the features clustered by the terminal according to a classification model to obtain a classification result of a current user in the terminal comprises:
when the classification model uses a classification tree model,
in descending order of the priority of the features of each dimension included in the classification tree model, performing classification judgment on the obtained operation features of the corresponding dimensions in turn, according to the classification condition corresponding to the operation features of each dimension, until an obtained operation feature satisfies the classification condition of any feature in the descending priority order,
and determining the classification corresponding to the satisfied feature as the classification result of the current user in the terminal.
4. The method of claim 1, further comprising:
when the classification model is a classification tree model,
obtaining candidate characteristics with different dimensionalities of each user sample in a user sample set;
calculating information gain corresponding to the classification tree model when the candidate features of the dimensions are used for classifying the users in the user sample set;
selecting a preset number of candidate features with the highest information gain as features corresponding to the corresponding dimensionality, and forming a priority descending order of the operation features of the corresponding dimensionality according to the descending order of the information gain.
5. The method of claim 4, wherein the calculating the information gain corresponding to the classification tree model when classifying the users in the user sample set by the candidate features of each of the dimensions comprises:
calculating initial information entropy based on prior classification results in the user sample set;
constructing different classification conditions according to the operation characteristics of any dimension, and calculating corresponding reference information entropy when the different classification conditions are used for classifying the sample users in the sample set;
and calculating the difference value between the reference information entropy and the initial information entropy as the information gain of the corresponding dimension characteristics.
6. The method of claim 1, wherein the determining the change of the user in the terminal according to the classification result of the current user in the terminal comprises:
when the classification of the current user in the terminal exceeds a distance threshold value from the historical classification of the user in the terminal, judging that the user in the terminal is replaced;
or, when the classification of the current user in the terminal is the same twice continuously and is different from the historical classification of the user in the terminal, judging that the user in the terminal is replaced.
7. The method of any of claims 1 to 6, further comprising:
when the behaviour data is obtained from a plurality of terminals,
and identifying different terminals to which the obtained behavior data belongs according to the hardware identifiers carried by the obtained behavior data.
8. The method according to any one of claims 1 to 6, wherein said implementing information push adapted to replace respective classification results of users comprises:
when a user in the terminal is replaced,
inquiring information of which the orientation conditions accord with corresponding classification results according to the classification results of the replacement users;
and pushing the queried information to the terminal in which the user has been replaced.
9. The method according to any one of claims 1 to 6, wherein said implementing information push adapted to replace respective classification results of users comprises:
when a user in the terminal is replaced,
and inquiring and presenting the information of which the orientation conditions accord with the corresponding classification results in the terminal according to the classification results of the replacement users.
10. An information push processing device, comprising:
the data acquisition unit is used for acquiring behavior data formed by detecting the operation of a user in the terminal;
the characteristic determining unit is used for carrying out clustering processing according to the behavior data belonging to the terminal and selecting the operation characteristics of different dimensions operated in the terminal from the clustering result;
the classification unit is used for predicting the selected features according to a classification model to obtain a classification result of the current user in the terminal;
the judging unit is used for judging the replacement of the user in the terminal according to the classification result of the current user in the terminal;
and the pushing unit is used for pushing the information adapting to the corresponding classification result of the user to be replaced according to the replacement of the user in the terminal.
11. The information push processing device according to claim 10,
the classification unit is specifically configured to, when the classification model uses a classification tree model, perform classification judgment on the obtained operation features of the corresponding dimensions in turn, in descending order of the priority of the operation features of each dimension included in the classification tree model and according to the classification condition corresponding to the operation features of each dimension, until an obtained operation feature satisfies the classification condition of any feature in the descending priority order,
and determine the classification corresponding to the satisfied feature as the classification result of the current user in the terminal.
12. The information push processing device according to claim 10, further comprising:
the sorting unit is used for obtaining candidate characteristics with different dimensionalities of each user sample in the user sample set when the classification model is a classification tree model;
calculating information gain corresponding to the classification tree model when the candidate features of the dimensions are used for classifying the users in the user sample set;
selecting a preset number of candidate features with the highest information gain as features corresponding to the corresponding dimensionality, and forming a priority descending order of the operation features of the corresponding dimensionality according to the descending order of the information gain.
13. The information push processing device according to claim 12,
the sorting unit is specifically configured to calculate an initial information entropy based on a priori classification result in the user sample set;
constructing different classification conditions according to the operation characteristics of any dimension, and calculating corresponding reference information entropy when the different classification conditions are used for classifying the sample users in the sample set;
and calculating the difference value between the reference information entropy and the initial information entropy as the information gain of the corresponding dimension characteristics.
14. An information push processing device, comprising:
a memory for storing an executable program;
a processor, configured to implement the information push processing method according to any one of claims 1 to 9 when executing the executable program stored in the memory.
15. A storage medium storing an executable program that, when executed by a processor, implements the information push processing method according to any one of claims 1 to 9.
CN201710647371.2A 2017-08-01 2017-08-01 Information push processing method, information push processing device and storage medium Active CN108304432B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710647371.2A CN108304432B (en) 2017-08-01 2017-08-01 Information push processing method, information push processing device and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710647371.2A CN108304432B (en) 2017-08-01 2017-08-01 Information push processing method, information push processing device and storage medium

Publications (2)

Publication Number Publication Date
CN108304432A CN108304432A (en) 2018-07-20
CN108304432B true CN108304432B (en) 2021-09-07

Family

ID=62872582

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710647371.2A Active CN108304432B (en) 2017-08-01 2017-08-01 Information push processing method, information push processing device and storage medium

Country Status (1)

Country Link
CN (1) CN108304432B (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109101658B (en) * 2018-08-31 2022-05-10 优视科技新加坡有限公司 Information searching method and device, and equipment/terminal/server
CN109509028A (en) * 2018-11-15 2019-03-22 北京奇虎科技有限公司 A kind of advertisement placement method and device, storage medium, computer equipment
CN109561424B (en) * 2018-11-30 2021-08-27 维沃移动通信(深圳)有限公司 Data identifier generation method and mobile terminal
EP3671483B1 (en) * 2018-12-19 2024-01-24 Audi Ag A method and a computer program for receiving, managing and outputting a plurality of user-related data files of different data types on a user-interface of a device and a device for storage and operation of the computer program
CN110248217B (en) * 2019-07-08 2022-04-22 中国联合网络通信集团有限公司 User data synchronization method and device
CN115187344B (en) * 2022-09-13 2022-12-09 南通久拓智能装备有限公司 Big data-based user preference analysis and identification method

Citations (8)

Publication number Priority date Publication date Assignee Title
JPH10247198A (en) * 1997-03-05 1998-09-14 Nippon Telegr & Teleph Corp <Ntt> Taste sorting method and device
CN101339562A (en) * 2008-08-15 2009-01-07 北京航空航天大学 Portal personalized recommendation service system introducing into interest model feedback and update mechanism
CN103516588A (en) * 2012-06-30 2014-01-15 北京神州泰岳软件股份有限公司 Method and system of background processing of client-side
CN104008184A (en) * 2014-06-10 2014-08-27 百度在线网络技术(北京)有限公司 Method and device for pushing information
CN104933075A (en) * 2014-03-20 2015-09-23 百度在线网络技术(北京)有限公司 User attribute predicting platform and method
CN105005593A (en) * 2015-06-30 2015-10-28 北京奇艺世纪科技有限公司 Scenario identification method and apparatus for multi-user shared device
CN105404680A (en) * 2015-11-25 2016-03-16 百度在线网络技术(北京)有限公司 Searching recommendation method and apparatus
CN106131703A (en) * 2016-06-28 2016-11-16 青岛海信传媒网络技术有限公司 A kind of method of video recommendations and terminal

Family Cites Families (2)

Publication number Priority date Publication date Assignee Title
JP2005157535A (en) * 2003-11-21 2005-06-16 Canon Inc Content extraction method, content extraction device, content information display method, and display device
JP2011257918A (en) * 2010-06-08 2011-12-22 Sony Corp Content recommendation device and content recommendation method

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
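Accurate and Novel Recommendations: An Algorithm Based on Popularity Forecasting; Amin Javari et al.; ACM Transactions on Intelligent Systems and Technology; 20151231; vol. 5, no. 4; pp. 56-88 *
Slope One collaborative filtering recommendation algorithm based on changes in user interest (基于用户兴趣变化的 Slope One 协同过滤推荐算法); 黄皓璇 et al.; Industrial Control Computer (《工业控制计算机》); 20170725; pp. 112-115 *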

Also Published As

Publication number Publication date
CN108304432A (en) 2018-07-20

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20221117

Address after: 1402, Floor 14, Block A, Haina Baichuan Headquarters Building, No. 6, Baoxing Road, Haibin Community, Xin'an Street, Bao'an District, Shenzhen, Guangdong 518133

Patentee after: Shenzhen Yayue Technology Co.,Ltd.

Address before: 518000 35th Floor, Tencent Building, No. 1 High-tech Zone, Nanshan District, Shenzhen City, Guangdong Province

Patentee before: TENCENT TECHNOLOGY (SHENZHEN) Co.,Ltd.