CN106529711B - User behavior prediction method and device - Google Patents

User behavior prediction method and device

Info

Publication number
CN106529711B
Authority
CN
China
Prior art keywords
user
cluster
behavior
similarity
clusters
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201610951917.9A
Other languages
Chinese (zh)
Other versions
CN106529711A (en
Inventor
赵影
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Neusoft Corp
Original Assignee
Neusoft Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Neusoft Corp filed Critical Neusoft Corp
Priority to CN201610951917.9A priority Critical patent/CN106529711B/en
Publication of CN106529711A publication Critical patent/CN106529711A/en
Application granted granted Critical
Publication of CN106529711B publication Critical patent/CN106529711B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00Administration; Management
    • G06Q10/04Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/22Matching criteria, e.g. proximity measures

Abstract

The disclosure relates to a user behavior prediction method and a device, wherein the method comprises the following steps: collecting behavior record data of at least two users; clustering the behavior record data of each user respectively to form a plurality of clusters; respectively filtering the plurality of clusters corresponding to each user to obtain a long-term behavior characteristic cluster of each user; and determining the similarity between the users according to the long-term behavior feature cluster of each user so as to predict the user behavior. The method and the device have the advantages that the long-term behavior feature cluster of a single user is utilized to achieve acquisition of similar users so as to predict the user behavior, and the behavior prediction of the single user can be more accurate and fine; short-term behaviors in the user behaviors are filtered out, and the accuracy of prediction can be improved.

Description

User behavior prediction method and device
Technical Field
The present disclosure relates to the field of data processing technologies, and in particular, to a user behavior prediction method and apparatus.
Background
With the acceleration of urbanization, urban traffic systems are also developing rapidly. Improving the efficiency of traffic operation plays an important role in keeping urban traffic flowing smoothly, reducing energy consumption, and facilitating users' travel.
In the related art, big data analysis is used to collect the travel habits of a large number of users, and traffic congestion is predicted in combination with information such as festivals and holidays, so as to guide users to travel off-peak or to plan travel routes in advance to avoid congestion.
However, the prediction of the related art is a macroscopic prediction, which is based on the travel habits of a large number of users, and a more accurate and fine travel behavior prediction cannot be realized for a single user.
On the other hand, the influence of 'sporadic travel data' in a large amount of data is not considered in the prediction of the related technology, so that the prediction result cannot accurately reflect the travel behavior of the user.
Disclosure of Invention
The purpose of the present disclosure is to provide a user behavior prediction method and device, so as to realize fine and accurate prediction of user behavior.
In order to achieve the above object, in a first aspect, the present disclosure provides a user behavior prediction method, including:
collecting behavior record data of at least two users;
clustering the behavior record data of each user respectively to form a plurality of clusters;
respectively filtering the plurality of clusters corresponding to each user to obtain a long-term behavior characteristic cluster of each user;
and determining the similarity between the users according to the long-term behavior feature cluster of each user so as to predict the user behavior.
Optionally, the step of clustering the behavior records of each user respectively to form a plurality of clusters includes:
setting a sliding window with a preset length;
and clustering the behavior record data positioned in the sliding window to form a plurality of clusters.
Optionally, the clustering the behavior record data located in the sliding window to form a plurality of clusters includes:
respectively taking one or more pieces of behavior recording data in the sliding window as single clusters to form a cluster set;
respectively obtaining the similarity between the other behavior record data in the sliding window and each cluster in the cluster set;
for each cluster in the cluster set, behavior record data with the maximum similarity to each cluster is respectively acquired;
for the maximum similarity corresponding to each cluster, if the maximum similarity is greater than a preset threshold, attributing behavior record data corresponding to the maximum similarity to the cluster, and recalculating the centroid of the cluster; and if the maximum similarity is smaller than a preset threshold value, adding the behavior record data corresponding to the maximum similarity into the cluster set as a new cluster.
Optionally, the step of filtering the multiple clusters corresponding to each user respectively to obtain the long-term behavior feature cluster of each user includes:
counting the quantity of behavior record data in each cluster;
deleting the clusters with the quantity of the behavior record data in the clusters smaller than a preset threshold value to obtain the long-term behavior characteristic cluster of each user; or
filtering out, according to a preset behavior dispersion, the clusters whose dispersion is smaller than the preset behavior dispersion, to obtain the long-term behavior feature cluster of each user.
Optionally, the step of filtering the multiple clusters corresponding to each user respectively to obtain the long-term behavior feature cluster of each user includes:
counting the quantity of behavior record data in each cluster;
deleting the clusters with the quantity of the behavior recording data in the clusters smaller than a preset threshold value to obtain clusters to be processed;
and filtering clusters with the dispersion smaller than the preset behavior dispersion in the clusters to be processed according to the preset behavior dispersion so as to obtain the long-term behavior feature cluster of each user.
Optionally, the step of determining similarity between users according to the long-term behavior feature cluster of each user to predict user behavior includes:
obtaining cluster similarity of long-term behavior feature clusters of a user to be predicted and one or more users;
according to the cluster similarity, obtaining the similarity between the user to be predicted and the one or more users;
according to the similarity among the users, determining a target user similar to the user to be predicted in the one or more users;
and predicting the behavior of the user to be predicted according to the determined target user.
Optionally, the step of determining similarity between users according to the long-term behavior feature cluster of each user to predict user behavior includes:
acquiring cluster similarity of the long-term behavior feature clusters of the multiple users according to the long-term behavior feature clusters of the multiple users;
acquiring the similarity of a plurality of users according to the cluster similarity;
classifying users with the user similarity exceeding a preset similarity threshold into a similar user set;
and predicting the user behavior according to the similar user set.
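The grouping of similar users can be sketched as follows. This is only an illustration under stated assumptions: the pairwise user similarities are taken as already computed, the "similar user set" is interpreted as a per-user set of neighbors, and the name `similar_user_sets` is hypothetical rather than taken from the disclosure.

```python
def similar_user_sets(similarity, threshold):
    """Map each user to the set of other users whose similarity exceeds the threshold.

    `similarity` maps an unordered user pair (a, b) to a score; how that score
    is derived from cluster similarity is outside this sketch.
    """
    result = {}
    for (a, b), score in similarity.items():
        result.setdefault(a, set())
        result.setdefault(b, set())
        if score > threshold:
            result[a].add(b)
            result[b].add(a)
    return result


pairwise = {("u1", "u2"): 0.9, ("u1", "u3"): 0.4, ("u2", "u3"): 0.85}
sets_by_user = similar_user_sets(pairwise, threshold=0.8)
```

A prediction for one user could then draw on the known behavior of the users in that user's set.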
Optionally, the method further comprises:
and recommending information to the user according to the result of the user behavior prediction.
In a second aspect, the present disclosure provides a user behavior prediction apparatus, including:
the acquisition module is used for acquiring behavior record data of at least two users;
the clustering module is used for clustering the behavior record data of each user respectively to form a plurality of clusters;
the filtering module is used for respectively filtering the plurality of clusters corresponding to each user to obtain the long-term behavior characteristic cluster of each user;
and the prediction module is used for determining the similarity between the users according to the long-term behavior feature cluster of each user so as to predict the user behavior.
Optionally, the clustering module comprises:
a cluster set forming submodule, configured to take one or more pieces of behavior recording data in the sliding window as a single cluster, respectively, and form a cluster set;
a similarity obtaining submodule, configured to obtain similarity between the remaining behavior record data in the sliding window and each cluster in the cluster set;
the maximum similarity obtaining sub-module is used for respectively obtaining behavior record data with the maximum similarity with each cluster in the cluster set;
the cluster updating submodule is used for attributing the behavior record data corresponding to the maximum similarity to each cluster and recalculating the centroid of each cluster if the maximum similarity is greater than a preset threshold; and if the maximum similarity is smaller than a preset threshold value, adding the behavior record data corresponding to the maximum similarity into the cluster set as a new cluster.
Optionally, the filtering module comprises:
the statistic submodule is used for counting the quantity of the behavior record data in each cluster;
the deleting submodule is used for deleting the clusters of which the quantity of the behavior record data is less than a preset threshold value in the clusters to obtain the clusters to be processed;
and the dispersion filtering submodule is used for filtering the clusters with the dispersion smaller than the preset behavior dispersion in the cluster to be processed according to the preset behavior dispersion so as to obtain the long-term behavior characteristic cluster of each user.
Optionally, the prediction module comprises:
the first cluster similarity obtaining submodule is used for obtaining cluster similarity of long-term behavior feature clusters of a user to be predicted and one or more users;
the first user similarity obtaining submodule is used for obtaining the similarity between the user to be predicted and the one or more users according to the cluster similarity;
the target user obtaining sub-module is used for determining a target user similar to the user to be predicted in the one or more users according to the similarity among the users;
and the first behavior prediction sub-module is used for predicting the behavior of the user to be predicted according to the determined target user.
Optionally, the prediction module comprises:
the second cluster similarity obtaining sub-module is used for obtaining cluster similarity of the long-term behavior feature clusters of the multiple users according to the long-term behavior feature clusters of the multiple users;
the second user similarity obtaining submodule is used for obtaining the similarity of a plurality of users according to the cluster similarity;
the similar user set acquisition submodule is used for attributing the users with the user similarity exceeding a preset similarity threshold to the similar user set;
and the second behavior prediction submodule is used for predicting the user behavior according to the similar user set.
Optionally, the apparatus further comprises:
and the information recommendation module is used for recommending information to the user according to the result of the user behavior prediction.
In a third aspect, the present disclosure provides a user behavior prediction apparatus, including:
a processor;
a memory for storing processor-executable instructions;
wherein the processor is configured to: collecting behavior record data of at least two users; clustering the behavior record data of each user respectively to form a plurality of clusters; respectively filtering the plurality of clusters corresponding to each user to obtain a long-term behavior characteristic cluster of each user; and determining the similarity between the users according to the long-term behavior feature cluster of each user so as to predict the user behavior.
With the above technical solution, similar users are obtained by using the long-term behavior feature cluster of a single user, so that user behavior can be predicted; for example, whether the user will travel at a specific time, where the user will travel to, and the like can be predicted, making the behavior prediction for a single user more accurate and fine-grained. Filtering out short-term behaviors from the user behaviors improves the accuracy of prediction. Predicting the travel behaviors of users provides guidance for the operation of rail transit. In addition, recommending information according to the predicted user behaviors enables targeted information recommendation and improves user experience and commercial value.
Additional features and advantages of the disclosure will be set forth in the detailed description which follows.
Drawings
The accompanying drawings, which are included to provide a further understanding of the disclosure and are incorporated in and constitute a part of this specification, illustrate embodiments of the disclosure and together with the description serve to explain the disclosure without limiting the disclosure. In the drawings:
FIG. 1 is a schematic flow chart diagram of a user behavior prediction method according to an exemplary embodiment of the present disclosure;
FIG. 2 is a schematic diagram of a data collection platform according to an embodiment of the present disclosure;
FIG. 3 is a schematic flow chart illustrating clustering of behavior record data located in a sliding window to form a plurality of clusters according to an embodiment of the present disclosure;
FIG. 4 is a schematic flow chart illustrating filtering clusters in an embodiment of the present disclosure;
FIG. 5 is a schematic flow chart illustrating user behavior prediction according to an embodiment of the present disclosure;
FIG. 6 is a schematic flow chart illustrating user behavior prediction according to another embodiment of the present disclosure;
FIG. 7 is a schematic structural diagram of a user behavior prediction apparatus according to an embodiment of the present disclosure;
FIG. 8 is a schematic structural diagram of a user behavior prediction apparatus according to another embodiment of the present disclosure;
FIG. 9 is a block diagram illustrating an apparatus for a user behavior prediction method according to an example embodiment of the present disclosure.
Detailed Description
The following detailed description of specific embodiments of the present disclosure is provided in connection with the accompanying drawings. It should be understood that the detailed description and specific examples, while indicating the present disclosure, are given by way of illustration and explanation only, not limitation.
Fig. 1 is a flowchart illustrating a user behavior prediction method according to an exemplary embodiment of the present disclosure. The user behavior prediction method comprises the following steps:
Step 101, collecting behavior record data of at least two users.
The embodiments of the present disclosure are described using the example in which a user's behavior record data includes travel behavior data. The travel behavior record data may include data acquired from a public transportation system, a subway system, a railway system, an aviation system, a road transportation system, a third-party platform (e.g., various travel software systems, a weather forecast system, a news system, etc.), and the like.
For the subway system, if a user swipes a bus card when entering and leaving stations, travel data such as the departure place, departure time, destination, and arrival time can be obtained from the card-swiping transaction data. For example, in one embodiment, each bus card has its own identification number. Each time a cardholder swipes the card to enter or leave a station, transaction data corresponding to the identification number is generated, so that the travel data of the cardholder corresponding to each identification number can be acquired.
For a public transportation system in which the card is swiped only when boarding, the departure time can be obtained from the transaction data generated when the card is swiped on boarding. In one embodiment, the departure station (origin) may be determined from the departure time in combination with the GPS data of the bus. The destination and arrival time may be determined in combination with other information, such as transfer card-swiping information for the same bus card. If no transfer card-swiping information is available, the destination and arrival time may be set to default values, or the terminal station of the bus may be taken as the destination and the time at which the bus arrives at the terminal station as the arrival time.
In an embodiment of the disclosure, in order to reflect the travel behavior of the user more comprehensively, the travel information of the same user from the railway system, the aviation system, the public transportation system, the road transportation system, and the third-party platform is integrated with the travel information from the subway system, so as to obtain the travel behavior record data of the user.
For a railway system and an aviation system, travel data such as the departure place, departure time, destination, and arrival time can be obtained from the ticket information of purchased tickets. For a travel software system in a third-party platform (for example, an online car-hailing platform, an internet bus travel platform, and the like), travel data such as the departure place, departure time, destination, and arrival time can be obtained from the user's travel selection (selection of the departure place and the destination).
It should be understood that for a railway system, an aviation system, a road transportation system, and a third-party platform that adopt a real-name system, the travel behavior record data of the same user can be acquired from the different systems and platforms according to the identity information (for example, identity card information) of the user. For the subway system and the public transportation system, the identity information of the user can be bound to the identification number of the bus card when the bus card is purchased, so that the travel data of the same user across all systems and platforms can be integrated according to the identity information of the user.
In an embodiment of the present disclosure, the behavior record data of the user trip at least includes one of the following: travel date, transportation means used for travel, departure place, departure time, destination, arrival time, weather information (weather condition of departure place, weather condition of destination), event information (e.g., major holiday information, major meeting), and the like.
As described above, the travel date, the transportation means used for travel, the departure place, the departure time, the destination, and the arrival time can be obtained according to card-swiping transaction data, ticket purchasing data, or travel selection of the user; weather information may be obtained from a weather forecasting system; the event information may be obtained from a news system, calendar, etc.
In an embodiment of the present disclosure, a data collection platform may be established to collect behavior data of a user. Referring to fig. 2, the data collection platform 200 is communicatively connected to a subway system 201, a public transportation system 202, a railway system 203, an aviation system 204, a road transportation system 205, and a third party platform 206, respectively. The data collection platform 200 may acquire the trip data of the user from each system, and may perform operations such as format conversion and information extraction on the trip data from different systems to obtain the trip behavior record data of each user.
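The integration performed by the data collection platform can be sketched as follows. This is a minimal illustration, assuming each source system already yields records tagged with a user identity; the `TripRecord` fields and the `merge_by_user` helper are hypothetical names, not part of the disclosure.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class TripRecord:
    """One travel behavior record (field names are illustrative assumptions)."""
    user_id: str          # identity used to link records across systems
    source: str           # e.g. "subway", "bus", "railway"
    origin: str
    destination: str
    departure: datetime
    arrival: datetime


def merge_by_user(*sources):
    """Group records from several source systems by user, ordered by departure time."""
    merged = defaultdict(list)
    for records in sources:
        for rec in records:
            merged[rec.user_id].append(rec)
    for recs in merged.values():
        recs.sort(key=lambda r: r.departure)
    return dict(merged)


subway = [TripRecord("u1", "subway", "A", "B",
                     datetime(2016, 9, 26, 8, 0), datetime(2016, 9, 26, 8, 30))]
railway = [TripRecord("u1", "railway", "B", "C",
                      datetime(2016, 9, 25, 9, 0), datetime(2016, 9, 25, 12, 0))]
by_user = merge_by_user(subway, railway)
```

Sorting by departure time at this stage means the per-user records are already in the time order that the later sliding-window step relies on.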
Step 102, clustering the behavior record data of each user respectively to form a plurality of clusters.
Clustering is to divide a data set into different clusters according to a specific criterion (such as a distance criterion), so that the similarity of data objects in the same cluster is as large as possible, and the difference of data objects not in the same cluster is also as large as possible.
In the embodiment of the disclosure, the behavior record data of the user is obtained by tracking the travel record of the user, and in order to accurately reflect the travel behavior of the user, a large amount of data needs to be collected. In order to facilitate analysis of a large amount of data, a sliding window with a preset length is set, and behavior records of a user are sorted according to time. As the user behavior record increases, the sliding window slides (e.g., to the right) to include the newest behavior record in the sliding window, while removing the old behavior record from the sliding window. And clustering the behavior records in the sliding window, and forming a plurality of clusters after clustering, wherein the behaviors in each cluster are similar.
In an embodiment of the present disclosure, the sliding window with the preset length is defined according to a time length, for example, the length of the sliding window may be set to be half a year, and the like.
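A time-based sliding window of this kind might be kept as in the sketch below. It assumes records arrive in time order; the `SlidingWindow` class and its `add` method are hypothetical names, and 182 days is used only as a stand-in for "half a year".

```python
from collections import deque
from datetime import datetime, timedelta


class SlidingWindow:
    """Keeps only the records whose timestamp falls within the last `length`."""

    def __init__(self, length):
        self.length = length
        self.records = deque()  # (timestamp, record) pairs, oldest first

    def add(self, timestamp, record):
        """Append a new record and return the old records that slid out."""
        self.records.append((timestamp, record))
        evicted = []
        while self.records and timestamp - self.records[0][0] > self.length:
            evicted.append(self.records.popleft()[1])
        return evicted


window = SlidingWindow(timedelta(days=182))  # roughly half a year (assumed value)
window.add(datetime(2016, 1, 1), "trip-1")
window.add(datetime(2016, 5, 1), "trip-2")
removed = window.add(datetime(2016, 9, 1), "trip-3")
```

Returning the evicted records lets a caller update the clusters those records belonged to.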
Referring to fig. 3, a schematic flow chart of clustering behavior record data located in a sliding window to form a plurality of clusters according to an embodiment of the present disclosure is shown.
In an embodiment of the present disclosure, first, the behavior record data in the sliding window is converted into a behavior vector matrix. If the sliding window contains m behavior records r_1, r_2, …, r_m of user A, and the dimension of each behavior record is n (for example, if the behavior record data includes travel date, transportation means used for travel, departure place, departure time, destination, arrival time, and weather information, the dimension n is 7), then the behavior vector matrix corresponding to user A is m × n.
Step 301, regarding one or more behavior recording data in the sliding window as a single cluster respectively, and forming a cluster set C.
In the embodiment of the present disclosure, initially, any behavior record in the sliding window may be taken as the first single cluster C_1 in the cluster set C, i.e., C = {C_1}. The cluster set C is gradually updated in the subsequent steps.
Step 302, respectively obtaining the similarity between the remaining behavior recording data in the sliding window and each cluster in the cluster set.
In one embodiment, a vector space model is used to calculate the similarity, i.e. the similarity is shown in formula (1).
[Formula (1): equation image not reproduced in this text]
where n is the dimension of the behavior record data and rc_i is the centroid of cluster C_i. The centroid rc_i of cluster C_i can be obtained by formula (2).
$$rc_i = \frac{1}{|C_i|} \sum_{p_j \in C_i} p_j \qquad (2)$$
where |C_i| is the number of data objects in cluster C_i, and p_j is a data object in cluster C_i.
In one embodiment of the present disclosure, the first behavior record r_1 is taken as the first cluster C_1 in the cluster set C, and the similarities between the remaining behavior records (r_2, …, r_m) in the sliding window and cluster C_1 are then calculated in chronological order, from the oldest to the most recent. That is, in formula (1), i takes 1 and j takes 2 to m, yielding the similarities between the behavior records r_2, …, r_m and cluster C_1.
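The similarity and centroid computations can be illustrated as follows. Because the image of formula (1) is not available here, this sketch assumes the cosine measure commonly used with vector space models, and it assumes the centroid is the component-wise mean of the cluster members; the function names are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length numeric vectors (assumed form of formula (1))."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_u * norm_v)


def centroid(members):
    """Component-wise mean of the vectors in a cluster (assumed form of formula (2))."""
    count = len(members)
    return [sum(column) / count for column in zip(*members)]


cluster_1 = [[1.0, 0.0], [3.0, 0.0]]
rc_1 = centroid(cluster_1)
same_direction = cosine_similarity([5.0, 0.0], rc_1)
orthogonal = cosine_similarity([0.0, 4.0], rc_1)
```

Categorical fields such as departure place would need a numeric encoding before vectors like these can be formed; that encoding is not specified in this passage.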
Step 303, for each cluster in the cluster set, acquiring behavior record data having the maximum similarity with each cluster respectively.
Step 304, for the maximum similarity corresponding to each cluster, if the maximum similarity is greater than a preset threshold, attributing the behavior record data corresponding to the maximum similarity to the cluster, and recalculating the centroid of the cluster; and if the maximum similarity is smaller than the preset threshold, adding the behavior record data corresponding to the maximum similarity to the cluster set as a new cluster.
Steps 302, 303 and 304 are executed repeatedly until all behavior record data in the sliding window have been clustered and assigned to corresponding clusters.
For example, for cluster C_1 in the cluster set, which contains the first behavior record, the similarities between the behavior records r_2, …, r_m and cluster C_1 are obtained in turn, and the behavior record with the maximum similarity S_max to cluster C_1 is identified, say r_2. If the maximum similarity S_max is greater than the preset threshold, the corresponding behavior record r_2 is assigned to cluster C_1, and the centroid rc_1 of cluster C_1 is then updated according to formula (2). When the loop returns to step 302, the similarities between the behavior records r_3, …, r_m in the sliding window and cluster C_1 with its updated centroid are obtained, and the behavior record with the maximum similarity is clustered according to the preset threshold.
If the maximum similarity S_max is smaller than the preset threshold, the corresponding behavior record r_2 is added to the cluster set as a new cluster C_2, so that C = {C_1, C_2}. When the loop returns to step 302, the similarities between the behavior records r_3, …, r_m in the sliding window and each of clusters C_1 and C_2 are calculated, the behavior records with the maximum similarity to C_1 and to C_2 are obtained respectively, and those records are clustered according to the preset threshold.
In the embodiments of the present disclosure, when one or more new behavior records are added over time, the newly added records may be clustered according to steps 302 to 304, so that each is assigned to an existing cluster in the cluster set or forms a new cluster. Since the length of the sliding window is fixed, when a behavior record is added, the oldest behavior record is removed from the sliding window, and the centroid of the cluster to which the removed record belonged is recalculated.
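The loop over steps 301 to 304 can be sketched as below. This is a simplified reading rather than the exact procedure: at each pass it picks the single record and cluster pair with the highest similarity, it assumes cosine similarity and mean centroids, and it treats a similarity equal to the threshold as "not greater". The name `cluster_window` is hypothetical.

```python
import math


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norms = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norms if norms else 0.0


def mean(vectors):
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def cluster_window(records, threshold):
    """Greedy clustering of the records in one window; returns lists of member vectors."""
    if not records:
        return []
    clusters = [[records[0]]]          # step 301: first record seeds cluster C_1
    centroids = [list(records[0])]
    pending = list(records[1:])
    while pending:
        # steps 302-303: find the pending record / cluster pair with the highest similarity
        best_sim, best_rec, best_cluster = max(
            ((cosine(rec, c), r_idx, c_idx)
             for r_idx, rec in enumerate(pending)
             for c_idx, c in enumerate(centroids)),
            key=lambda t: t[0],
        )
        rec = pending.pop(best_rec)
        if best_sim > threshold:       # step 304: join the cluster and update its centroid
            clusters[best_cluster].append(rec)
            centroids[best_cluster] = mean(clusters[best_cluster])
        else:                          # otherwise the record starts a new cluster
            clusters.append([rec])
            centroids.append(list(rec))
    return clusters


records = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
clusters = cluster_window(records, threshold=0.9)
```

Each pass assigns exactly one record, so the loop ends after m − 1 passes for a window of m records.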
Step 103, respectively filtering the plurality of clusters corresponding to each user to obtain a long-term behavior feature cluster of each user.
The records within a cluster may be concentrated or dispersed in time. Clusters reflecting short-term behaviors tend to be concentrated, because short-term behaviors often occur frequently within a specific period, while clusters reflecting a user's long-term behaviors tend to be dispersed, because such behaviors are habitual and continue to appear steadily in the behavior records. In the embodiment of the disclosure, identifying the short-term behaviors in the user behavior records reduces the noise in user behavior prediction and improves the prediction accuracy.
In one embodiment of the present disclosure, the formed clusters are filtered in one or both of the following ways, so as to filter out short-term behaviors and improve the accuracy of prediction.
The first way: filtering by defining a filtering factor.
Some user behaviors are occasional and random and do not reflect the user's behavior characteristics; among the clusters formed from the behavior record data in the sliding window, the records of such behaviors usually end up in small clusters. Therefore, by defining a filtering factor f, the clusters reflecting the user's long-term behavior characteristics and recent short-term behavior characteristics can be identified among the clusters formed.
Accordingly, the number of behavior records in each cluster is counted, and clusters whose number of behavior records is smaller than a preset threshold f × m (where m is the total number of behavior records in the sliding window) are deleted as noise clusters. Clusters whose number of behavior records is larger than f × m are regarded as valid clusters reflecting the behavior characteristics of the user. In other words, a cluster is considered to reflect the behavior characteristics of the user only when the proportion of its behavior records to the total number of behavior records in the sliding window reaches a certain value, and the long-term behavior feature cluster of the user is obtained in this way.
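The size-based filter can be sketched as follows. The value of f and the helper name `filter_small_clusters` are assumptions for illustration; a cluster whose size exactly equals f × m is kept here, since the text only calls for deleting clusters that are smaller.

```python
def filter_small_clusters(clusters, f):
    """Delete clusters with fewer than f * m records, where m is the window total."""
    m = sum(len(c) for c in clusters)
    return [c for c in clusters if len(c) >= f * m]


clusters = [["r1", "r2", "r3", "r4", "r5", "r6"], ["r7", "r8", "r9"], ["r10"]]
kept = filter_small_clusters(clusters, f=0.2)  # m = 10, so the threshold is 2 records
```

With these inputs the single-record cluster is treated as noise and the other two clusters remain.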
The second way: filtering out, according to a preset behavior dispersion, the clusters whose dispersion is smaller than the preset behavior dispersion, to obtain the long-term behavior feature cluster of each user.
For clusters reflecting the long-term behavior characteristics of the user, the more widely the records in a cluster are distributed, the longer the user has exhibited similar behavior. Therefore, the dispersion w of each cluster is obtained according to formula (3), so as to filter the clusters and obtain the long-term behavior feature cluster of the user.
[Formula (3): equation image not reproduced in this text]
where n is the dimension of the behavior record data in the cluster; t is the value obtained by converting a behavior occurrence time (e.g., a departure time) in the cluster to minutes, for example, if the behavior occurrence time is 2016-09-26 15:05:26, then t is the number of minutes elapsed since 0:00, i.e., t = 15 × 60 + 5 = 905; and d is the span of days covered by the behavior record data in the cluster. The dispersion w is inversely proportional to the time fluctuation and directly proportional to the number of days.
When the dispersion of a cluster is smaller than the preset behavior dispersion, the behavior is either an accidental behavior of short duration or a behavior with large, irregular time fluctuation; it cannot represent the general behavior of the user, so the cluster is filtered out. In this way, the formed clusters can be filtered according to their dispersion to obtain the long-term behavior feature cluster of the user.
In an embodiment of the present disclosure, the clusters formed in step 102 may be filtered using either of the two ways described above, so as to obtain the long-term behavior feature cluster of the user.
Referring to fig. 4, in another embodiment of the present disclosure, the first and second ways may be combined to perform filtering of clusters, further improving the accuracy of prediction.
Step 401, counting the number of behavior record data in each cluster;
Step 402, deleting the clusters in which the number of behavior record data is smaller than a preset threshold to obtain the clusters to be processed;
Step 403, filtering out, from the clusters to be processed, the clusters whose dispersion is smaller than the preset behavior dispersion, so as to obtain the long-term behavior feature cluster of each user.
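Steps 401 to 403 can be sketched as a two-stage filter. This is an illustrative sketch only: the dictionary layout, the field names `records` and `w`, and the sample values are assumptions, and each cluster's dispersion `w` is taken as already computed rather than derived here.

```python
def filter_long_term_clusters(clusters, f, min_dispersion):
    """Apply the count filter, then the dispersion filter.

    clusters:       list of dicts with "records" (list) and "w" (precomputed dispersion).
    f:              preset ratio for the count threshold f * m.
    min_dispersion: preset behavior dispersion.
    """
    # Step 401: count records; m is the total over the window
    # (assumes every record in the window belongs to some cluster).
    m = sum(len(c["records"]) for c in clusters)
    # Step 402: drop clusters with fewer than f * m records.
    pending = [c for c in clusters if len(c["records"]) >= f * m]
    # Step 403: drop clusters whose dispersion is below the preset value.
    return [c for c in pending if c["w"] >= min_dispersion]

# Hypothetical clusters for one user.
clusters = [
    {"id": "commute", "records": list(range(40)), "w": 0.9},
    {"id": "one_off", "records": list(range(3)), "w": 0.8},
    {"id": "erratic", "records": list(range(30)), "w": 0.1},
]
kept = filter_long_term_clusters(clusters, f=0.1, min_dispersion=0.5)
kept_ids = [c["id"] for c in kept]
```

Running the count filter first means the dispersion only has to be evaluated for clusters that already hold enough records.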
Step 104, determining the similarity between users according to the long-term behavior feature cluster of each user, so as to predict user behavior.
In an embodiment of the present disclosure, users with similar behaviors can be found using a collaborative approach, so that the behavior of a target user can be predicted from the known behavior of other users.
Referring to fig. 5, a schematic flow chart of the user behavior prediction according to an embodiment of the present disclosure includes the following steps:
Step 501, obtaining the cluster similarity between the long-term behavior feature clusters of a user to be predicted and those of one or more users.
In one embodiment, the similarity between each cluster in the long-term behavior feature clusters of user U_x and each cluster in the long-term behavior feature clusters of the user to be predicted U_y is calculated. The similarity between a target cluster in the long-term behavior feature clusters of user U_x and a target cluster in the long-term behavior feature clusters of the user to be predicted U_y can be obtained by equation (4).
[Formula (4): equation image not reproduced]
In the equation, C_xi and C_yi are respectively the centroid of a target cluster in the long-term behavior feature clusters of user U_x and the centroid of a target cluster in the long-term behavior feature clusters of the user to be predicted U_y. From equation (2), the centroid is a vector, and ‖·‖ in equation (4) denotes the length of the vector.
In one embodiment, the maximum similarity value is taken as the cluster similarity between the user to be predicted U_y and user U_x.
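The step above can be sketched as follows. Equation (4) is available here only as an image reference, so this sketch assumes cosine similarity between centroids. That choice is consistent with the description of centroids as vectors and of ‖·‖ as vector length, but it is an assumption rather than the confirmed formula. The function names are hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Assumed stand-in for equation (4): dot product over the product of lengths."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def max_cluster_similarity(centroids_x, centroids_y):
    """Compare every centroid of U_x with every centroid of U_y and keep the maximum."""
    return max(
        cosine_similarity(cx, cy) for cx in centroids_x for cy in centroids_y
    )

# Hypothetical two-dimensional centroids for two users.
best = max_cluster_similarity([[1.0, 0.0], [0.0, 2.0]], [[0.0, 1.0]])
```

Here the second centroid of U_x points in the same direction as the centroid of U_y, so the maximum similarity is 1.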
Step 502, obtaining the similarity between the user to be predicted and the one or more users according to the cluster similarity.
In one embodiment, the similarity between user U_x and the user to be predicted U_y is obtained according to equation (5).
[Formula (5): equation image not reproduced]
Where n is the dimension of the behavior record data in the cluster, and S(C_xi, C_yi) is the cluster similarity.
In the embodiment of the present disclosure, the similarity between users is derived from the cluster similarity, so that the similarity between two users is measured at a finer granularity and the prediction accuracy is improved.
Step 503, according to the similarity between users, determining a target user similar to the user to be predicted in one or more users.
In one embodiment, the user with the highest similarity may be taken as the target user U_i. In some embodiments, all users whose similarity reaches a set threshold may be taken as target users.
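Both selection rules in step 503 can be sketched in one function. The name `select_target_users`, the dictionary input, and the sample scores are assumptions for illustration.

```python
def select_target_users(similarities, threshold=None):
    """Pick target users for the user to be predicted.

    similarities: dict mapping a candidate user id to its similarity score.
    threshold:    if None, return only the single most similar user;
                  otherwise return every user whose similarity reaches it.
    """
    if not similarities:
        return []
    if threshold is None:
        return [max(similarities, key=similarities.get)]
    return [user for user, s in similarities.items() if s >= threshold]

# Hypothetical similarity scores against the user to be predicted.
scores = {"u1": 0.9, "u2": 0.7, "u3": 0.4}
top = select_target_users(scores)
above = select_target_users(scores, threshold=0.6)
```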
Step 504, predicting the behavior of the user to be predicted according to the determined target user.
Once the target user U_i similar to the user to be predicted U_y has been obtained, if the behavior of the target user is known, the behavior of the user to be predicted can be predicted.
In one embodiment, if the target user U_i performs behavior L at a certain time, the probability that the user to be predicted U_y performs behavior L can be obtained through equation (6), so as to predict the behavior of the user to be predicted U_y.
[Formula (6): equation image not reproduced]
Where p(U_y, L) represents the probability that the user to be predicted U_y performs behavior L; S(U_x, U_y) is the similarity between users; and N is the number of target users performing behavior L.
In one embodiment, behavior L may represent traveling through a certain site at a certain time; U_i in equation (6) then denotes the users (one or more of the target users) who travel through that site at that time, and equation (6) gives the probability that the user to be predicted U_y travels through that site at that time. When there are multiple sites (i.e., multiple behaviors L), the target users can be divided by the site through which they travel, the travel probability is predicted for each site, and the site with the highest probability is predicted as the site through which U_y will travel.
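The multi-site case can be sketched as below. Equation (6) is available here only as an image reference, so the scoring rule is an assumption: for each site, the similarities of the target users who travel through it are summed and divided by N, the number of those users. The function names and sample data are hypothetical.

```python
from collections import defaultdict

def site_probabilities(observations):
    """Score each site from the target users seen traveling through it.

    observations: list of (site, similarity between that target user and U_y).
    Assumed rule: sum of similarities divided by N, the number of target
    users performing the behavior at that site.
    """
    by_site = defaultdict(list)
    for site, similarity in observations:
        by_site[site].append(similarity)
    return {site: sum(sims) / len(sims) for site, sims in by_site.items()}

def predict_site(observations):
    """Return the site with the highest score."""
    probabilities = site_probabilities(observations)
    return max(probabilities, key=probabilities.get)

# Hypothetical target users: two travel through site A, one through site B.
observations = [("A", 0.9), ("A", 0.7), ("B", 0.6)]
probabilities = site_probabilities(observations)
predicted = predict_site(observations)
```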
Referring to fig. 6, a flowchart illustrating a user behavior prediction according to another embodiment of the present disclosure is shown. The difference between this embodiment and the embodiment shown in fig. 5 is that, in this embodiment, the similarity between users is calculated according to the long-term behavior feature cluster of the users, and the users whose similarity exceeds the preset similarity threshold are taken as a similar user group, so that the user behavior prediction is performed according to the similar user group.
This embodiment comprises the steps of:
Step 601, obtaining cluster similarity of the long-term behavior feature clusters of the multiple users according to the long-term behavior feature clusters of the multiple users.
Step 602, obtaining the similarity of a plurality of users according to the cluster similarity.
It should be understood that steps 601 and 602 are the same as the above embodiments of steps 501 and 502, respectively, and are not described again here.
Step 603, assigning the users whose user similarity exceeds the preset similarity threshold to a similar user set.
In an embodiment of the present disclosure, the preset similarity threshold may be set between 60% and 100%.
Step 604, predicting the user behavior according to the similar user set.
In an embodiment of the present disclosure, when any user, or more than a certain proportion of the users, in the similar user set is detected to perform a certain behavior, it may be predicted that the other users in the similar user set will perform the same behavior, thereby achieving prediction of user behavior. For example, for user travel behavior, if the similar user set contains 1000 users and 1 or 10 of them are detected to depart from site A, it can be predicted that the remaining users will exhibit similar travel behavior; the relevant stations can therefore prepare their operation scheduling in advance, which provides guidance for rail transit operation. In some embodiments, prompt information may also be sent to users in the similar user set, for example, to indicate whether travel congestion is expected or to prompt the user to plan ahead.
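The trigger rule described above can be sketched as follows. The function name `predict_for_set` and the `min_fraction` parameter are assumptions; with the default of 0.0, a single observed user is enough to trigger the prediction.

```python
def predict_for_set(similar_users, observed, min_fraction=0.0):
    """Return the users predicted to repeat a behavior seen in their similar user set.

    similar_users: set of user ids in the similar user set.
    observed:      set of user ids detected performing the behavior.
    min_fraction:  the prediction fires only if more than this fraction
                   of the set has been observed performing the behavior.
    """
    acting = similar_users & observed
    if not acting:
        return set()
    if len(acting) / len(similar_users) <= min_fraction:
        return set()
    # Everyone in the set who has not yet acted is predicted to follow.
    return similar_users - acting

# Hypothetical similar user set of 1000 users.
users = {f"u{i}" for i in range(1000)}
after_one = predict_for_set(users, {"u0"})
after_ten = predict_for_set(users, {f"u{i}" for i in range(10)}, min_fraction=0.005)
```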
In an embodiment of the disclosure, information recommendation may be performed to a user according to the behavior prediction result. The recommended information may be merchandise information, reminder information (e.g., weather reminders, event reminders), and the like. For example, according to the behavior prediction result, recommendation of a corresponding product (e.g., a movie, an advertisement, etc.) is performed at the predicted user travel point.
According to the user behavior prediction method, the acquisition of similar users is realized by utilizing the long-term behavior feature cluster of a single user, so that the user behavior can be predicted, for example, whether the user will go out in a certain specific time, the place of the user going out and the like can be predicted, and the more accurate and fine behavior prediction of the single user can be realized; short-term behaviors in user behaviors are filtered out, so that the accuracy of prediction can be improved; by predicting the travel behaviors of the user, the guiding significance is provided for the operation of rail transit; in addition, information recommendation is performed according to the predicted user behaviors, targeted information recommendation can be achieved, and user experience and commercial value are improved.
Fig. 7 is a schematic structural diagram of a user behavior prediction apparatus according to an embodiment of the present disclosure. The user behavior prediction apparatus 700 includes:
an acquisition module 701, configured to acquire behavior record data of at least two users;
a clustering module 702, configured to cluster the behavior record data of each user to form multiple clusters;
a filtering module 703, configured to filter the multiple clusters corresponding to each user, respectively, to obtain a long-term behavior feature cluster of each user;
a prediction module 704, configured to determine the similarity between users according to the long-term behavior feature cluster of each user, so as to predict user behavior.
In one embodiment, clustering module 702 includes:
a cluster set forming sub-module 7021, configured to use one or more behavior recording data in the sliding window as a single cluster, respectively, to form a cluster set;
a similarity obtaining sub-module 7022, configured to obtain similarities between the remaining behavior record data in the sliding window and each cluster in the cluster set, respectively;
a maximum similarity obtaining sub-module 7023, configured to, for each cluster in the cluster set, respectively obtain behavior record data having a maximum similarity to the each cluster;
a cluster updating submodule 7024, configured to, for the maximum similarity corresponding to each cluster, if the maximum similarity is greater than a preset threshold, assign behavior record data corresponding to the maximum similarity to the cluster, and recalculate a centroid of the cluster; and if the maximum similarity is smaller than a preset threshold value, adding the behavior record data corresponding to the maximum similarity into the cluster set as a new cluster.
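The work split across sub-modules 7021 to 7024 can be sketched as a single-pass incremental clustering loop. This is a simplified variant: it walks the records one at a time and compares each with every existing centroid, rather than selecting the most similar record per cluster as described above. The similarity measure 1 / (1 + Euclidean distance) is a hypothetical stand-in, and the function names are assumptions. A similarity exactly equal to the threshold, which the text leaves unspecified, starts a new cluster here.

```python
import math

def similarity(a, b):
    """Hypothetical similarity measure: 1 / (1 + Euclidean distance)."""
    return 1.0 / (1.0 + math.dist(a, b))

def centroid(records):
    """Component-wise mean of the records in a cluster."""
    n = len(records)
    return tuple(sum(column) / n for column in zip(*records))

def cluster_window(records, threshold):
    """Cluster the records of one sliding window incrementally."""
    if not records:
        return []
    clusters = [[records[0]]]      # the first record seeds the cluster set
    centroids = [records[0]]
    for record in records[1:]:
        sims = [similarity(record, c) for c in centroids]
        best = max(range(len(sims)), key=sims.__getitem__)
        if sims[best] > threshold:
            # Similar enough: assign the record and recalculate the centroid.
            clusters[best].append(record)
            centroids[best] = centroid(clusters[best])
        else:
            # Otherwise the record becomes a new cluster.
            clusters.append([record])
            centroids.append(record)
    return clusters

# Hypothetical one-dimensional records: departure times in minutes since 0:00.
window = [(480.0,), (485.0,), (1080.0,), (478.0,)]
result = cluster_window(window, threshold=0.1)
```

With these sample values, the three morning departures end up in one cluster and the evening departure forms its own.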
In one embodiment, the filtering module 703 includes:
a statistics submodule 7031, configured to count the number of behavior record data in each cluster;
a deletion submodule 7032, configured to delete a cluster in which the amount of behavior recording data in the cluster is smaller than a preset threshold, to obtain a cluster to be processed;
a dispersion filtering sub-module 7033, configured to filter out, from the clusters to be processed, the clusters whose dispersion is smaller than the preset behavior dispersion, so as to obtain the long-term behavior feature cluster of each user.
It should be understood that the filtering module 703 in the embodiment of the present disclosure may alternatively include only the statistics sub-module 7031 and the deletion sub-module 7032 to implement filtering and obtain the long-term behavior feature cluster of the user. In some embodiments, the filtering module 703 may include only the dispersion filtering sub-module 7033 to implement filtering and obtain the long-term behavior feature cluster of the user.
In one embodiment, the prediction module 704 includes:
a first cluster similarity obtaining sub-module 7041, configured to obtain cluster similarities of long-term behavior feature clusters of the user to be predicted and one or more users;
a first user similarity obtaining sub-module 7042, configured to obtain, according to the cluster similarity, similarities between a user to be predicted and one or more users;
a target user obtaining sub-module 7043, configured to determine, among the one or more users, a target user similar to the user to be predicted according to the similarity between the users;
a first behavior prediction sub-module 7044, configured to predict the behavior of the user to be predicted according to the determined target user.
Referring to fig. 8, in one embodiment, prediction module 704 includes:
a second cluster similarity obtaining sub-module 7045, configured to obtain cluster similarities of long-term behavior feature clusters of the multiple users according to the long-term behavior feature clusters of the multiple users;
a second user similarity obtaining sub-module 7046, configured to obtain similarities of multiple users according to the cluster similarity;
a similar user set obtaining sub-module 7047, configured to assign users whose user similarity exceeds a preset similarity threshold to a similar user set;
a second behavior prediction sub-module 7048, configured to perform user behavior prediction according to the similar user set.
In one embodiment, the apparatus 700 further comprises:
an information recommendation module, configured to recommend information to the user according to the result of the user behavior prediction.
With regard to the apparatus in the above-described embodiment, the specific manner in which each module performs the operation has been described in detail in the embodiment related to the method, and will not be elaborated here.
Fig. 9 is a block diagram illustrating an apparatus 900 for a user behavior prediction method according to an example embodiment. For example, the apparatus 900 may be provided as a server. Referring to fig. 9, the apparatus 900 includes a processing component 901 that further includes one or more processors and memory resources, represented by memory 902, for storing instructions, e.g., applications, that are executable by the processing component 901. The application programs stored in memory 902 may include one or more modules that each correspond to a set of instructions. Further, the processing component 901 is configured to execute instructions to perform the user behavior prediction method described above.
The apparatus 900 may also include a power component 903 configured to perform power management of the apparatus 900, a wired or wireless network interface 904 configured to connect the apparatus 900 to a network, and an input/output (I/O) interface 905. The apparatus 900 may operate based on an operating system stored in the memory 902, such as Windows Server, Mac OS X™, Unix™, Linux™, FreeBSD™, or the like.
In one embodiment, the data collection platform shown in fig. 2 can be disposed in the apparatus 900, and the user behavior record data can be obtained from various systems through the network interface 904 and/or the input/output interface 905, and processed through the processing component 901. These data may be stored in memory 902.
It should be understood that, in the above embodiments of the present disclosure, the user behavior prediction is performed by taking the user travel behavior data as an example, and the user behavior prediction can be performed on the consumption behavior data, the web browsing behavior data, and the usage behavior data of various applications in a manner similar to the user travel behavior data.
According to the user behavior prediction method and device, the acquisition of similar users is realized by utilizing the long-term behavior feature cluster of a single user, so that the user behavior can be predicted, for example, whether the user will go out in a certain specific time, the place of the user going out and the like can be predicted, and the more accurate and fine behavior prediction of the single user can be realized; short-term behaviors in user behaviors are filtered out, so that the accuracy of prediction can be improved; by predicting the travel behaviors of the user, the guiding significance is provided for the operation of rail transit; in addition, information recommendation is performed according to the predicted user behaviors, targeted information recommendation can be achieved, and user experience and commercial value are improved.
The preferred embodiments of the present disclosure are described in detail with reference to the accompanying drawings, however, the present disclosure is not limited to the specific details of the above embodiments, and various simple modifications may be made to the technical solution of the present disclosure within the technical idea of the present disclosure, and these simple modifications all belong to the protection scope of the present disclosure.
It should be noted that the various features described in the above embodiments may be combined in any suitable manner without departing from the scope of the invention. In order to avoid unnecessary repetition, various possible combinations will not be separately described in this disclosure.
In addition, any combination of various embodiments of the present disclosure may be made, and the same should be considered as the disclosure of the present disclosure, as long as it does not depart from the spirit of the present disclosure.

Claims (15)

1. A method for predicting user behavior, comprising:
collecting behavior record data of at least two users;
clustering the behavior record data of each user respectively to form a plurality of clusters;
respectively filtering the plurality of clusters corresponding to each user to obtain a long-term behavior characteristic cluster of each user;
determining the similarity between users according to the long-term behavior feature cluster of each user so as to predict the user behavior; wherein,
the determining the similarity between the users according to the long-term behavior feature cluster of each user to predict the user behavior includes:
obtaining, according to a preset similarity calculation formula and the cluster similarity between the long-term behavior feature clusters of a user to be predicted and those of one or more users, the similarity between the user to be predicted and the one or more users, so as to predict the user behavior of the user to be predicted; wherein,
the similarity calculation formula is as follows:
Figure 395317DEST_PATH_IMAGE001
wherein S(U_x, U_y) is the similarity between any user U_x of the one or more users and the user to be predicted U_y; n is the dimension of the behavior record data in the long-term behavior feature cluster; C_xi is the centroid of a target cluster among the long-term behavior feature clusters of the user U_x; C_yi is the centroid of a target cluster among the long-term behavior feature clusters of the user to be predicted U_y; S(C_xi, C_yi) is the cluster similarity between C_xi and C_yi; and E(S(C_xi, C_yi)) is the mathematical expectation of the cluster similarity between C_xi and C_yi; wherein,
the calculation formula of the cluster similarity is as follows:
Figure 415226DEST_PATH_IMAGE002
wherein C_xi and C_yi are computed as vectors in the calculation formula of the cluster similarity, ‖C_xi‖ is the length of the vector corresponding to C_xi, and ‖C_yi‖ is the length of the vector corresponding to C_yi.
2. The method of claim 1, wherein the step of clustering the behavior records of each user separately to form a plurality of clusters comprises:
setting a sliding window with a preset length;
and clustering the behavior record data positioned in the sliding window to form a plurality of clusters.
3. The method of claim 2, wherein clustering the behavior record data located within the sliding window to form a plurality of clusters comprises:
respectively taking one or more pieces of behavior recording data in the sliding window as single clusters to form a cluster set;
respectively obtaining the similarity between the other behavior record data in the sliding window and each cluster in the cluster set;
for each cluster in the cluster set, behavior record data with the maximum similarity to each cluster is respectively acquired;
for the maximum similarity corresponding to each cluster, if the maximum similarity is greater than a preset threshold, attributing behavior record data corresponding to the maximum similarity to the cluster, and recalculating the centroid of the cluster; and if the maximum similarity is smaller than a preset threshold value, adding the behavior record data corresponding to the maximum similarity into the cluster set as a new cluster.
4. The method according to claim 1, wherein the step of filtering the plurality of clusters corresponding to each user respectively to obtain the long-term behavior feature cluster of each user comprises:
counting the quantity of behavior record data in each cluster;
deleting the clusters in which the quantity of behavior record data is smaller than a preset threshold to obtain the long-term behavior feature cluster of each user; or
filtering out the clusters whose dispersion is smaller than the preset behavior dispersion to obtain the long-term behavior feature cluster of each user.
5. The method according to claim 1, wherein the step of filtering the plurality of clusters corresponding to each user respectively to obtain the long-term behavior feature cluster of each user comprises:
counting the quantity of behavior record data in each cluster;
deleting the clusters with the quantity of the behavior recording data in the clusters smaller than a preset threshold value to obtain clusters to be processed;
and filtering clusters with the dispersion smaller than the preset behavior dispersion in the clusters to be processed according to the preset behavior dispersion so as to obtain the long-term behavior feature cluster of each user.
6. The method of claim 1, wherein the step of determining similarity between users according to the long-term behavior feature cluster of each user to predict user behavior comprises:
obtaining cluster similarity of long-term behavior feature clusters of a user to be predicted and one or more users;
according to the cluster similarity, obtaining the similarity between the user to be predicted and the one or more users;
according to the similarity among the users, determining a target user similar to the user to be predicted in the one or more users;
and predicting the behavior of the user to be predicted according to the determined target user.
7. The method of claim 1, wherein the step of determining similarity between users according to the long-term behavior feature cluster of each user to predict user behavior comprises:
acquiring cluster similarity of the long-term behavior feature clusters of the multiple users according to the long-term behavior feature clusters of the multiple users;
acquiring the similarity of a plurality of users according to the cluster similarity;
classifying users with the user similarity exceeding a preset similarity threshold into a similar user set;
and predicting the user behavior according to the similar user set.
8. The method of claim 1, further comprising:
and recommending information to the user according to the result of the user behavior prediction.
9. A user behavior prediction apparatus, comprising:
the acquisition module is used for acquiring behavior record data of at least two users;
the clustering module is used for clustering the behavior record data of each user respectively to form a plurality of clusters;
the filtering module is used for respectively filtering the plurality of clusters corresponding to each user to obtain the long-term behavior characteristic cluster of each user;
the prediction module is used for determining the similarity between the users according to the long-term behavior feature cluster of each user so as to predict the user behavior; wherein,
the prediction module is configured to:
obtain, according to a preset similarity calculation formula and the cluster similarity between the long-term behavior feature clusters of a user to be predicted and those of one or more users, the similarity between the user to be predicted and the one or more users, so as to predict the user behavior of the user to be predicted; wherein,
the similarity calculation formula is as follows:
Figure 355500DEST_PATH_IMAGE001
wherein S(U_x, U_y) is the similarity between any user U_x of the one or more users and the user to be predicted U_y; n is the dimension of the behavior record data in the long-term behavior feature cluster; C_xi is the centroid of a target cluster among the long-term behavior feature clusters of the user U_x; C_yi is the centroid of a target cluster among the long-term behavior feature clusters of the user to be predicted U_y; S(C_xi, C_yi) is the cluster similarity between C_xi and C_yi; and E(S(C_xi, C_yi)) is the mathematical expectation of the cluster similarity between C_xi and C_yi; wherein,
the calculation formula of the cluster similarity is as follows:
Figure 144464DEST_PATH_IMAGE002
wherein C_xi and C_yi are computed as vectors in the calculation formula of the cluster similarity, ‖C_xi‖ is the length of the vector corresponding to C_xi, and ‖C_yi‖ is the length of the vector corresponding to C_yi.
10. The apparatus of claim 9, wherein the clustering module comprises:
a cluster set forming submodule, configured to take one or more pieces of behavior recording data in the sliding window as a single cluster, respectively, and form a cluster set;
a similarity obtaining submodule, configured to obtain similarity between the remaining behavior record data in the sliding window and each cluster in the cluster set;
the maximum similarity obtaining sub-module is used for respectively obtaining behavior record data with the maximum similarity with each cluster in the cluster set;
the cluster updating submodule is used for attributing the behavior record data corresponding to the maximum similarity to each cluster and recalculating the centroid of each cluster if the maximum similarity is greater than a preset threshold; and if the maximum similarity is smaller than a preset threshold value, adding the behavior record data corresponding to the maximum similarity into the cluster set as a new cluster.
11. The apparatus of claim 9, wherein the filtering module comprises:
the statistic submodule is used for counting the quantity of the behavior record data in each cluster;
the deleting submodule is used for deleting the clusters of which the quantity of the behavior record data is less than a preset threshold value in the clusters to obtain the clusters to be processed;
and the dispersion filtering submodule is used for filtering the clusters with the dispersion smaller than the preset behavior dispersion in the cluster to be processed according to the preset behavior dispersion so as to obtain the long-term behavior characteristic cluster of each user.
12. The apparatus of claim 9, wherein the prediction module comprises:
the first cluster similarity obtaining submodule is used for obtaining cluster similarity of long-term behavior feature clusters of a user to be predicted and one or more users;
the first user similarity obtaining submodule is used for obtaining the similarity between the user to be predicted and the one or more users according to the cluster similarity;
the target user obtaining sub-module is used for determining a target user similar to the user to be predicted in the one or more users according to the similarity among the users;
and the first behavior prediction sub-module is used for predicting the behavior of the user to be predicted according to the determined target user.
13. The apparatus of claim 9, wherein the prediction module comprises:
the second cluster similarity obtaining sub-module is used for obtaining cluster similarity of the long-term behavior feature clusters of the multiple users according to the long-term behavior feature clusters of the multiple users;
the second user similarity obtaining submodule is used for obtaining the similarity of a plurality of users according to the cluster similarity;
the similar user set acquisition submodule is used for attributing the users with the user similarity exceeding a preset similarity threshold to the similar user set;
and the second behavior prediction submodule is used for predicting the user behavior according to the similar user set.
14. The apparatus of claim 9, further comprising:
and the information recommendation module is used for recommending information to the user according to the result of the user behavior prediction.
15. A user behavior prediction apparatus, comprising:
a processor;
a memory for storing processor-executable instructions;
wherein the processor is configured to: collecting behavior record data of at least two users; clustering the behavior record data of each user respectively to form a plurality of clusters; respectively filtering the plurality of clusters corresponding to each user to obtain a long-term behavior feature cluster of each user; determining the similarity between users according to the long-term behavior feature cluster of each user so as to predict the user behavior; wherein,
the determining the similarity between the users according to the long-term behavior feature cluster of each user to predict the user behavior includes:
obtaining, according to a preset similarity calculation formula and the cluster similarity between the long-term behavior feature clusters of a user to be predicted and those of one or more users, the similarity between the user to be predicted and the one or more users, so as to predict the user behavior of the user to be predicted; wherein,
the similarity calculation formula is as follows:
Figure 279036DEST_PATH_IMAGE001
wherein S(U_x, U_y) is the similarity between any user U_x of the one or more users and the user to be predicted U_y; n is the dimension of the behavior record data in the long-term behavior feature cluster; C_xi is the centroid of a target cluster among the long-term behavior feature clusters of the user U_x; C_yi is the centroid of a target cluster among the long-term behavior feature clusters of the user to be predicted U_y; S(C_xi, C_yi) is the cluster similarity between C_xi and C_yi; and E(S(C_xi, C_yi)) is the mathematical expectation of the cluster similarity between C_xi and C_yi; wherein,
the calculation formula of the cluster similarity is as follows:
Figure 989503DEST_PATH_IMAGE002
wherein C_xi and C_yi are computed as vectors in the calculation formula of the cluster similarity, ‖C_xi‖ is the length of the vector corresponding to C_xi, and ‖C_yi‖ is the length of the vector corresponding to C_yi.
CN201610951917.9A 2016-11-02 2016-11-02 User behavior prediction method and device Active CN106529711B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610951917.9A CN106529711B (en) 2016-11-02 2016-11-02 User behavior prediction method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610951917.9A CN106529711B (en) 2016-11-02 2016-11-02 User behavior prediction method and device

Publications (2)

Publication Number Publication Date
CN106529711A CN106529711A (en) 2017-03-22
CN106529711B CN106529711B (en) 2020-06-19

Family

ID=58325244

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610951917.9A Active CN106529711B (en) 2016-11-02 2016-11-02 User behavior prediction method and device

Country Status (1)

Country Link
CN (1) CN106529711B (en)

Families Citing this family (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111695114B (en) * 2017-08-07 2023-09-01 奇安信科技集团股份有限公司 User behavior detection method and device
CN107888450B (en) * 2017-11-16 2021-06-22 国云科技股份有限公司 Desktop cloud virtual network behavior classification method
CN108228779B (en) * 2017-12-28 2021-03-23 华中师范大学 Score prediction method based on learning community conversation flow
CN110390415A (en) 2018-04-18 2019-10-29 北京嘀嘀无限科技发展有限公司 A kind of method and system carrying out trip mode recommendation based on user's trip big data
CN108932525B (en) * 2018-06-07 2022-04-29 创新先进技术有限公司 Behavior prediction method and device
CN109325847A (en) * 2018-09-11 2019-02-12 上海梓颂信息科技有限公司 The data computing system and method for network credit scoring
CN109784970B (en) * 2018-12-13 2020-09-25 交控科技股份有限公司 Service recommendation method and device based on AFC passenger riding data
CN111178421B (en) * 2019-12-25 2023-10-20 贝壳技术有限公司 Method, device, medium and electronic equipment for detecting user state
CN111882421B (en) * 2020-06-17 2022-06-07 马上消费金融股份有限公司 Information processing method, wind control method, device, equipment and storage medium
CN112767032A (en) * 2021-01-22 2021-05-07 北京嘀嘀无限科技发展有限公司 Information processing method and device, electronic equipment and storage medium
CN114780606B (en) * 2022-03-30 2022-10-14 上海必盈特软件系统有限公司 Big data mining method and system

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103699601A (en) * 2013-12-12 2014-04-02 深圳先进技术研究院 Temporal-spatial data mining-based metro passenger classification method
CN105243128A (en) * 2015-09-29 2016-01-13 西华大学 Sign-in data based user behavior trajectory clustering method
CN105608598A (en) * 2015-12-16 2016-05-25 上海交通大学 Modeling method of user' travel behaviors by plane
CN105718946A (en) * 2016-01-20 2016-06-29 北京工业大学 Passenger going-out behavior analysis method based on subway card-swiping data

Also Published As

Publication number Publication date
CN106529711A (en) 2017-03-22

Similar Documents

Publication Publication Date Title
CN106529711B (en) User behavior prediction method and device
Gurumurthy et al. Analyzing the dynamic ride-sharing potential for shared autonomous vehicle fleets using cellphone data from Orlando, Florida
CN107506864B (en) Passenger bus route planning method and device
WO2016124118A1 (en) Order processing method and system
Dai et al. Bus travel time modelling using GPS probe and smart card data: A probabilistic approach considering link travel time and station dwell time
CN106651213B (en) Service order processing method and device
CN106372674B (en) Driver classification method and device in online taxi service platform
CN110874668B (en) Rail transit OD passenger flow prediction method, system and electronic equipment
CN111932925A (en) Method, device and system for determining travel passenger flow of public transport station
CN112579718B (en) Urban land function identification method and device and terminal equipment
CN106875670A (en) Taxi concocting method based on gps data under Spark platforms
Tavassoli et al. Modelling passenger waiting time using large-scale automatic fare collection data: An Australian case study
Chen et al. Extracting bus transit boarding stop information using smart card transaction data
Luo et al. Using data mining to explore the spatial and temporal dynamics of perceptions of metro services in China: the case of Shenzhen
CN106295868A (en) Traffic trip data processing method and device
JP2012073976A (en) Information service device, information service method, and information service system
US9594926B2 (en) Data processing apparatus, data processing system, and data processing method
CN111327661A (en) Pushing method, pushing device, server and computer readable storage medium
CN110826943A (en) Method and related equipment for judging whether bus allocation is needed or not and determining bus allocation number
CN111310961A (en) Data prediction method, data prediction device, electronic equipment and computer readable storage medium
CN108230670B (en) Method and apparatus for predicting number of mobile bodies appearing at given point in given time period
CN111833595B (en) Shared automobile auxiliary vehicle configuration method, electronic device and storage medium
EP3379851B1 (en) Mobility data processing apparatus, mobility data processing method and mobility data processing system
Reyes et al. An Application of Queueing Theory on the Ticketing Booth of Light Rail Transit 1 (LRT-1) Central Station
JP5768704B2 (en) Taxi user information output method, taxi user information output program, and taxi user information output device

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant