WO2016033901A1 - Method and apparatus for determining resident point information of a mobile user - Google Patents
- Publication number
- WO2016033901A1, PCT/CN2014/093759
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- clustering
- point information
- mobile user
- hypothesis quantity
- resident
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/953—Querying, e.g. by the use of web search engines
- G06F16/9537—Spatial or temporal dependent retrieval, e.g. spatiotemporal queries
Definitions
- The present invention relates to the field of computer technologies, and in particular to a method and apparatus for determining resident point information of a mobile user in a computer device.
- The current location of the mobile user is obtained, for example, by the mobile user actively reporting it or by triggering the mobile user to report it. Operations such as positioning are then performed based on the current location.
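- A method for determining resident point information of a mobile user in a computer device comprises the steps of:
- acquiring a plurality of spatiotemporal point information of the mobile user, where the spatiotemporal point information indicates a spatial location of the mobile user and the time point information corresponding to when the mobile user is located at that spatial location;
- performing cluster analysis on the plurality of spatiotemporal point information based on a clustering algorithm to determine a plurality of resident point information of the mobile user.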
- An apparatus for determining resident point information of a mobile user in a computer device comprises the following means:
- a device for acquiring a plurality of spatiotemporal point information of the mobile user, where the spatiotemporal point information indicates a spatial location of the mobile user and the time point information corresponding to when the mobile user is located at that spatial location;
- a device for performing cluster analysis on the plurality of spatiotemporal point information based on a clustering algorithm to determine a plurality of resident point information of the mobile user.
- The present invention has the following advantages: 1) cluster analysis of the spatiotemporal point information of the mobile user can determine a plurality of resident points of the mobile user, so that the mobile user's activity range and living patterns can be understood more accurately; 2) the type of each resident point of the mobile user can be determined from the plurality of resident point information, so that the probability of the user appearing in the area of a given resident point can be predicted to some extent.
- FIG. 1 is a schematic flowchart of a method for determining resident point information of a mobile user in a computer device according to an embodiment of the present invention;
- FIG. 2 is a schematic flowchart of a method for determining resident point information of a mobile user in a computer device according to another embodiment of the present invention;
- FIG. 3 is a schematic structural diagram of an apparatus for determining resident point information of a mobile user in a computer device according to an embodiment of the present invention;
- FIG. 4 is a schematic structural diagram of an apparatus for determining resident point information of a mobile user in a computer device according to another embodiment of the present invention.
- FIG. 1 is a schematic flow chart of a method for determining resident location information of a mobile user in a computer device according to an embodiment of the present invention.
- The method of this embodiment is mainly implemented by a computer device; the computer device includes a network device and a user device.
- The network device includes, but is not limited to, a single network server, a server group composed of multiple network servers, or a cloud composed of a large number of computers or network servers based on cloud computing, where cloud computing is a kind of distributed computing: a super virtual computer composed of a group of loosely coupled computers.
- The network where the network device is located includes, but is not limited to, the Internet, a wide area network, a metropolitan area network, a local area network, a VPN, and the like.
- The user device includes, but is not limited to, a PC, a tablet, a smartphone, a PDA, an IPTV, and the like.
- The method according to this embodiment includes step S1 and step S2.
- In step S1, the computer device acquires a plurality of spatiotemporal point information of the mobile user.
- The spatiotemporal point information indicates a spatial location of the mobile user and the time point information corresponding to when the mobile user is located at that spatial location.
- The spatiotemporal point information may be represented in multiple ways, including but not limited to a vector or a point in a multi-dimensional space; more preferably, the spatiotemporal point information is a four-dimensional vector that includes the spatial location and the time point information corresponding to that location.
- The computer device can acquire the plurality of spatiotemporal point information of the mobile user in multiple manners. For example, the computer device receives the plurality of spatiotemporal point information of the mobile user from other computer devices; as another example, the mobile user periodically reports spatiotemporal point information to the computer device, and the computer device receives the plurality of spatiotemporal point information reported by the mobile user over a period of time.
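As an illustration of the four-dimensional representation described for step S1, the following Python sketch models one spatiotemporal point and collects a batch of periodic reports. The field names `x`, `y`, `z`, `t`, their units, and the helper `collect_reports` are assumptions made for this example; the source does not specify them.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class SpatiotemporalPoint:
    """One report of where the user was and when (field names are assumptions)."""
    x: float  # e.g. longitude
    y: float  # e.g. latitude
    z: float  # e.g. altitude
    t: float  # time of the report, e.g. seconds since some reference time

    def as_vector(self) -> Tuple[float, float, float, float]:
        """Return the point as a four-dimensional vector."""
        return (self.x, self.y, self.z, self.t)


def collect_reports(reports: List[Tuple[float, float, float, float]]) -> List[SpatiotemporalPoint]:
    """Turn raw periodic reports into spatiotemporal points (step S1)."""
    return [SpatiotemporalPoint(*r) for r in reports]


# Two made-up reports taken ten minutes apart.
points = collect_reports([(116.30, 39.98, 50.0, 0.0), (116.31, 39.98, 50.0, 600.0)])
```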
- In step S2, the computer device performs cluster analysis on the plurality of spatiotemporal point information based on a clustering algorithm to determine a plurality of resident point information of the mobile user.
- The clustering algorithm includes any algorithm that can be used for cluster analysis, for example a density-based clustering algorithm, an EM algorithm, and the like.
- Preferably, the clustering algorithm requires the number of cluster centers to be set; more preferably, the clustering algorithm is a density-based clustering algorithm.
- The resident point information includes any information indicating or related to a resident point of the mobile user. Preferably, a class in the clustering result obtained by the cluster analysis is used as the resident point information. More preferably, the resident point information corresponding to a class may be determined by performing statistical analysis on that class, where the resident point information includes location attribute information and time attribute information: the location attribute information indicates a spatial location or location range of the resident point, and the time attribute information indicates multiple time points, or a time range, at which the mobile user is located at the resident point.
- Specifically, the computer device performs cluster analysis on the plurality of spatiotemporal point information based on a clustering algorithm to obtain a clustering result including multiple classes, and determines a plurality of resident point information of the mobile user according to the multiple classes.
- For example, the computer device sets the number of cluster centers of the clustering algorithm to a predetermined number, such as 4, and selects 4 spatiotemporal point information from the plurality of spatiotemporal point information as cluster centers. For each spatiotemporal point information in the plurality, the computer device calculates the distance between it and each of the four cluster centers and assigns it to the class of the cluster center with the minimum distance. The computer device then determines four resident point information of the mobile user according to the four classes in the clustering result.
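The assignment of each point to its nearest cluster center can be sketched as follows. This is a minimal illustration, not the patent's implementation: it assumes Euclidean distance over the raw vector components (the source does not name a distance measure or say how space and time are scaled), and the function name `assign_to_nearest_center` is hypothetical. The toy data uses two centers rather than four to keep the example short.

```python
import math
from typing import Dict, List, Sequence

Point = Sequence[float]


def assign_to_nearest_center(points: List[Point], centers: List[Point]) -> Dict[int, List[Point]]:
    """Group every point under the index of the closest cluster center."""
    classes: Dict[int, List[Point]] = {i: [] for i in range(len(centers))}
    for p in points:
        # Euclidean distance is an assumption; any distance measure could be used.
        distances = [math.dist(p, c) for c in centers]
        nearest = distances.index(min(distances))
        classes[nearest].append(p)
    return classes


# Made-up (x, y, z, t) points; two of them are chosen as cluster centers.
pts = [(0, 0, 0, 0), (1, 0, 0, 0), (10, 10, 0, 5), (11, 10, 0, 5)]
classes = assign_to_nearest_center(pts, centers=[pts[0], pts[2]])
```

Each resulting class could then be treated as one resident point of the user.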
- Preferably, the clustering algorithm requires the number of cluster centers to be set.
- The manners in which the computer device performs cluster analysis on the plurality of spatiotemporal point information based on such a clustering algorithm, and determines the plurality of resident point information of the mobile user, include but are not limited to:
- The number of cluster centers of the clustering algorithm has been determined in advance, and the computer device directly runs the clustering algorithm with the determined number of cluster centers to perform cluster analysis on the plurality of spatiotemporal point information and determine the plurality of resident point information of the mobile user.
- The number of cluster centers of the clustering algorithm has not been determined. In this case, the computer device first needs to determine a suitable number of cluster centers.
- Preferably, the computer device may determine a suitable hypothesis quantity from a plurality of hypothesis quantities as the number of cluster centers.
- Step S2 further includes step S21 and step S22.
- In step S21, for each hypothesis quantity in all or a part of the plurality of hypothesis quantities, the computer device sets the number of cluster centers of the clustering algorithm to that hypothesis quantity and performs cluster analysis on the plurality of spatiotemporal point information based on the clustering algorithm to obtain the clustering result corresponding to that hypothesis quantity; it then selects one hypothesis quantity according to the clustering results respectively corresponding to the hypothesis quantities.
- Preferably, the computer device selects a hypothesis quantity according to the plurality of clustering results respectively corresponding to the plurality of hypothesis quantities based on at least one of the following:
- the more spatiotemporal point information the classes contain, the better the clustering result;
- the lower the dispersion of the classes, the better the clustering result.
- The dispersion indicates how tightly concentrated a class is.
- The computer device can determine the dispersion in various manners. For example, the computer device determines the mean of a class from all the spatiotemporal point information in the class, and calculates the range, mean deviation, or standard deviation of the spatiotemporal point information relative to that mean to indicate the dispersion of the class.
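One possible reading of the dispersion measures above is sketched below: the class mean is computed per component, each point's Euclidean distance to that mean is taken, and the range, mean deviation, and standard deviation are derived from those distances. The use of Euclidean distance and the function names are assumptions for illustration; the source leaves the exact formulas open.

```python
import math
import statistics
from typing import Dict, List, Sequence

Point = Sequence[float]


def class_mean(points: List[Point]) -> List[float]:
    """Component-wise mean of all points in a class."""
    dims = len(points[0])
    return [sum(p[d] for p in points) / len(points) for d in range(dims)]


def dispersion(points: List[Point]) -> Dict[str, float]:
    """Three candidate dispersion measures based on distances to the class mean."""
    center = class_mean(points)
    dists = [math.dist(p, center) for p in points]
    return {
        "range": max(dists) - min(dists),
        "mean_deviation": statistics.fmean(dists),
        "std_deviation": math.sqrt(sum(d * d for d in dists) / len(dists)),
    }


# Two made-up points, each exactly one unit from their mean (1, 0).
d = dispersion([(0.0, 0.0), (2.0, 0.0)])
```

A lower value on any of these measures would indicate a more tightly concentrated class.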
- Step S21 can be implemented in various manners.
- Implementations of step S21 include, but are not limited to:
- Step S21 further includes step S2111, step S2112, and step S2113.
- In step S2111, for one hypothesis quantity among the plurality of hypothesis quantities whose clustering result has not yet been determined, the computer device sets the number of cluster centers of the clustering algorithm to that hypothesis quantity and performs cluster analysis on the plurality of spatiotemporal point information based on the clustering algorithm to obtain the clustering result corresponding to that hypothesis quantity.
- In step S2112, when the clustering result corresponding to the hypothesis quantity meets a first predetermined condition, the computer device takes that hypothesis quantity as the selected hypothesis quantity.
- The first predetermined condition includes any predetermined condition for selecting a hypothesis quantity.
- Preferably, the first predetermined condition includes, but is not limited to:
- the dispersion of the classes in the clustering result is below a predetermined dispersion threshold.
- For example, the predetermined number threshold is 100, the clustering result corresponding to a hypothesis quantity includes four classes, and the numbers of spatiotemporal point information in the four classes are 120, 110, 108, and 150, respectively. The computer device determines that the number of spatiotemporal point information in each class of the clustering result exceeds the predetermined number threshold, so it determines that the clustering result meets the first predetermined condition and takes that hypothesis quantity as the selected hypothesis quantity.
- In step S2113, when the clustering result corresponding to the hypothesis quantity does not meet the first predetermined condition, the computer device repeats step S2111 to obtain the clustering result for another hypothesis quantity whose clustering result has not yet been determined; and so on, until the clustering result corresponding to some hypothesis quantity meets the first predetermined condition, at which point that hypothesis quantity is taken as the selected hypothesis quantity and the operation stops.
- For example, the plurality of hypothesis quantities includes all natural numbers from 2 to 1000.
- The computer device first selects the hypothesis quantity 2 and, with the number of cluster centers set to 2, performs cluster analysis on the plurality of spatiotemporal point information based on the clustering algorithm.
- The computer device determines that the clustering result corresponding to "2" does not meet the first predetermined condition, and in step S2113 the computer device repeats step S2111, selecting another hypothesis quantity whose clustering result has not been determined.
- In this implementation, the computer device only needs to find one hypothesis quantity that meets the first predetermined condition and can perform subsequent operations based on it, without traversing all hypothesis quantities and obtaining all of their clustering results.
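The loop over steps S2111 to S2113 can be sketched as an early-stopping search. In the sketch below the first predetermined condition is taken to be "every class holds more than a number threshold of points", matching the worked example with threshold 100; the function names, the `cluster` callable, and the toy class sizes for 2 and 3 centers are all assumptions introduced for illustration.

```python
from typing import Callable, Iterable, List, Optional, Sequence

Clustering = List[List[Sequence[float]]]  # one list of points per class


def meets_first_condition(result: Clustering, number_threshold: int) -> bool:
    """True if every class holds more points than the threshold."""
    return all(len(cls) > number_threshold for cls in result)


def select_first_acceptable(candidates: Iterable[int],
                            cluster: Callable[[int], Clustering],
                            number_threshold: int) -> Optional[int]:
    """Try hypothesis quantities in order and stop at the first acceptable one."""
    for k in candidates:                                      # step S2111
        result = cluster(k)
        if meets_first_condition(result, number_threshold):  # step S2112
            return k
        # step S2113: condition not met, move on to the next quantity
    return None


# Toy stand-in for a real clustering run: only the class sizes matter here.
TOY_SIZES = {2: [90, 398], 3: [95, 200, 193], 4: [120, 110, 108, 150]}


def toy_cluster(k: int) -> Clustering:
    return [[(0.0, 0.0, 0.0, 0.0)] * n for n in TOY_SIZES[k]]


selected = select_first_acceptable(range(2, 1001), toy_cluster, number_threshold=100)
print(selected)  # 4
```

Because the search returns as soon as the condition holds, quantities 5 through 1000 are never clustered in this toy run.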
- Step S21 further includes step S2121, step S2122, step S2123, and step S2124.
- In step S2121, the computer device takes one of the plurality of hypothesis quantities as the current hypothesis quantity, sets the number of cluster centers of the clustering algorithm to the current hypothesis quantity, and performs cluster analysis on the plurality of spatiotemporal point information based on the clustering algorithm to obtain the clustering result corresponding to the current hypothesis quantity.
- For example, the plurality of hypothesis quantities includes the natural numbers increasing from 2 to 1000.
- The computer device takes "2" as the current hypothesis quantity, sets the number of cluster centers of the clustering algorithm to "2", and performs cluster analysis on the plurality of spatiotemporal point information based on the clustering algorithm to obtain the clustering result corresponding to "2".
- In step S2122, the computer device sets the number of cluster centers of the clustering algorithm to the next hypothesis quantity after the current hypothesis quantity, and performs cluster analysis on the plurality of spatiotemporal point information based on the clustering algorithm to obtain the clustering result corresponding to the next hypothesis quantity.
- For example, the computer device sets the number of cluster centers of the clustering algorithm to "3", the next hypothesis quantity after "2", and performs cluster analysis on the plurality of spatiotemporal point information based on the clustering algorithm to obtain the clustering result corresponding to "3".
- In step S2123, when the clustering result corresponding to the next hypothesis quantity is worse than the clustering result corresponding to the current hypothesis quantity, the computer device takes the current hypothesis quantity as the selected hypothesis quantity.
- Whether the clustering result corresponding to the next hypothesis quantity is worse than the clustering result corresponding to the current hypothesis quantity may be determined according to the dispersion of the classes in the clustering results and/or the number of spatiotemporal point information contained in the classes.
- For example, the variance E1 between the classes in the clustering result corresponding to the next hypothesis quantity and the variance E2 between the classes in the clustering result corresponding to the current hypothesis quantity can be calculated and compared: when E1 is greater than E2, the computer device may determine that the clustering result corresponding to the next hypothesis quantity is worse than the clustering result corresponding to the current hypothesis quantity; when E1 is less than E2, the computer device may determine that the clustering result corresponding to the next hypothesis quantity is better than the clustering result corresponding to the current hypothesis quantity.
- In step S2124, when the clustering result corresponding to the next hypothesis quantity is better than the clustering result corresponding to the current hypothesis quantity, the computer device takes the next hypothesis quantity as the current hypothesis quantity and repeats step S2122.
- For example, the computer device takes "3" as the current hypothesis quantity and repeats step S2122 to obtain the clustering result corresponding to "4"; then, if the clustering result corresponding to "4" is better than the clustering result corresponding to "3", the computer device takes "4" as the current hypothesis quantity and continues to repeat step S2122; and so on, until the clustering result corresponding to the next hypothesis quantity is worse than the clustering result corresponding to the current hypothesis quantity, at which point, in step S2123, the computer device takes the current hypothesis quantity as the selected hypothesis quantity.
- In this implementation, the computer device obtains the best hypothesis quantity. Moreover, because subsequent operations can be performed as soon as the best hypothesis quantity is obtained, without continuing to obtain clustering results for the other hypothesis quantities, this implementation generally does not need to traverse all hypothesis quantities and obtain all of their clustering results.
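Steps S2121 to S2124 amount to walking through the hypothesis quantities in order and stopping as soon as the next result is worse than the current one. The sketch below assumes a hypothetical `score(k)` callable that clusters with `k` centers and returns a single number where lower is better (playing the role of E1 and E2); the source does not say how ties are handled, so equal scores are treated here as "not worse". The toy scores are invented.

```python
from typing import Callable, List, Sequence


def select_by_increment(candidates: Sequence[int], score: Callable[[int], float]) -> int:
    """Advance through candidates until the next score is worse than the current one."""
    current = candidates[0]                       # step S2121
    current_score = score(current)
    for nxt in candidates[1:]:                    # step S2122
        next_score = score(nxt)
        if next_score > current_score:            # step S2123: next result is worse
            return current
        current, current_score = nxt, next_score  # step S2124: keep going
    return current


# Invented scores; `evaluated` records which quantities were actually clustered.
TOY_SCORES = {2: 9.0, 3: 4.0, 4: 2.5, 5: 3.1, 6: 1.0}
evaluated: List[int] = []


def toy_score(k: int) -> float:
    evaluated.append(k)
    return TOY_SCORES[k]


best = select_by_increment([2, 3, 4, 5, 6], toy_score)
print(best, evaluated)  # 4 [2, 3, 4, 5]
```

Note that the search stops at the first local optimum: quantity 6 has a lower toy score but is never evaluated.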
- Step S21 further includes step S2131 and step S2132.
- In step S2131, for each hypothesis quantity in the plurality of hypothesis quantities, the computer device sets the number of cluster centers of the clustering algorithm to that hypothesis quantity and performs cluster analysis on the plurality of spatiotemporal point information based on the clustering algorithm to obtain the clustering result corresponding to that hypothesis quantity.
- For example, based on the clustering algorithm, the computer device obtains the clustering results when the number of cluster centers is 2, 3, 4, and 5, respectively.
- In step S2132, the computer device selects a hypothesis quantity according to the plurality of clustering results respectively corresponding to the plurality of hypothesis quantities.
- The manner in which the computer device selects a hypothesis quantity according to the plurality of clustering results has been detailed above and is not repeated here.
- The plurality of hypothesis quantities can be expressed as a set, such as [2, 3, 4, ..., 1000], and the computer device can read the hypothesis quantities directly from the set.
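For comparison with the early-stopping variants, steps S2131 and S2132 can be sketched as an exhaustive search: every hypothesis quantity is clustered first, and the best one is chosen afterwards. As before, `score(k)` is a hypothetical callable returning a lower-is-better number for the clustering with `k` centers, and the toy scores are invented.

```python
from typing import Callable, Dict, Iterable


def select_exhaustively(candidates: Iterable[int], score: Callable[[int], float]) -> int:
    """Cluster for every candidate, then pick the one with the lowest score."""
    results: Dict[int, float] = {k: score(k) for k in candidates}  # step S2131
    return min(results, key=lambda k: results[k])                  # step S2132


# Invented lower-is-better scores for 2, 3, 4, and 5 cluster centers.
toy_scores = {2: 9.0, 3: 4.0, 4: 2.5, 5: 3.1}
best_k = select_exhaustively([2, 3, 4, 5], lambda k: toy_scores[k])
print(best_k)  # 4
```

This variant costs one clustering run per hypothesis quantity but cannot stop at a merely local optimum.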
- In step S22, the computer device determines the plurality of resident point information of the mobile user according to the clustering result corresponding to the selected hypothesis quantity.
- The computer device may do so in multiple manners.
- For example, the computer device can directly use the multiple classes of the clustering result as the multiple resident point information of the mobile user.
- As another example, for each class in the clustering result, the computer device can perform statistical analysis on the class, such as separately aggregating the spatial locations and the time point information of all the spatiotemporal point information in the class, to determine the resident point information corresponding to that class.
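The statistical analysis of a class could, for instance, reduce it to a representative location and a time range. The sketch below assumes points of the form (x, y, z, t), uses the mean position as the location attribute and the earliest and latest time as the time attribute; these particular statistics and the name `summarize_class` are illustrative choices, not taken from the source.

```python
from typing import Dict, List, Sequence, Tuple


def summarize_class(points: List[Sequence[float]]) -> Dict[str, Tuple[float, ...]]:
    """Derive location and time attributes from one class of (x, y, z, t) points."""
    n = len(points)
    # Location attribute: mean of the three spatial components.
    location = tuple(sum(p[d] for p in points) / n for d in range(3))
    # Time attribute: earliest and latest time at which the user was observed.
    times = [p[3] for p in points]
    return {"location": location, "time_range": (min(times), max(times))}


# Made-up class with two observations.
resident_point = summarize_class([(0.0, 0.0, 0.0, 10.0), (2.0, 4.0, 0.0, 30.0)])
```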
- Multiple resident points of the mobile user can be determined by performing cluster analysis on the spatiotemporal point information of the mobile user, so that the mobile user's activity range and living patterns can be understood more accurately.
- FIG. 2 is a schematic flow chart of a method for determining resident location information of a mobile user in a computer device according to another embodiment of the present invention.
- The method of this embodiment is mainly implemented by a computer device, and any description of the computer device in the embodiment shown in FIG. 1 is incorporated herein by reference.
- The method according to this embodiment includes step S1, step S2, and step S3.
- Steps S1 and S2 have been described in detail in the embodiment shown in FIG. 1 and are not described again here.
- In step S3, the computer device determines the type of each of the plurality of resident point information according to the plurality of resident point information.
- The type of resident point information indicates the nature of the resident point of the mobile user, such as a home, a restaurant, an entertainment place, a work place, and the like.
- Specifically, the computer device determines the type of the resident point information by analyzing the resident point information.
- For example, if the computer device determines that the location range corresponding to a resident point information lies within a residential area, it determines that the type of that resident point information is home.
- Preferably, step S3 further comprises step S31 and step S32, performed on each of the plurality of resident point information.
- In step S31, the computer device acquires the location attribute information and the time attribute information of the resident point information.
- The computer device can obtain the location attribute information and the time attribute information of the resident point information in multiple manners.
- For example, the computer device performs statistical analysis on all the spatiotemporal point information in the class, obtaining the location attribute information of the resident point information from the spatial locations in all the spatiotemporal point information and the time attribute information of the resident point information from the time point information in all the spatiotemporal point information.
- As another example, the computer device may directly extract the location attribute information and the time attribute information of the resident point information from the resident point information.
- In step S32, the computer device determines the type of the resident point information according to the location attribute information and the time attribute information.
- For example, the time attribute information of a resident point information indicates that the times at which the mobile user is located at the resident point are concentrated between 9:00 and 18:00 from Monday to Friday each week, and the location attribute information of the resident point information indicates that the resident point is located in an office building; the computer device then determines that the type of the resident point information is a work place.
- As another example, the time attribute information of a resident point information indicates that the times at which the mobile user is located at the resident point are concentrated between 21:00 and 24:00 on weekends, and the location attribute information of the resident point information indicates the location of the resident point; the computer device then determines that the type of the resident point information is an entertainment place.
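The examples for step S32 suggest a simple rule-based mapping from location and time attributes to a type. The following sketch is one hypothetical way to encode such rules. The `ResidentPoint` fields, the category labels (`"office_building"`, `"residential_area"`, and especially `"entertainment_venue"`, which the source does not spell out), and the rule order are all assumptions for illustration.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class ResidentPoint:
    place_category: str   # hypothetical label derived from the location attribute
    days: FrozenSet[int]  # days of week with visits, 0 = Monday ... 6 = Sunday
    start_hour: int       # start of the concentrated time range
    end_hour: int         # end of the concentrated time range


def resident_point_type(rp: ResidentPoint) -> str:
    """Map location and time attributes to a resident point type (step S32)."""
    weekdays_only = bool(rp.days) and all(d <= 4 for d in rp.days)
    weekend_only = bool(rp.days) and all(d >= 5 for d in rp.days)
    if (rp.place_category == "office_building" and weekdays_only
            and rp.start_hour >= 9 and rp.end_hour <= 18):
        return "work place"
    if (rp.place_category == "entertainment_venue" and weekend_only
            and rp.start_hour >= 21 and rp.end_hour <= 24):
        return "entertainment place"
    if rp.place_category == "residential_area":
        return "home"
    return "unknown"


office = ResidentPoint("office_building", frozenset({0, 1, 2, 3, 4}), 9, 18)
print(resident_point_type(office))  # work place
```

A real system would likely need a lookup from coordinates to place categories (for example from map data) before rules like these could be applied.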
- The type of each resident point of the mobile user can be determined from the plurality of resident point information of the mobile user, so that the probability of the user appearing in the area of a given resident point can be predicted to some extent.
- FIG. 3 is a schematic structural diagram of an apparatus for determining a resident point information of a mobile user in a computer device according to an embodiment of the present invention.
- The apparatus for determining resident point information of a mobile user according to this embodiment includes means for acquiring a plurality of spatiotemporal point information of the mobile user (hereinafter "first acquisition means 1") and means for performing cluster analysis on the plurality of spatiotemporal point information based on a clustering algorithm to determine a plurality of resident point information of the mobile user (hereinafter "first determining means 2").
- The first acquisition means 1 acquires a plurality of spatiotemporal point information of the mobile user.
- The spatiotemporal point information indicates a spatial location of the mobile user and the time point information corresponding to when the mobile user is located at that spatial location.
- The spatiotemporal point information may be represented in multiple ways, including but not limited to a vector or a point in a multi-dimensional space; more preferably, the spatiotemporal point information is a four-dimensional vector that includes the spatial location and the time point information corresponding to that location.
- The first acquisition means 1 can acquire the plurality of spatiotemporal point information of the mobile user in multiple manners.
- For example, the first acquisition means 1 receives the plurality of spatiotemporal point information of the mobile user from other computer devices; as another example, the mobile user periodically reports spatiotemporal point information to the computer device, and the first acquisition means 1 receives the plurality of spatiotemporal point information reported by the mobile user over a period of time.
- The first determining means 2 performs cluster analysis on the plurality of spatiotemporal point information based on a clustering algorithm to determine a plurality of resident point information of the mobile user.
- The clustering algorithm includes any algorithm that can be used for cluster analysis, for example a density-based clustering algorithm, an EM algorithm, and the like.
- Preferably, the clustering algorithm requires the number of cluster centers to be set; more preferably, the clustering algorithm is a density-based clustering algorithm.
- The resident point information includes any information indicating or related to a resident point of the mobile user. Preferably, a class in the clustering result obtained by the cluster analysis is used as the resident point information. More preferably, the resident point information corresponding to a class may be determined by performing statistical analysis on that class, where the resident point information includes location attribute information and time attribute information: the location attribute information indicates a spatial location or location range of the resident point, and the time attribute information indicates multiple time points, or a time range, at which the mobile user is located at the resident point.
- Specifically, the first determining means 2 performs cluster analysis on the plurality of spatiotemporal point information based on a clustering algorithm to obtain a clustering result including multiple classes, and determines a plurality of resident point information of the mobile user according to the multiple classes.
- For example, the first determining means 2 sets the number of cluster centers of the clustering algorithm to a predetermined number, such as 4, and selects 4 spatiotemporal point information from the plurality of spatiotemporal point information as cluster centers. For each spatiotemporal point information in the plurality, the first determining means 2 calculates the distance between it and each of the four cluster centers and assigns it to the class of the cluster center with the minimum distance. The first determining means 2 then determines four resident point information of the mobile user according to the four classes in the clustering result.
- Preferably, the clustering algorithm requires the number of cluster centers to be set.
- The manners in which the first determining means 2 performs cluster analysis on the plurality of spatiotemporal point information based on such a clustering algorithm, and determines the plurality of resident point information of the mobile user, include but are not limited to:
- The number of cluster centers of the clustering algorithm has been determined in advance, and the first determining means 2 directly runs the clustering algorithm with the determined number of cluster centers to perform cluster analysis on the plurality of spatiotemporal point information and determine the plurality of resident point information of the mobile user.
- The number of cluster centers of the clustering algorithm has not been determined. In this case, the first determining means 2 first needs to determine a suitable number of cluster centers.
- Preferably, the first determining means 2 may determine a suitable hypothesis quantity from a plurality of hypothesis quantities as the number of cluster centers.
- Preferably, the first determining means 2 further includes: means for setting, for each hypothesis quantity in all or a part of the plurality of hypothesis quantities, the number of cluster centers of the clustering algorithm to that hypothesis quantity, performing cluster analysis on the plurality of spatiotemporal point information based on the clustering algorithm to obtain the clustering result corresponding to that hypothesis quantity, and selecting one hypothesis quantity according to the clustering results respectively corresponding to the hypothesis quantities (not shown, hereinafter "selection device"); and means for determining a plurality of resident point information of the mobile user according to the clustering result corresponding to the selected hypothesis quantity (not shown, hereinafter "first sub-determination device").
- For each hypothesis quantity in all or a part of the plurality of hypothesis quantities, the selection device sets the number of cluster centers of the clustering algorithm to that hypothesis quantity and performs cluster analysis on the plurality of spatiotemporal point information based on the clustering algorithm to obtain the clustering result corresponding to that hypothesis quantity; it then selects one hypothesis quantity according to the clustering results respectively corresponding to the hypothesis quantities.
- the selecting means selects a hypothesis quantity according to the plurality of clustering results respectively corresponding to the plurality of hypothesized numbers based on at least one of the following:
- the more pieces of spatiotemporal point information a class contains, the better the clustering result usually is.
- the lower the dispersion of a class, the better the clustering result usually is.
- the dispersion indicates how densely packed the class is.
- the selection device may determine the dispersion in various manners. For example, the computer device determines the mean of the class from all the spatiotemporal point information in the class, and calculates the range, mean deviation, or standard deviation between each piece of spatiotemporal point information and that mean to represent the dispersion of the class.
- selection device can be implemented in various ways.
- implementations of the selection device include, but are not limited to:
- the selecting apparatus further includes: a device (not shown, hereinafter referred to as "first clustering device") for, for one hypothesis quantity among the plurality of hypothesis quantities whose clustering result has not yet been determined, setting the number of cluster centers of the clustering algorithm to that hypothesis quantity and performing cluster analysis on the plurality of spatiotemporal point information based on the clustering algorithm to obtain the corresponding clustering result; a device (not shown, hereinafter referred to as "first setting device") for taking the hypothesis quantity as the selected hypothesis quantity when its clustering result meets a first predetermined condition; and a device (not shown, hereinafter referred to as "first triggering device") for triggering the first clustering device to repeat its operation when the clustering result does not meet the first predetermined condition.
- for one hypothesis quantity among the plurality of hypothesis quantities whose clustering result has not yet been determined, the first clustering device sets the number of cluster centers of the clustering algorithm to that hypothesis quantity, and performs cluster analysis on the plurality of spatiotemporal point information based on the clustering algorithm to obtain the clustering result corresponding to that hypothesis quantity.
- when the clustering result corresponding to the hypothesis quantity meets the first predetermined condition, the first setting means takes that hypothesis quantity as the selected hypothesis quantity.
- the first predetermined condition includes any predetermined condition for selecting a hypothetical quantity.
- the first predetermined condition includes but is not limited to:
- the number of pieces of spatiotemporal point information contained in a class of the clustering result exceeds a predetermined number threshold.
- the dispersion of a class in the clustering result is below a predetermined dispersion threshold.
- for example, the predetermined number threshold is 100, the clustering result corresponding to the hypothesis quantity includes four classes, and the numbers of pieces of spatiotemporal point information in the four classes are 120, 110, 108, and 150, respectively.
- the first setting means then finds that the number of pieces of spatiotemporal point information in every class of the clustering result exceeds the predetermined number threshold, determines that the clustering result meets the first predetermined condition, and takes this hypothesis quantity as the selected hypothesis quantity.
- when the clustering result corresponding to the hypothesis quantity does not meet the first predetermined condition, the first triggering device triggers the first clustering device to repeat its operation.
- specifically, the first triggering device triggers the first clustering device to repeat its operation so as to obtain the clustering result for another hypothesis quantity whose clustering result has not yet been determined; and so on, until the clustering result corresponding to some hypothesis quantity meets the first predetermined condition, at which point the first setting device takes that hypothesis quantity as the selected hypothesis quantity and the operation stops.
- multiple hypothetical quantities include all natural numbers from 2 to 1000.
- when the first clustering device performs its operation for the first time, the selected hypothesis quantity is 2; with the number of cluster centers set to 2, the plurality of spatiotemporal point information is clustered based on the clustering algorithm to obtain the clustering result corresponding to the hypothesis quantity "2".
- because the clustering result corresponding to "2" does not meet the first predetermined condition, the first triggering device triggers the first clustering device to repeat its operation, selecting the hypothesis quantity "4", whose clustering result has not yet been determined, and determining its clustering result; then, because the clustering result corresponding to "4" does not meet the first predetermined condition, the first triggering device again triggers the first clustering device to repeat its operation.
- and so on, until a hypothesis quantity "5" meeting the first predetermined condition is obtained, and the first setting means takes "5" as the selected hypothesis quantity.
- the computer device only needs to obtain a hypothetical number that meets the first predetermined condition, and may perform subsequent operations based on the assumed number without traversing and obtaining all hypothesized number of clustering results.
- in this implementation the plurality of hypothesis quantities are increasing or decreasing, and the selecting means further comprises: a device (not shown, hereinafter referred to as "second clustering device") for taking one of the plurality of hypothesis quantities as the current hypothesis quantity, setting the number of cluster centers of the clustering algorithm to the current hypothesis quantity, and performing cluster analysis on the plurality of spatiotemporal point information based on the clustering algorithm to obtain a clustering result corresponding to the current hypothesis quantity; a device (not shown, hereinafter referred to as "third clustering device") for setting the number of cluster centers of the clustering algorithm to the next hypothesis quantity after the current hypothesis quantity and performing cluster analysis on the plurality of spatiotemporal point information based on the clustering algorithm to obtain a clustering result corresponding to the next hypothesis quantity; a device (not shown, hereinafter referred to as "second setting device") for taking the current hypothesis quantity as the selected hypothesis quantity when the clustering result corresponding to the next hypothesis quantity is worse than that corresponding to the current hypothesis quantity; and a device (not shown, hereinafter referred to as "second triggering device") for, when the clustering result corresponding to the next hypothesis quantity is better than that corresponding to the current hypothesis quantity, taking the next hypothesis quantity as the current hypothesis quantity and triggering the third clustering device to repeat its operation.
- the second clustering device takes one of the plurality of hypothesis quantities as the current hypothesis quantity, sets the number of cluster centers of the clustering algorithm to the current hypothesis quantity, and performs cluster analysis on the plurality of spatiotemporal point information based on the clustering algorithm to obtain a clustering result corresponding to the current hypothesis quantity.
- a plurality of hypothetical quantities include a plurality of natural numbers that are incremented from 2 to 1000.
- the second clustering device takes "2" as the current hypothesis quantity, sets the number of cluster centers of the clustering algorithm to "2", and clusters the plurality of spatiotemporal point information based on the clustering algorithm. Analysis, obtaining clustering results corresponding to "2".
- the third clustering device sets the number of cluster centers of the clustering algorithm to the next hypothesis quantity after the current hypothesis quantity, and performs cluster analysis on the plurality of spatiotemporal point information based on the clustering algorithm to obtain a clustering result corresponding to the next hypothesis quantity.
- for example, the third clustering device sets the number of cluster centers of the clustering algorithm to "3", the next hypothesis quantity after "2", and performs cluster analysis on the plurality of spatiotemporal point information based on the clustering algorithm to obtain the clustering result corresponding to that next hypothesis quantity.
- when the clustering result corresponding to the next hypothesis quantity is worse than the clustering result corresponding to the current hypothesis quantity, the second setting means takes the current hypothesis quantity as the selected hypothesis quantity.
- whether the clustering result corresponding to the next hypothesis quantity is worse than the clustering result corresponding to the current hypothesis quantity may be determined according to the dispersion degree of the class in the clustering result and/or the number of spatiotemporal information points included in the class.
- for example, the variance E1 among the classes in the clustering result corresponding to the next hypothesis quantity and the variance E2 among the classes in the clustering result corresponding to the current hypothesis quantity may be calculated and compared; when E1 is greater than E2, the second setting device may determine that the clustering result corresponding to the next hypothesis quantity is worse than the clustering result corresponding to the current hypothesis quantity; when E1 is less than E2, the second setting device may determine that the clustering result corresponding to the next hypothesis quantity is better than the clustering result corresponding to the current hypothesis quantity.
- when the clustering result corresponding to the next hypothesis quantity is better than the clustering result corresponding to the current hypothesis quantity, the second triggering device triggers the third clustering device to repeat its operation.
- for example, the current hypothesis quantity is "2", its next hypothesis quantity is "3", and the clustering result corresponding to "3" is better than that corresponding to "2"; the second triggering device then takes "3" as the current hypothesis quantity and triggers the third clustering device to repeat its operation to obtain the clustering result for "4"; next, if the clustering result corresponding to "4" is better than that corresponding to "3", the second triggering device takes "4" as the current hypothesis quantity and again triggers the third clustering device to repeat its operation; and so on, until the clustering result corresponding to the next hypothesis quantity is worse than the clustering result corresponding to the current hypothesis quantity.
- the second setting means takes the current assumed number as the selected hypothetical number.
- the selecting means further comprises: a device (not shown, hereinafter referred to as "fourth clustering device") for, for each hypothesis quantity of the plurality of hypothesis quantities, setting the number of cluster centers of the clustering algorithm to that hypothesis quantity and performing cluster analysis on the plurality of spatiotemporal point information based on the clustering algorithm to obtain the corresponding clustering result; and a device (not shown, hereinafter referred to as "sub-selection device") for selecting one hypothesis quantity according to the plurality of clustering results respectively corresponding to the plurality of hypothesis quantities.
- for each hypothesis quantity of the plurality of hypothesis quantities, the fourth clustering device sets the number of cluster centers of the clustering algorithm to that hypothesis quantity, and performs cluster analysis on the plurality of spatiotemporal point information based on the clustering algorithm to obtain the clustering result corresponding to that hypothesis quantity.
- for example, there are four hypothesis quantities: 2, 3, 4, and 5. Based on the clustering algorithm, the fourth clustering device obtains the clustering result when the number of cluster centers is 2, the clustering result when it is 3, the clustering result when it is 4, and the clustering result when it is 5.
- the sub-selection means selects a hypothesis quantity according to the plurality of clustering results respectively corresponding to the plurality of hypothesized numbers.
- the way in which the sub-selection device selects a hypothesis quantity according to the plurality of clustering results respectively corresponding to the plurality of hypothesis quantities is similar to the way in which the selection device described above does so, and is not described again here.
- hypotheses can be expressed in the form of a set, such as a set [2, 3, 4, ..., 1000], and the computer device can directly read the hypothetical quantity from the set.
- any implementation that, for each hypothesis quantity among all or part of a plurality of hypothesis quantities, sets the number of cluster centers of the clustering algorithm to that hypothesis quantity, performs cluster analysis on the plurality of spatiotemporal point information based on the clustering algorithm to obtain the corresponding clustering result, and selects one hypothesis quantity according to the plurality of clustering results respectively corresponding to the plurality of hypothesis quantities is intended to be included within the scope of the present invention.
- the first sub-determination device determines a plurality of resident point information of the mobile user according to the clustering result corresponding to the selected hypothesis quantity.
- the first sub-determining device may determine the plurality of resident point information of the mobile user according to the clustering result corresponding to the selected hypothesis quantity in multiple manners.
- the first sub-determination device may directly use a plurality of classes of clustering results as a plurality of resident point information of the mobile user.
- as another example, for each class in the clustering result, the first sub-determining device may perform statistical analysis on the class, such as separately compiling statistics on the spatial positions and the time point information of all the spatiotemporal point information in the class, to determine the resident point information corresponding to the class.
- multiple resident points of the mobile user can be determined by performing cluster analysis on the time and space information of the mobile user, so that the mobile user's activity range and living rules can be more accurately understood.
- FIG. 4 is a schematic structural diagram of an apparatus for determining a resident point information of a mobile user in a computer device according to another embodiment of the present invention.
- the apparatus for determining the resident point information of the mobile user according to this embodiment includes the first obtaining means 1, the first determining means 2, and a device for determining, according to the plurality of resident point information, the type of each piece of resident point information (hereinafter referred to as "second determining device 3").
- the first obtaining device 1 and the first determining device 2 have been described in detail in the embodiment shown in FIG. 3, and details are not described herein again.
- the second determining means 3 determines the type of each of the plurality of resident point information based on the plurality of resident point information.
- the type of the resident point information indicates the nature of the resident point of the mobile user, such as a home, a restaurant, an entertainment venue, a workplace, and the like.
- the second determining means 3 determines the type of the resident point information by analyzing the resident point information.
- for example, based on the resident point information and a map, the second determining means 3 determines that the location range corresponding to the resident point information lies within a residential area, and therefore determines that the type of the resident point information is home.
- the second determining means 3 further comprises means for acquiring location attribute information and time attribute information of the resident point information (not shown, hereinafter referred to as "second acquiring means") and means for determining the type of the resident point information according to the location attribute information and the time attribute information (not shown, hereinafter referred to as "second sub-determination means").
- the second obtaining means acquires location attribute information and time attribute information of the resident point information.
- the second obtaining device may obtain the location attribute information and the time attribute information of the resident point information in multiple manners.
- for example, when the resident point information is a class in the clustering result, the second obtaining device performs statistical analysis on all the spatiotemporal point information in the class, obtaining the location attribute information of the resident point information from the spatial positions and the time attribute information of the resident point information from the time point information.
- as another example, when the resident point information was itself obtained by statistical analysis of a class in the clustering result, the second acquiring device may directly extract the location attribute information and the time attribute information from the resident point information.
- the second sub-determining device determines the type of the resident point information according to the location attribute information and the time attribute information.
- for example, the time attribute information of the resident point information indicates that the times at which the mobile user is at the resident point are concentrated between 9:00 and 18:00 from Monday to Friday each week, and the location attribute information indicates that the resident point is located in an office building; the second sub-determination device then determines that the type of the resident point information is workplace.
- as another example, the time attribute information of the resident point information indicates that the times at which the mobile user is at the resident point are concentrated between 21:00 and 24:00 on weekends, and the location attribute information indicates that the resident point is near a commercial district; the second sub-determination device then determines that the type of the resident point information is entertainment venue.
- the type of each resident point of the mobile user may be determined according to the multiple resident point information of the mobile user, and the probability that the user appears in a certain resident point area is predicted to some extent.
- the present invention can be implemented in software and/or a combination of software and hardware.
- the various devices of the present invention can be implemented using an application specific integrated circuit (ASIC) or any other similar hardware device.
- the software program of the present invention may be executed by a processor to implement the steps or functions described above.
- the software program (including related data structures) of the present invention can be stored in a computer readable recording medium such as a RAM memory, a magnetic or optical drive or a floppy disk and the like.
- some of the steps or functions of the present invention may be implemented in hardware, for example, as a circuit that cooperates with a processor to perform various steps or functions.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Databases & Information Systems (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
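The present invention provides a method for determining resident point information of a mobile user in a computer device, wherein the method comprises the following steps: a. acquiring a plurality of pieces of spatiotemporal point information of the mobile user, wherein the spatiotemporal point information indicates a spatial location of the mobile user and the time point information corresponding to when the mobile user is at that spatial location; b. performing cluster analysis on the plurality of spatiotemporal point information based on a clustering algorithm to determine a plurality of pieces of resident point information of the mobile user. According to the solution of the present invention, a plurality of pieces of resident point information of a mobile user can be determined from the mobile user's spatiotemporal point information, and the types of that resident point information can be determined.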
Description
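The present invention relates to the field of computer technologies, and in particular to a method and apparatus for determining resident point information of a mobile user in a computer device.
In the prior art, usually only a single current location of the mobile user is acquired, for example by the mobile user actively reporting it or by triggering the mobile user to report it. Operations such as positioning are then performed based on that current location.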
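Summary of the Invention
The object of the present invention is to provide a method and apparatus for determining resident point information of a mobile user in a computer device.
According to one aspect of the present invention, a method for determining resident point information of a mobile user in a computer device is provided, wherein the method comprises the following steps:
a. acquiring a plurality of pieces of spatiotemporal point information of the mobile user, wherein the spatiotemporal point information indicates a spatial location of the mobile user and the time point information corresponding to when the mobile user is at that spatial location;
b. performing cluster analysis on the plurality of spatiotemporal point information based on a clustering algorithm to determine a plurality of pieces of resident point information of the mobile user.
According to another aspect of the present invention, an apparatus for determining resident point information of a mobile user in a computer device is also provided, wherein the apparatus comprises the following means:
means for acquiring a plurality of pieces of spatiotemporal point information of the mobile user, wherein the spatiotemporal point information indicates a spatial location of the mobile user and the time point information corresponding to when the mobile user is at that spatial location;
means for performing cluster analysis on the plurality of spatiotemporal point information based on a clustering algorithm to determine a plurality of pieces of resident point information of the mobile user.
Compared with the prior art, the present invention has the following advantages: 1) multiple resident points of a mobile user can be determined by performing cluster analysis on the mobile user's spatiotemporal point information, so that the mobile user's range of activity and daily routine can be understood more accurately; 2) the type of each resident point of the mobile user can be determined from the mobile user's multiple pieces of resident point information, and the probability that the user appears in the area of a given resident point can be predicted to some extent.
Other features, objects, and advantages of the present invention will become more apparent from the following detailed description of non-limiting embodiments, made with reference to the accompanying drawings: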
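FIG. 1 is a schematic flowchart of a method for determining resident point information of a mobile user in a computer device according to one embodiment of the present invention;
FIG. 2 is a schematic flowchart of a method for determining resident point information of a mobile user in a computer device according to another embodiment of the present invention;
FIG. 3 is a schematic structural diagram of an apparatus for determining resident point information of a mobile user in a computer device according to one embodiment of the present invention;
FIG. 4 is a schematic structural diagram of an apparatus for determining resident point information of a mobile user in a computer device according to another embodiment of the present invention.
The same or similar reference numerals in the drawings denote the same or similar components.
The present invention is described in further detail below with reference to the accompanying drawings.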
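FIG. 1 is a schematic flowchart of a method for determining resident point information of a mobile user in a computer device according to one embodiment of the present invention.
The method of this embodiment is mainly implemented by a computer device; the computer device includes network devices and user devices. The network device includes, but is not limited to, a single network server, a server group composed of multiple network servers, or a cloud based on cloud computing and composed of a large number of computers or network servers, where cloud computing is a form of distributed computing: a super virtual computer made up of a group of loosely coupled computers. The network in which the network device is located includes, but is not limited to, the Internet, a wide area network, a metropolitan area network, a local area network, a VPN, and the like. The user device includes, but is not limited to, a PC, a tablet computer, a smartphone, a PDA, an IPTV, and the like.
It should be noted that the computer device is only an example; other existing or future computing devices, if applicable to the present invention, should also be included within the scope of protection of the present invention and are incorporated herein by reference.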
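The method according to this embodiment includes step S1 and step S2.
In step S1, the computer device acquires a plurality of pieces of spatiotemporal point information of the mobile user.
The spatiotemporal point information indicates a spatial location of the mobile user and the time point information corresponding to when the mobile user is at that spatial location. Preferably, the spatiotemporal point information may take various forms, including but not limited to a vector of measurements, a point in a multidimensional space, and the like; more preferably, the spatiotemporal point information is a four-dimensional vector.
For example, one piece of spatiotemporal point information of the mobile user is the four-dimensional vector α = (a, b, c, d), where (a, b, c) are the coordinates of the mobile user's spatial location and d is the time point information corresponding to when the mobile user is at that spatial location.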
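As a minimal sketch of the four-dimensional vector α = (a, b, c, d) described above, the following Python class stores one spatiotemporal point. The class name `SpatioTemporalPoint` and the choice of a numeric timestamp for d are illustrative assumptions and are not specified in the source.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpatioTemporalPoint:
    """One spatiotemporal point alpha = (a, b, c, d) (hypothetical container)."""

    a: float  # first spatial coordinate
    b: float  # second spatial coordinate
    c: float  # third spatial coordinate
    d: float  # time point; the unit (e.g. seconds) is an assumption

    def as_vector(self):
        """Return the point as a plain 4-tuple for distance computations."""
        return (self.a, self.b, self.c, self.d)
```

A plain tuple would serve equally well; the named fields only make explicit which component carries the time.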
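Specifically, the computer device may acquire the plurality of pieces of spatiotemporal point information of the mobile user in various ways. For example, the computer device receives a plurality of pieces of spatiotemporal point information of the mobile user from other computer devices; as another example, the mobile user periodically reports its spatiotemporal point information to the computer device, so that over a period of time the computer device receives the plurality of pieces of spatiotemporal point information reported by the mobile user.
It should be noted that the above examples are only intended to better illustrate the technical solution of the present invention and do not limit it; those skilled in the art should understand that any implementation for acquiring a plurality of pieces of spatiotemporal point information of a mobile user falls within the scope of the present invention.
In step S2, the computer device performs cluster analysis on the plurality of spatiotemporal point information based on a clustering algorithm to determine a plurality of pieces of resident point information of the mobile user.
The clustering algorithm includes any algorithm that can be used for cluster analysis, for example, a density-based clustering algorithm, the EM algorithm, and the like. Preferably, the clustering algorithm requires the number of cluster centers to be set; more preferably, the clustering algorithm is a density-based clustering algorithm.
The resident point information includes any information indicating a resident point of the mobile user; preferably, the resident point information includes any information related to a resident point of the mobile user; preferably, a class in the clustering result obtained by the cluster analysis may be used directly as resident point information. More preferably, the resident point information corresponding to a class may be determined by statistical analysis of that class in the clustering result, where the resident point information includes location attribute information and time attribute information; the location attribute information indicates the spatial location or location range of the resident point, and the time attribute information indicates the multiple time points or the time range during which the mobile user is at the resident point.
Specifically, the computer device performs cluster analysis on the plurality of spatiotemporal point information based on the clustering algorithm to obtain a clustering result including multiple classes, and determines the plurality of pieces of resident point information of the mobile user from those classes.
For example, the computer device sets the number of cluster centers of the clustering algorithm to a predetermined number, such as 4; the computer device selects 4 pieces of spatiotemporal point information from the plurality of spatiotemporal point information as cluster centers; for each piece of spatiotemporal point information, the computer device calculates the distance between it and each of the 4 cluster centers and assigns it to the cluster center at the minimum distance; after that, the computer device determines 4 pieces of resident point information of the mobile user from the 4 classes in the clustering result.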
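The assignment step in the example above can be sketched in Python as follows. This is an illustration under stated assumptions: points and centers are plain tuples, and plain Euclidean distance over all four components is assumed, although the source does not say how space and time are weighted against each other. The function name `assign_to_nearest_center` is hypothetical.

```python
import math


def assign_to_nearest_center(points, centers):
    """Group every point with the cluster center at minimum distance.

    Returns one list of points per center, in the same order as `centers`.
    Euclidean distance is an assumption; the source only says "distance".
    """
    classes = [[] for _ in centers]
    for point in points:
        distances = [math.dist(point, center) for center in centers]
        nearest = distances.index(min(distances))  # first center wins ties
        classes[nearest].append(point)
    return classes
```

In practice the space and time components would probably need to be scaled to comparable units before a single distance is meaningful; that scaling is left out of this sketch.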
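As a preferred solution of this embodiment, the clustering algorithm requires the number of cluster centers to be set.
The ways in which the computer device performs cluster analysis on the plurality of spatiotemporal point information, using a clustering algorithm that requires the number of cluster centers to be set, and determines the plurality of pieces of resident point information of the mobile user include, but are not limited to:
1) The number of cluster centers of the clustering algorithm has been determined in advance; the computer device then runs the clustering algorithm directly with that number to perform cluster analysis on the plurality of spatiotemporal point information, and determines the plurality of pieces of resident point information of the mobile user.
2) The number of cluster centers of the clustering algorithm has not been determined; in this case, the computer device first needs to determine a suitable number of cluster centers.
Specifically, in this implementation, the computer device may determine a suitable hypothesis quantity from a plurality of hypothesis quantities as the number of cluster centers. Step S2 further includes step S21 and step S22.
In step S21, for each hypothesis quantity among all or part of the plurality of hypothesis quantities, the computer device sets the number of cluster centers of the clustering algorithm to that hypothesis quantity, performs cluster analysis on the plurality of spatiotemporal point information based on the clustering algorithm to obtain the clustering result corresponding to that hypothesis quantity, and selects one hypothesis quantity according to the plurality of clustering results respectively corresponding to the plurality of hypothesis quantities.
Preferably, the computer device selects one hypothesis quantity according to the plurality of clustering results respectively corresponding to the plurality of hypothesis quantities based on at least one of the following:
- the number of pieces of spatiotemporal point information contained in the classes of the clustering result corresponding to the hypothesis quantity.
Preferably, the more pieces of spatiotemporal point information a class contains, the better the clustering result usually is.
- the dispersion of the classes in the clustering result corresponding to the hypothesis quantity.
Preferably, the lower the dispersion of a class, the better the clustering result usually is.
The dispersion indicates how densely packed a class is. The computer device may determine the dispersion in various ways; for example, the computer device determines the mean of the class from all the spatiotemporal point information in the class, and calculates the range, mean deviation, or standard deviation between each piece of spatiotemporal point information and that mean to represent the dispersion of the class.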
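One way to read the dispersion measures named above is sketched below in Python. The interpretation is an assumption: each point's Euclidean distance to the class mean is computed first, then "range" is taken as the spread of those distances, "mean_dev" as their average, and "std" as their root mean square. The function names and the `kind` parameter are hypothetical.

```python
import math


def class_mean(points):
    """Component-wise mean of all points in a class."""
    dims = len(points[0])
    return tuple(sum(p[i] for p in points) / len(points) for i in range(dims))


def dispersion(points, kind="std"):
    """Dispersion of a class, measured on distances to the class mean.

    kind: "range" (max minus min distance), "mean_dev" (average distance),
    or "std" (root mean square distance). A lower value means a denser class.
    """
    center = class_mean(points)
    distances = [math.dist(p, center) for p in points]
    if kind == "range":
        return max(distances) - min(distances)
    if kind == "mean_dev":
        return sum(distances) / len(distances)
    if kind == "std":
        return math.sqrt(sum(d * d for d in distances) / len(distances))
    raise ValueError(f"unknown dispersion kind: {kind}")
```

Whichever measure is chosen, it should be used consistently when comparing the clustering results of different hypothesis quantities.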
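It should be noted that step S21 may be implemented in various ways. For example, implementations of step S21 include, but are not limited to:
a) In this implementation, step S21 further includes step S2111, step S2112, and step S2113.
In step S2111, for one hypothesis quantity among the plurality of hypothesis quantities whose clustering result has not yet been determined, the computer device sets the number of cluster centers of the clustering algorithm to that hypothesis quantity, and performs cluster analysis on the plurality of spatiotemporal point information based on the clustering algorithm to obtain the clustering result corresponding to that hypothesis quantity.
In step S2112, when the clustering result corresponding to the hypothesis quantity meets a first predetermined condition, the computer device takes that hypothesis quantity as the selected hypothesis quantity.
The first predetermined condition includes any predetermined condition for selecting a hypothesis quantity. Preferably, the first predetermined condition includes but is not limited to:
- the number of pieces of spatiotemporal point information contained in a class of the clustering result exceeds a predetermined number threshold.
- the dispersion of a class in the clustering result is below a predetermined dispersion threshold.
For example, the predetermined number threshold is 100, the clustering result corresponding to the hypothesis quantity includes 4 classes, and the numbers of pieces of spatiotemporal point information in the 4 classes are 120, 110, 108, and 150, respectively. In step S2112, the computer device finds that the number of pieces of spatiotemporal point information in every class of the clustering result exceeds the predetermined number threshold, determines that the clustering result meets the first predetermined condition, and takes this hypothesis quantity as the selected hypothesis quantity.
In step S2113, when the clustering result corresponding to the hypothesis quantity does not meet the first predetermined condition, the computer device repeats step S2111.
Specifically, when the clustering result corresponding to the hypothesis quantity does not meet the first predetermined condition, the computer device repeats step S2111 to obtain the clustering result for another hypothesis quantity whose clustering result has not yet been determined; and so on, until the clustering result corresponding to some hypothesis quantity meets the first predetermined condition, at which point that hypothesis quantity is taken as the selected hypothesis quantity and the operation stops.
For example, the plurality of hypothesis quantities include all natural numbers from 2 to 1000. The first time step S2111 is performed, the computer device selects the hypothesis quantity 2 and, with the number of cluster centers set to 2, performs cluster analysis on the plurality of spatiotemporal point information based on the clustering algorithm to obtain the clustering result corresponding to the hypothesis quantity "2"; next, the computer device determines that the clustering result corresponding to "2" does not meet the first predetermined condition, and in step S2113 the computer device repeats step S2111, selecting the hypothesis quantity "4", whose clustering result has not yet been determined, and determining its clustering result; next, the computer device determines that the clustering result corresponding to "4" does not meet the first predetermined condition and performs step S2113 again; and so on, until the computer device obtains a hypothesis quantity "5" that meets the first predetermined condition and performs step S2112, taking "5" as the selected hypothesis quantity.
In this implementation, the computer device only needs to obtain one hypothesis quantity that meets the first predetermined condition; it can then perform subsequent operations based on that hypothesis quantity, without traversing and obtaining the clustering results of all hypothesis quantities.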
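The early-stopping search of implementation a) can be sketched in Python as follows. This is a sketch, not the patent's implementation: `cluster` is an assumed callable that takes the points and a hypothesis quantity k and returns a list of classes, and the first predetermined condition is taken to be the count threshold from the example (every class must contain more than `min_points` points). The names `select_first_acceptable` and `min_points` are hypothetical.

```python
def select_first_acceptable(candidates, points, cluster, min_points=100):
    """Try hypothesis quantities in order and stop at the first acceptable one.

    candidates: iterable of hypothesis quantities, in the order to be tried.
    cluster:    callable (points, k) -> list of classes (assumed interface).
    Returns (k, classes) for the first k whose classes all contain more than
    `min_points` points, or None if no candidate meets the condition.
    """
    for k in candidates:
        classes = cluster(points, k)  # corresponds to step S2111
        if all(len(cls) > min_points for cls in classes):
            return k, classes  # step S2112: condition met, stop searching
        # step S2113: condition not met, move on to the next candidate
    return None
```

Because the loop returns as soon as one candidate qualifies, later candidates are never clustered, which matches the stated benefit of not traversing every hypothesis quantity.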
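b) In this implementation, the plurality of hypothesis quantities are increasing or decreasing, and step S21 further includes step S2121, step S2122, step S2123, and step S2124.
In step S2121, the computer device takes one of the plurality of hypothesis quantities as the current hypothesis quantity, sets the number of cluster centers of the clustering algorithm to the current hypothesis quantity, and performs cluster analysis on the plurality of spatiotemporal point information based on the clustering algorithm to obtain the clustering result corresponding to the current hypothesis quantity.
For example, the plurality of hypothesis quantities include the natural numbers increasing from 2 to 1000. In step S2121, the computer device takes "2" as the current hypothesis quantity, sets the number of cluster centers of the clustering algorithm to "2", and performs cluster analysis on the plurality of spatiotemporal point information based on the clustering algorithm to obtain the clustering result corresponding to "2".
In step S2122, the computer device sets the number of cluster centers of the clustering algorithm to the next hypothesis quantity after the current hypothesis quantity, and performs cluster analysis on the plurality of spatiotemporal point information based on the clustering algorithm to obtain the clustering result corresponding to the next hypothesis quantity.
For example, the computer device sets the number of cluster centers of the clustering algorithm to "3", the next hypothesis quantity after "2", and performs cluster analysis on the plurality of spatiotemporal point information based on the clustering algorithm to obtain the clustering result corresponding to that next hypothesis quantity.
In step S2123, when the clustering result corresponding to the next hypothesis quantity is worse than the clustering result corresponding to the current hypothesis quantity, the computer device takes the current hypothesis quantity as the selected hypothesis quantity.
Preferably, whether the clustering result corresponding to the next hypothesis quantity is worse than the clustering result corresponding to the current hypothesis quantity may be determined according to the dispersion of the classes in the clustering results and/or the number of pieces of spatiotemporal point information contained in the classes.
For example, the variance E1 among the classes in the clustering result corresponding to the next hypothesis quantity and the variance E2 among the classes in the clustering result corresponding to the current hypothesis quantity may be calculated and compared; when E1 is greater than E2, the computer device may determine that the clustering result corresponding to the next hypothesis quantity is worse than the clustering result corresponding to the current hypothesis quantity; when E1 is less than E2, the computer device may determine that the clustering result corresponding to the next hypothesis quantity is better than the clustering result corresponding to the current hypothesis quantity.
In step S2124, when the clustering result corresponding to the next hypothesis quantity is better than the clustering result corresponding to the current hypothesis quantity, the computer device takes the next hypothesis quantity as the current hypothesis quantity and repeats step S2122.
For example, the current hypothesis quantity is "2", its next hypothesis quantity is "3", and the clustering result corresponding to "3" is better than that corresponding to "2"; the computer device then takes "3" as the current hypothesis quantity and repeats step S2122 to obtain the clustering result for "4"; next, if the clustering result corresponding to "4" is better than that corresponding to "3", the computer device takes "4" as the current hypothesis quantity and repeats step S2122 again; and so on, until the clustering result corresponding to the next hypothesis quantity is worse than the clustering result corresponding to the current hypothesis quantity, at which point, in step S2123, the computer device takes the current hypothesis quantity as the selected hypothesis quantity.
Because, when the plurality of hypothesis quantities are in increasing or decreasing order, the clustering result corresponding to an optimal hypothesis quantity is better than the clustering results corresponding to its two neighboring hypothesis quantities, the computer device can obtain the optimal hypothesis quantity in this implementation. Moreover, because subsequent operations can be performed as soon as the optimal hypothesis quantity has been obtained, without obtaining the clustering results of further hypothesis quantities, this implementation usually does not need to traverse and obtain the clustering results of all hypothesis quantities.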
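The neighbor-comparison search of implementation b) can be sketched in Python as follows. The sketch assumes a callable `cluster(points, k)` that returns a clustering result and a callable `score(result)` where a lower value is better, in the spirit of the E1/E2 comparison above. Both interfaces, the function name, and the handling of equal scores (treated here as "not better", so the search stops) are assumptions, since the source does not cover ties.

```python
def select_by_neighbor_comparison(candidates, points, cluster, score):
    """Walk ordered hypothesis quantities until the next one stops improving.

    candidates: hypothesis quantities in increasing or decreasing order.
    cluster:    callable (points, k) -> clustering result (assumed interface).
    score:      callable (result) -> number, lower is better (assumption).
    Returns (k, result) for the selected hypothesis quantity.
    """
    candidates = list(candidates)
    if not candidates:
        raise ValueError("at least one hypothesis quantity is required")
    current = candidates[0]
    current_result = cluster(points, current)  # step S2121
    for nxt in candidates[1:]:
        next_result = cluster(points, nxt)  # step S2122
        if score(next_result) >= score(current_result):
            break  # step S2123: next is not better, keep the current one
        current, current_result = nxt, next_result  # step S2124
    return current, current_result
```

This finds the first local optimum along the ordered candidates; it coincides with the global optimum only if the score has a single minimum over the candidate range.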
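c) Step S21 further includes step S2131 and step S2132.
In step S2131, for each hypothesis quantity of the plurality of hypothesis quantities, the computer device sets the number of cluster centers of the clustering algorithm to that hypothesis quantity, and performs cluster analysis on the plurality of spatiotemporal point information based on the clustering algorithm to obtain the clustering result corresponding to that hypothesis quantity.
For example, there are 4 hypothesis quantities: 2, 3, 4, and 5. Based on the clustering algorithm, the computer device obtains the clustering result when the number of cluster centers is 2, the clustering result when it is 3, the clustering result when it is 4, and the clustering result when it is 5.
In step S2132, the computer device selects one hypothesis quantity according to the plurality of clustering results respectively corresponding to the plurality of hypothesis quantities.
The ways in which the computer device selects one hypothesis quantity according to the plurality of clustering results respectively corresponding to the plurality of hypothesis quantities have been described in detail above and are not repeated here.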
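The exhaustive variant of implementation c) can be sketched in Python as follows: every hypothesis quantity is clustered first, and only then is one selected. As assumptions of this sketch, `cluster(points, k)` returns a clustering result and `score(result)` returns a number where lower is better (for example an average class dispersion); the source leaves the exact selection criterion open.

```python
def select_best_overall(candidates, points, cluster, score):
    """Cluster with every hypothesis quantity, then pick the best-scoring one.

    cluster: callable (points, k) -> clustering result (assumed interface).
    score:   callable (result) -> number, lower is better (assumption).
    Returns (k, result) for the hypothesis quantity with the lowest score.
    """
    # Step S2131: one clustering result per hypothesis quantity.
    results = {k: cluster(points, k) for k in candidates}
    # Step S2132: select one hypothesis quantity from all the results.
    best_k = min(results, key=lambda k: score(results[k]))
    return best_k, results[best_k]
```

Compared with the two early-stopping variants, this costs one clustering run per candidate but cannot be misled by a local optimum.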
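It should be noted that the plurality of hypothesis quantities may be expressed as a set, such as the set [2, 3, 4, ..., 1000], in which case the computer device can read hypothesis quantities directly from the set. Alternatively, the plurality of hypothesis quantities may be expressed as a formula, such as k = K + nΔ, where k denotes the hypothesis quantity, K is the base (K can usually be taken as 2), Δ = 1, and n = 0, 1, 2, ..., 998; the computer device can then compute the hypothesis quantities it needs from this formula.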
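The formula k = K + nΔ can be turned into a small Python generator, shown here as a sketch. The function and parameter names are hypothetical; the defaults follow the values given above (K = 2, Δ = 1, n = 0, ..., 998).

```python
def hypothesis_quantities(base=2, step=1, count=999):
    """Yield k = base + n * step for n = 0, 1, ..., count - 1."""
    for n in range(count):
        yield base + n * step
```

With the defaults this yields the same values as the set [2, 3, 4, ..., 1000], while a larger step produces a sparser sequence of candidates.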
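It should be noted that the above examples are only intended to better illustrate the technical solution of the present invention and do not limit it; those skilled in the art should understand that any implementation that, for each hypothesis quantity among all or part of a plurality of hypothesis quantities, sets the number of cluster centers of the clustering algorithm to that hypothesis quantity, performs cluster analysis on the plurality of spatiotemporal point information based on the clustering algorithm to obtain the corresponding clustering result, and selects one hypothesis quantity according to the plurality of clustering results respectively corresponding to the plurality of hypothesis quantities falls within the scope of the present invention.
In step S22, the computer device determines the plurality of pieces of resident point information of the mobile user according to the clustering result corresponding to the selected hypothesis quantity.
The computer device may determine the plurality of pieces of resident point information of the mobile user from the clustering result corresponding to the selected hypothesis quantity in various ways.
For example, the computer device may directly use the multiple classes of the clustering result as the multiple pieces of resident point information of the mobile user.
As another example, for each class in the clustering result, the computer device may perform statistical analysis on the class, such as separately compiling statistics on the spatial positions and the time point information of all the spatiotemporal point information in the class, to determine the resident point information corresponding to that class.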
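The statistical analysis of a class described above might look like the following Python sketch. The specific statistics are assumptions chosen for illustration: the location attribute is taken as the mean spatial position, and the time attribute as the earliest and latest time point in the class. The function name and the dictionary keys are hypothetical.

```python
def summarize_class(points):
    """Derive resident point information from one class of (a, b, c, d) tuples.

    Returns a dict with:
      "location":   mean of the spatial coordinates (a, b, c)
      "time_range": (earliest, latest) time point d in the class
    Both statistics are illustrative choices, not taken from the source.
    """
    n = len(points)
    location = tuple(sum(p[i] for p in points) / n for i in range(3))
    times = [p[3] for p in points]
    return {"location": location, "time_range": (min(times), max(times))}
```

Other statistics, such as a bounding region for the location or a histogram of hours of the day for the time, would fit the same description.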
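It should be noted that the above examples are only intended to better illustrate the technical solution of the present invention and do not limit it; those skilled in the art should understand that any implementation that determines the plurality of pieces of resident point information of the mobile user according to the clustering result corresponding to the selected hypothesis quantity falls within the scope of the present invention.
According to the solution of this embodiment, multiple resident points of a mobile user can be determined by performing cluster analysis on the mobile user's spatiotemporal point information, so that the mobile user's range of activity and daily routine can be understood more accurately.
FIG. 2 is a schematic flowchart of a method for determining resident point information of a mobile user in a computer device according to another embodiment of the present invention. The method of this embodiment is mainly implemented by a computer device, and any description of the computer device given with reference to the embodiment shown in FIG. 1 is incorporated herein by reference.
The method according to this embodiment includes step S1, step S2, and step S3. Step S1 and step S2 have been described in detail with reference to the embodiment shown in FIG. 1 and are not described again here.
In step S3, the computer device determines, according to the plurality of pieces of resident point information, the type of each piece of resident point information.
The type of the resident point information indicates the nature of the mobile user's resident point, such as a home, a restaurant, an entertainment venue, a workplace, and the like.
Specifically, for each piece of resident point information, the computer device determines its type by analyzing that resident point information.
For example, based on the resident point information and a map, the computer device determines that the location range corresponding to the resident point information lies within a residential area, and therefore determines that the type of the resident point information is home.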
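Preferably, step S3 further includes step S31 and step S32, which are performed for each of the plurality of pieces of resident point information.
In step S31, the computer device acquires the location attribute information and the time attribute information of the resident point information.
The computer device may acquire the location attribute information and the time attribute information of the resident point information in various ways.
For example, when the resident point information is a class in the clustering result, the computer device performs statistical analysis on all the spatiotemporal point information in the class, obtaining the location attribute information of the resident point information from the spatial positions and the time attribute information of the resident point information from the time point information.
As another example, when the resident point information was itself obtained by statistical analysis of a class in the clustering result, the computer device may directly extract the location attribute information and the time attribute information from the resident point information.
It should be noted that the above examples are only intended to better illustrate the technical solution of the present invention and do not limit it; those skilled in the art should understand that any implementation for acquiring the location attribute information and the time attribute information of the resident point information falls within the scope of the present invention.
In step S32, the computer device determines the type of the resident point information according to the location attribute information and the time attribute information.
For example, the time attribute information of the resident point information indicates that the times at which the mobile user is at the resident point are concentrated between 9:00 and 18:00 from Monday to Friday each week, and the location attribute information indicates that the resident point is located in an office building; the computer device then determines that the type of the resident point information is workplace.
As another example, the time attribute information of the resident point information indicates that the times at which the mobile user is at the resident point are concentrated between 21:00 and 24:00 on weekends, and the location attribute information indicates that the resident point is near a commercial district; the computer device then determines that the type of the resident point information is entertainment venue.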
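The two examples above, together with the earlier residential-area example, can be expressed as a small rule table in Python. This is only an illustrative sketch: the function name, the `location_kind` labels, the encoding of days as integers (0 = Monday, 6 = Sunday), and the fallback value "unknown" are all assumptions, and a real system would need a map lookup to produce `location_kind`.

```python
WEEKDAYS = {0, 1, 2, 3, 4}  # Monday to Friday
WEEKEND = {5, 6}            # Saturday and Sunday


def classify_resident_point(location_kind, days, start_hour, end_hour):
    """Map location and time attributes to a resident point type.

    location_kind: hypothetical label such as "office_building".
    days:          days of the week on which the user is at the point.
    start_hour, end_hour: hours between which the visits are concentrated.
    """
    days = set(days)
    if (location_kind == "office_building" and days <= WEEKDAYS
            and start_hour >= 9 and end_hour <= 18):
        return "workplace"
    if (location_kind == "commercial_district" and days <= WEEKEND
            and start_hour >= 21 and end_hour <= 24):
        return "entertainment venue"
    if location_kind == "residential_area":
        return "home"
    return "unknown"
```

Fixed rules like these are easy to read but brittle; the source leaves the decision procedure open, so a learned classifier over the same attributes would be equally consistent with it.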
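It should be noted that the above examples are only intended to better illustrate the technical solution of the present invention and do not limit it; those skilled in the art should understand that any implementation that determines the type of the resident point information according to the location attribute information and the time attribute information falls within the scope of the present invention.
According to the solution of this embodiment, the type of each resident point of the mobile user can be determined from the mobile user's multiple pieces of resident point information, and the probability that the user appears in the area of a given resident point can be predicted to some extent.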
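FIG. 3 is a schematic structural diagram of an apparatus for determining resident point information of a mobile user in a computer device according to one embodiment of the present invention. The apparatus for determining resident point information of a mobile user according to this embodiment includes a device for acquiring a plurality of pieces of spatiotemporal point information of the mobile user (hereinafter referred to as "first obtaining device 1") and a device for performing cluster analysis on the plurality of spatiotemporal point information based on a clustering algorithm to determine a plurality of pieces of resident point information of the mobile user (hereinafter referred to as "first determining device 2").
The first obtaining device 1 acquires a plurality of pieces of spatiotemporal point information of the mobile user.
The spatiotemporal point information indicates a spatial location of the mobile user and the time point information corresponding to when the mobile user is at that spatial location. Preferably, the spatiotemporal point information may take various forms, including but not limited to a vector of measurements, a point in a multidimensional space, and the like; more preferably, the spatiotemporal point information is a four-dimensional vector.
For example, one piece of spatiotemporal point information of the mobile user is the four-dimensional vector α = (a, b, c, d), where (a, b, c) are the coordinates of the mobile user's spatial location and d is the time point information corresponding to when the mobile user is at that spatial location.
Specifically, the first obtaining device 1 may acquire the plurality of pieces of spatiotemporal point information of the mobile user in various ways. For example, the first obtaining device 1 receives a plurality of pieces of spatiotemporal point information of the mobile user from other computer devices; as another example, the mobile user periodically reports its spatiotemporal point information to the computer device, so that over a period of time the first obtaining device 1 receives the plurality of pieces of spatiotemporal point information reported by the mobile user.
It should be noted that the above examples are only intended to better illustrate the technical solution of the present invention and do not limit it; those skilled in the art should understand that any implementation for acquiring a plurality of pieces of spatiotemporal point information of a mobile user falls within the scope of the present invention.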
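The first determining device 2 performs cluster analysis on the plurality of spatiotemporal point information based on a clustering algorithm to determine a plurality of pieces of resident point information of the mobile user.
The clustering algorithm includes any algorithm that can be used for cluster analysis, for example, a density-based clustering algorithm, the EM algorithm, and the like. Preferably, the clustering algorithm requires the number of cluster centers to be set; more preferably, the clustering algorithm is a density-based clustering algorithm.
The resident point information includes any information indicating a resident point of the mobile user; preferably, the resident point information includes any information related to a resident point of the mobile user; preferably, a class in the clustering result obtained by the cluster analysis may be used directly as resident point information. More preferably, the resident point information corresponding to a class may be determined by statistical analysis of that class in the clustering result, where the resident point information includes location attribute information and time attribute information; the location attribute information indicates the spatial location or location range of the resident point, and the time attribute information indicates the multiple time points or the time range during which the mobile user is at the resident point.
Specifically, the first determining device 2 performs cluster analysis on the plurality of spatiotemporal point information based on the clustering algorithm to obtain a clustering result including multiple classes, and determines the plurality of pieces of resident point information of the mobile user from those classes.
For example, the first determining device 2 sets the number of cluster centers of the clustering algorithm to a predetermined number, such as 4; the first determining device 2 selects 4 pieces of spatiotemporal point information from the plurality of spatiotemporal point information as cluster centers; for each piece of spatiotemporal point information, the first determining device 2 calculates the distance between it and each of the 4 cluster centers and assigns it to the cluster center at the minimum distance; after that, the first determining device 2 determines 4 pieces of resident point information of the mobile user from the 4 classes in the clustering result.
As a preferred solution of this embodiment, the clustering algorithm requires the number of cluster centers to be set.
The ways in which the first determining device 2 performs cluster analysis on the plurality of spatiotemporal point information, using a clustering algorithm that requires the number of cluster centers to be set, and determines the plurality of pieces of resident point information of the mobile user include, but are not limited to:
1) The number of cluster centers of the clustering algorithm has been determined in advance; the first determining device 2 then runs the clustering algorithm directly with that number to perform cluster analysis on the plurality of spatiotemporal point information, and determines the plurality of pieces of resident point information of the mobile user.
2) The number of cluster centers of the clustering algorithm has not been determined; in this case, the first determining device 2 first needs to determine a suitable number of cluster centers.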
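Specifically, in this implementation, the first determining device 2 may determine a suitable hypothesis quantity from a plurality of hypothesis quantities as the number of cluster centers. The first determining device 2 further includes: a device (not shown, hereinafter referred to as "selection device") for, for each hypothesis quantity among all or part of the plurality of hypothesis quantities, setting the number of cluster centers of the clustering algorithm to that hypothesis quantity, performing cluster analysis on the plurality of spatiotemporal point information based on the clustering algorithm to obtain the corresponding clustering result, and selecting one hypothesis quantity according to the plurality of clustering results respectively corresponding to the plurality of hypothesis quantities; and a device (not shown, hereinafter referred to as "first sub-determination device") for determining the plurality of pieces of resident point information of the mobile user according to the clustering result corresponding to the selected hypothesis quantity.
For each hypothesis quantity among all or part of the plurality of hypothesis quantities, the selection device sets the number of cluster centers of the clustering algorithm to that hypothesis quantity, performs cluster analysis on the plurality of spatiotemporal point information based on the clustering algorithm to obtain the clustering result corresponding to that hypothesis quantity, and selects one hypothesis quantity according to the plurality of clustering results respectively corresponding to the plurality of hypothesis quantities.
Preferably, the selection device selects one hypothesis quantity according to the plurality of clustering results respectively corresponding to the plurality of hypothesis quantities based on at least one of the following:
- the number of pieces of spatiotemporal point information contained in the classes of the clustering result corresponding to the hypothesis quantity.
Preferably, the more pieces of spatiotemporal point information a class contains, the better the clustering result usually is.
- the dispersion of the classes in the clustering result corresponding to the hypothesis quantity.
Preferably, the lower the dispersion of a class, the better the clustering result usually is.
The dispersion indicates how densely packed a class is. The selection device may determine the dispersion in various ways; for example, the computer device determines the mean of the class from all the spatiotemporal point information in the class, and calculates the range, mean deviation, or standard deviation between each piece of spatiotemporal point information and that mean to represent the dispersion of the class.
It should be noted that the selection device may be implemented in various ways. For example, implementations of the selection device include, but are not limited to: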
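a) In this implementation, the selection device further includes: a device (not shown, hereinafter referred to as "first clustering device") for, for one hypothesis quantity among the plurality of hypothesis quantities whose clustering result has not yet been determined, setting the number of cluster centers of the clustering algorithm to that hypothesis quantity and performing cluster analysis on the plurality of spatiotemporal point information based on the clustering algorithm to obtain the corresponding clustering result; a device (not shown, hereinafter referred to as "first setting device") for taking the hypothesis quantity as the selected hypothesis quantity when its clustering result meets a first predetermined condition; and a device (not shown, hereinafter referred to as "first triggering device") for triggering the first clustering device to repeat its operation when the clustering result corresponding to the hypothesis quantity does not meet the first predetermined condition.
For one hypothesis quantity among the plurality of hypothesis quantities whose clustering result has not yet been determined, the first clustering device sets the number of cluster centers of the clustering algorithm to that hypothesis quantity, and performs cluster analysis on the plurality of spatiotemporal point information based on the clustering algorithm to obtain the clustering result corresponding to that hypothesis quantity.
When the clustering result corresponding to the hypothesis quantity meets the first predetermined condition, the first setting device takes that hypothesis quantity as the selected hypothesis quantity.
The first predetermined condition includes any predetermined condition for selecting a hypothesis quantity. Preferably, the first predetermined condition includes but is not limited to:
- the number of pieces of spatiotemporal point information contained in a class of the clustering result exceeds a predetermined number threshold.
- the dispersion of a class in the clustering result is below a predetermined dispersion threshold.
For example, the predetermined number threshold is 100, the clustering result corresponding to the hypothesis quantity includes 4 classes, and the numbers of pieces of spatiotemporal point information in the 4 classes are 120, 110, 108, and 150, respectively. The first setting device finds that the number of pieces of spatiotemporal point information in every class of the clustering result exceeds the predetermined number threshold, determines that the clustering result meets the first predetermined condition, and takes this hypothesis quantity as the selected hypothesis quantity.
When the clustering result corresponding to the hypothesis quantity does not meet the first predetermined condition, the first triggering device triggers the first clustering device to repeat its operation.
Specifically, when the clustering result corresponding to the hypothesis quantity does not meet the first predetermined condition, the first triggering device triggers the first clustering device to repeat its operation to obtain the clustering result for another hypothesis quantity whose clustering result has not yet been determined; and so on, until the clustering result corresponding to some hypothesis quantity meets the first predetermined condition, at which point the first setting device takes that hypothesis quantity as the selected hypothesis quantity and the operation stops.
For example, the plurality of hypothesis quantities include all natural numbers from 2 to 1000. When the first clustering device performs its operation for the first time, the selected hypothesis quantity is 2; with the number of cluster centers set to 2, cluster analysis is performed on the plurality of spatiotemporal point information based on the clustering algorithm to obtain the clustering result corresponding to the hypothesis quantity "2"; next, because the clustering result corresponding to "2" does not meet the first predetermined condition, the first triggering device triggers the first clustering device to repeat its operation, selecting the hypothesis quantity "4", whose clustering result has not yet been determined, and determining its clustering result; next, because the clustering result corresponding to "4" does not meet the first predetermined condition, the first triggering device again triggers the first clustering device to repeat its operation; and so on, until a hypothesis quantity "5" meeting the first predetermined condition is obtained, and the first setting device takes "5" as the selected hypothesis quantity.
In this implementation, the computer device only needs to obtain one hypothesis quantity that meets the first predetermined condition; it can then perform subsequent operations based on that hypothesis quantity, without traversing and obtaining the clustering results of all hypothesis quantities.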
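b) In this implementation, the plurality of hypothesis quantities are in increasing or decreasing order, and the selecting means further comprises: means for taking one of the plurality of hypothesis quantities as the current hypothesis quantity, setting the number of cluster centers of the clustering algorithm to the current hypothesis quantity, and performing cluster analysis on the plurality of pieces of space-time point information based on the clustering algorithm to obtain the clustering result corresponding to the current hypothesis quantity (not shown, hereinafter the "second clustering means"); means for setting the number of cluster centers of the clustering algorithm to the hypothesis quantity that follows the current one, and performing cluster analysis on the plurality of pieces of space-time point information based on the clustering algorithm to obtain the clustering result corresponding to that next hypothesis quantity (not shown, hereinafter the "third clustering means"); means for taking the current hypothesis quantity as the selected hypothesis quantity when the clustering result of the next hypothesis quantity is worse than that of the current hypothesis quantity (not shown, hereinafter the "second setting means"); and means for, when the clustering result of the next hypothesis quantity is better than that of the current hypothesis quantity, taking the next hypothesis quantity as the current hypothesis quantity and triggering the means for obtaining the clustering result corresponding to the next hypothesis quantity to repeat its operation (not shown, hereinafter the "second triggering means").
The second clustering means takes one of the plurality of hypothesis quantities as the current hypothesis quantity, sets the number of cluster centers of the clustering algorithm to the current hypothesis quantity, and performs cluster analysis on the plurality of pieces of space-time point information based on the clustering algorithm to obtain the clustering result corresponding to the current hypothesis quantity.
For example, the plurality of hypothesis quantities includes natural numbers increasing from 2 to 1000. The second clustering means takes "2" as the current hypothesis quantity, sets the number of cluster centers of the clustering algorithm to "2", and performs cluster analysis on the plurality of pieces of space-time point information based on the clustering algorithm to obtain the clustering result corresponding to "2".
The third clustering means sets the number of cluster centers of the clustering algorithm to the hypothesis quantity that follows the current one, and performs cluster analysis on the plurality of pieces of space-time point information based on the clustering algorithm to obtain the clustering result corresponding to that next hypothesis quantity.
For example, the third clustering means sets the number of cluster centers of the clustering algorithm to "3", the hypothesis quantity that follows "2", and performs cluster analysis on the plurality of pieces of space-time point information based on the clustering algorithm to obtain the clustering result corresponding to that next hypothesis quantity.
When the clustering result of the next hypothesis quantity is worse than that of the current hypothesis quantity, the second setting means takes the current hypothesis quantity as the selected hypothesis quantity.
Preferably, whether the clustering result of the next hypothesis quantity is worse than that of the current hypothesis quantity may be determined from the dispersion of the classes in the clustering results and/or the number of pieces of space-time point information the classes contain.
For example, the variance E1 between the classes in the clustering result of the next hypothesis quantity and the variance E2 between the classes in the clustering result of the current hypothesis quantity may be computed and compared. When E1 is greater than E2, the second setting means may determine that the clustering result of the next hypothesis quantity is worse than that of the current hypothesis quantity; when E1 is less than E2, the second setting means may determine that the clustering result of the next hypothesis quantity is better than that of the current hypothesis quantity.
When the clustering result of the next hypothesis quantity is better than that of the current hypothesis quantity, the second triggering means triggers the third clustering means to repeat its operation.
For example, the current hypothesis quantity is "2", the next hypothesis quantity after "2" is "3", and the clustering result for "3" is better than that for "2"; the second triggering means then takes "3" as the current hypothesis quantity and triggers the third clustering means to repeat its operation and obtain the clustering result for "4". Next, if the clustering result for "4" is better than that for "3", the second triggering means takes "4" as the current hypothesis quantity and again triggers the third clustering means to repeat its operation; and so on, until the clustering result of the next hypothesis quantity is worse than that of the current hypothesis quantity, at which point the second setting means takes the current hypothesis quantity as the selected hypothesis quantity.
When the plurality of hypothesis quantities are in increasing or decreasing order, the clustering result of an optimal hypothesis quantity is better than the clustering results of its two neighboring hypothesis quantities; this implementation can therefore obtain the optimal hypothesis quantity. Moreover, once the optimal hypothesis quantity has been obtained, subsequent operations can be performed based on it without obtaining clustering results for further hypothesis quantities, so this implementation usually does not need to traverse all hypothesis quantities and obtain clustering results for all of them.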
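The neighbor comparison of implementation b) can be sketched as follows. The sketch assumes a hypothetical `score(k)` callable that clusters with k centers and returns a quality measure where lower is better (such as the variance E in the example above); the name `select_by_neighbor_comparison`, the made-up scores, and the choice to keep going when two scores are equal are all assumptions.

```python
def select_by_neighbor_comparison(candidates, score):
    """Walk ordered hypothesis quantities until the next one scores worse.

    `candidates` must be non-empty and in increasing or decreasing order.
    `score(k)` returns a quality measure where lower means better.
    """
    it = iter(candidates)
    current = next(it)
    current_score = score(current)
    for nxt in it:
        nxt_score = score(nxt)
        if nxt_score > current_score:
            # The next quantity is worse: the current one is selected.
            return current
        # The next quantity is at least as good: it becomes the current one.
        current, current_score = nxt, nxt_score
    return current  # ran out of candidates without getting worse


# Made-up scores standing in for real clustering runs.
FAKE_SCORE = {2: 8.0, 3: 5.0, 4: 3.0, 5: 4.5, 6: 1.0}
scored = []


def fake_score(k):
    scored.append(k)
    return FAKE_SCORE[k]


chosen = select_by_neighbor_comparison([2, 3, 4, 5, 6], fake_score)
```

With these scores the walk stops at 4 because 5 scores worse, and 6 is never evaluated. The made-up value for 6 also shows the limit of the approach: the walk finds the first quantity that beats its neighbors, which is the best overall only when quality has a single optimum along the ordered quantities.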
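c) The selecting means further comprises: means for, for each of the plurality of hypothesis quantities, setting the number of cluster centers of the clustering algorithm to that hypothesis quantity and performing cluster analysis on the plurality of pieces of space-time point information based on the clustering algorithm to obtain the clustering result corresponding to that hypothesis quantity (not shown, hereinafter the "fourth clustering means"); and means for selecting one hypothesis quantity according to the plurality of clustering results respectively corresponding to the plurality of hypothesis quantities (not shown, hereinafter the "sub-selecting means").
For each of the plurality of hypothesis quantities, the fourth clustering means sets the number of cluster centers of the clustering algorithm to that hypothesis quantity and performs cluster analysis on the plurality of pieces of space-time point information based on the clustering algorithm to obtain the clustering result corresponding to that hypothesis quantity.
For example, there are 4 hypothesis quantities: 2, 3, 4, and 5. Based on the clustering algorithm, the fourth clustering means obtains the clustering result with 2 cluster centers, the clustering result with 3 cluster centers, the clustering result with 4 cluster centers, and the clustering result with 5 cluster centers.
The sub-selecting means selects one hypothesis quantity according to the plurality of clustering results respectively corresponding to the plurality of hypothesis quantities.
The way in which the sub-selecting means selects one hypothesis quantity according to the plurality of clustering results is similar to the way, described above, in which the selecting means selects one hypothesis quantity according to the plurality of clustering results, and is not described again here.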
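Implementation c) can be sketched end to end with a small pure-Python k-means. Everything here is an illustrative assumption: the text does not name k-means, and the farthest-point initialization, the `min_points` filter, and the mean-distance dispersion score are one possible combination of the two selection criteria (class size and dispersion) mentioned earlier.

```python
from math import dist
from statistics import mean


def kmeans(points, k, iterations=10):
    """Plain k-means with deterministic farthest-point initialization."""
    centers = [points[0]]
    while len(centers) < k:
        # Next center: the point farthest from all centers chosen so far.
        centers.append(max(points, key=lambda p: min(dist(p, c) for c in centers)))
    for _ in range(iterations):
        classes = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: dist(p, centers[i]))
            classes[nearest].append(p)
        # Recompute each center; keep the old one if a class went empty.
        centers = [
            tuple(mean(col) for col in zip(*cls)) if cls else centers[i]
            for i, cls in enumerate(classes)
        ]
    return classes, centers


def mean_dispersion(classes, centers):
    """Average distance between a point and the center of its class."""
    total = sum(dist(p, c) for cls, c in zip(classes, centers) for p in cls)
    return total / sum(len(cls) for cls in classes)


def select_exhaustively(points, candidates, min_points):
    """Cluster once per candidate k, then pick the best acceptable result."""
    best_k, best_score = None, None
    for k in candidates:
        classes, centers = kmeans(points, k)
        if min(len(cls) for cls in classes) < min_points:
            continue  # some class is too small to count as a resident point
        score = mean_dispersion(classes, centers)
        if best_score is None or score < best_score:
            best_k, best_score = k, score
    return best_k


# Three tight 3x3 grids of points around three well-separated locations.
offsets = [(dx * 0.1, dy * 0.1) for dx in range(3) for dy in range(3)]
points = [(cx + ox, cy + oy)
          for cx, cy in [(0, 0), (10, 0), (0, 10)] for ox, oy in offsets]
chosen = select_exhaustively(points, [2, 3, 4, 5], min_points=9)
```

On this synthetic data the selected quantity is 3: with 2 centers two groups are merged and the dispersion is large, while with 4 or 5 centers one group is split into classes smaller than `min_points`.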
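It should be noted that the plurality of hypothesis quantities may be represented as a set, for example the set [2, 3, 4, ..., 1000], in which case the computer device can read the hypothesis quantities directly from the set. Alternatively, the plurality of hypothesis quantities may be represented as a formula, for example k = K + nΔ, where k denotes a hypothesis quantity, K is the base (typically K may be 2), Δ = 1, and n = 0, 1, 2, ..., 998; the computer device can then compute the hypothesis quantities it needs from the formula.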
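The formula form can be sketched as a small Python generator. The name `hypothesis_quantities` and its parameter names are hypothetical; the defaults mirror the values given above (K = 2, Δ = 1, n = 0 to 998).

```python
def hypothesis_quantities(base=2, step=1, count=999):
    """Yield k = base + n * step for n = 0, 1, ..., count - 1."""
    for n in range(count):
        yield base + n * step


ks = list(hypothesis_quantities())
```

With the defaults, `ks` runs from 2 to 1000, the same values as the set form. Generating the quantities lazily suits the implementations that stop early, since later quantities are never computed.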
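It should be noted that the above examples are given only to better illustrate the technical solution of the present invention and do not limit it. Those skilled in the art should understand that any implementation that, for each hypothesis quantity among all or some of a plurality of hypothesis quantities, sets the number of cluster centers of the clustering algorithm to that hypothesis quantity, performs cluster analysis on the plurality of pieces of space-time point information based on the clustering algorithm to obtain the clustering result corresponding to that hypothesis quantity, and selects one hypothesis quantity according to the plurality of clustering results respectively corresponding to the hypothesis quantities, falls within the scope of the present invention.
The first sub-determining means determines the plurality of pieces of resident point information of the mobile user according to the clustering result corresponding to the selected hypothesis quantity.
The first sub-determining means may do so in various ways.
For example, the first sub-determining means may directly take the classes of the clustering result as the plurality of pieces of resident point information of the mobile user.
As another example, for each class in the clustering result, the first sub-determining means may determine the resident point information corresponding to that class by statistically analyzing the class, for instance by separately computing statistics over the spatial locations and over the time point information of all the space-time point information in the class.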
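The statistical analysis of one class can be sketched as follows. The point layout (longitude, latitude, weekday, hour), the function name `summarize_class`, and the chosen statistics are all hypothetical; the text only says that spatial locations and time point information are summarized separately.

```python
from collections import Counter
from statistics import mean


def summarize_class(points):
    """Turn one class of space-time points into resident point information.

    Assumed layout of each point: (longitude, latitude, weekday, hour),
    with weekday 0 = Monday. This layout is hypothetical.
    """
    weekday_counts = Counter(p[2] for p in points)
    hours = [p[3] for p in points]
    return {
        # Spatial statistics: the mean position of the class.
        "center": (mean(p[0] for p in points), mean(p[1] for p in points)),
        # Temporal statistics: which days and which hours the class covers.
        "weekdays": sorted(weekday_counts),
        "hour_span": (min(hours), max(hours)),
        "point_count": len(points),
    }


info = summarize_class([
    (116.0, 40.0, 0, 9),
    (116.0, 40.0, 2, 13),
    (119.0, 43.0, 4, 18),
])
```

The hour span is taken as a plain minimum and maximum, which would not handle a class that straddles midnight; a real implementation would need to treat the hour as circular.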
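It should be noted that the above examples are given only to better illustrate the technical solution of the present invention and do not limit it. Those skilled in the art should understand that any implementation that determines the plurality of pieces of resident point information of the mobile user according to the clustering result corresponding to the selected hypothesis quantity falls within the scope of the present invention.
According to the solution of this embodiment, a plurality of resident points of the mobile user can be determined by performing cluster analysis on the mobile user's space-time point information, so that the mobile user's range of activity and daily routine can be understood more accurately.
FIG. 4 is a schematic structural diagram of an apparatus for determining resident point information of a mobile user in a computer device according to another embodiment of the present invention. The apparatus according to this embodiment comprises first acquiring means 1, first determining means 2, and means for determining, according to the plurality of pieces of resident point information, the type of each piece of resident point information (hereinafter the "second determining means 3"). The first acquiring means 1 and the first determining means 2 have been described in detail with reference to the embodiment shown in FIG. 3 and are not described again here.
The second determining means 3 determines, according to the plurality of pieces of resident point information, the type of each piece of resident point information.
The type of a piece of resident point information indicates the nature of the mobile user's resident point, such as home, restaurant, entertainment venue, or workplace.
Specifically, for each piece of resident point information, the second determining means 3 determines its type by analyzing that resident point information.
For example, if the second determining means 3 determines from the resident point information and a map that the location range corresponding to the resident point information lies within a residential area, the second determining means 3 determines that the type of the resident point information is home.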
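Preferably, the second determining means 3 further comprises means for acquiring location attribute information and time attribute information of the resident point information (not shown, hereinafter the "second acquiring means") and means for determining the type of the resident point information according to the location attribute information and the time attribute information (not shown, hereinafter the "second sub-determining means").
The second acquiring means acquires the location attribute information and the time attribute information of the resident point information.
The second acquiring means may do so in various ways.
For example, when the resident point information is a class in the clustering result, the second acquiring means statistically analyzes all the space-time point information in the class, obtaining the location attribute information of the resident point information from the spatial locations in all the space-time point information, and the time attribute information of the resident point information from the time point information in all the space-time point information.
As another example, when the resident point information has been obtained by statistically analyzing a class in the clustering result, the second acquiring means may extract the location attribute information and the time attribute information directly from the resident point information.
It should be noted that the above examples are given only to better illustrate the technical solution of the present invention and do not limit it. Those skilled in the art should understand that any implementation that acquires the location attribute information and the time attribute information of the resident point information falls within the scope of the present invention.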
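The second sub-determining means determines the type of the resident point information according to the location attribute information and the time attribute information.
For example, if the time attribute information of the resident point information indicates that the times at which the mobile user is at the resident point are concentrated between 9:00 and 18:00 from Monday to Friday each week, and the location attribute information indicates that the resident point is located at an office building, the second sub-determining means determines that the type of the resident point information is workplace.
As another example, if the time attribute information of the resident point information indicates that the times at which the mobile user is at the resident point are concentrated between 21:00 and 24:00 on weekends, and the location attribute information indicates that the resident point is located near a commercial district, the second sub-determining means determines that the type of the resident point information is entertainment venue.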
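The examples in this embodiment can be sketched as a small rule table. The function name `classify_resident_point`, the location labels such as `"office_building"`, and the encoding of weekdays as integers (0 = Monday) are hypothetical; the rules are limited to the three cases the text mentions (workplace, entertainment venue, and home in a residential area).

```python
WORKING_DAYS = {0, 1, 2, 3, 4}  # Monday to Friday
WEEKEND_DAYS = {5, 6}           # Saturday and Sunday


def classify_resident_point(location_kind, weekdays, start_hour, end_hour):
    """Map location and time attributes to a resident point type.

    `weekdays` lists the days on which the user is seen at the point, and
    `start_hour`/`end_hour` bound the hours in which visits concentrate.
    """
    days = set(weekdays)
    if (location_kind == "office_building" and days and days <= WORKING_DAYS
            and start_hour >= 9 and end_hour <= 18):
        return "workplace"
    if (location_kind == "commercial_area" and days and days <= WEEKEND_DAYS
            and start_hour >= 21 and end_hour <= 24):
        return "entertainment venue"
    if location_kind == "residential_area":
        return "home"
    return "unknown"


kind = classify_resident_point("office_building", [0, 1, 2, 3, 4], 9, 18)
```

Fixed rules like these are only one option; the text leaves open any method that derives the type from the two kinds of attribute.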
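It should be noted that the above examples are given only to better illustrate the technical solution of the present invention and do not limit it. Those skilled in the art should understand that any implementation that determines the type of the resident point information according to the location attribute information and the time attribute information falls within the scope of the present invention.
According to the solution of this embodiment, the type of each resident point of the mobile user can be determined from the mobile user's plurality of pieces of resident point information, and the probability that the user appears in a given resident point area can be predicted to some extent.
It should be noted that the present invention may be implemented in software and/or a combination of software and hardware; for example, the means of the present invention may be implemented using an application-specific integrated circuit (ASIC) or any other similar hardware device. In one embodiment, the software program of the present invention may be executed by a processor to implement the steps or functions described above. Likewise, the software program of the present invention (including related data structures) may be stored in a computer-readable recording medium, for example a RAM memory, a magnetic or optical drive, a floppy disk, or a similar device. In addition, some steps or functions of the present invention may be implemented in hardware, for example as circuitry that cooperates with a processor to perform the steps or functions.
It is evident to those skilled in the art that the present invention is not limited to the details of the above exemplary embodiments and can be implemented in other specific forms without departing from its spirit or essential characteristics. The embodiments should therefore be regarded in all respects as exemplary and non-limiting; the scope of the present invention is defined by the appended claims rather than by the above description, and all changes that fall within the meaning and range of equivalency of the claims are therefore intended to be embraced by the present invention. No reference sign in a claim should be construed as limiting the claim concerned. Furthermore, the word "comprising" clearly does not exclude other units or steps, and the singular does not exclude the plural. A plurality of units or means recited in a system claim may also be implemented by a single unit or means through software or hardware. Words such as "first" and "second" are used to denote names and do not denote any particular order.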
Claims (23)
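- A method for determining resident point information of a mobile user in a computer device, wherein the method comprises the following steps: a. acquiring a plurality of pieces of space-time point information of the mobile user, wherein the space-time point information indicates a spatial location of the mobile user and the time point information corresponding to when the mobile user is located at that spatial location; b. performing cluster analysis on the plurality of pieces of space-time point information based on a clustering algorithm to determine a plurality of pieces of resident point information of the mobile user.
- The method according to claim 1, wherein the clustering algorithm requires the number of cluster centers to be set.
- The method according to claim 2, wherein step b comprises the following steps: b1. for each hypothesis quantity among all or some of a plurality of hypothesis quantities, setting the number of cluster centers of the clustering algorithm to that hypothesis quantity, performing cluster analysis on the plurality of pieces of space-time point information based on the clustering algorithm to obtain the clustering result corresponding to that hypothesis quantity, and selecting one hypothesis quantity according to the plurality of clustering results respectively corresponding to the hypothesis quantities; b2. determining the plurality of pieces of resident point information of the mobile user according to the clustering result corresponding to the selected hypothesis quantity.
- The method according to claim 3, wherein step b1 comprises the following steps: b111. for one hypothesis quantity among the plurality of hypothesis quantities whose corresponding clustering result has not yet been determined, setting the number of cluster centers of the clustering algorithm to that hypothesis quantity, and performing cluster analysis on the plurality of pieces of space-time point information based on the clustering algorithm to obtain the clustering result corresponding to that hypothesis quantity; b112. when the clustering result corresponding to that hypothesis quantity satisfies a first predetermined condition, taking that hypothesis quantity as the selected hypothesis quantity; b113. when the clustering result corresponding to that hypothesis quantity does not satisfy the first predetermined condition, repeating step b111.
- The method according to claim 3, wherein the plurality of hypothesis quantities are in increasing or decreasing order, and step b1 comprises the following steps: b121. taking one of the plurality of hypothesis quantities as the current hypothesis quantity, setting the number of cluster centers of the clustering algorithm to the current hypothesis quantity, and performing cluster analysis on the plurality of pieces of space-time point information based on the clustering algorithm to obtain the clustering result corresponding to the current hypothesis quantity; b122. setting the number of cluster centers of the clustering algorithm to the hypothesis quantity that follows the current hypothesis quantity, and performing cluster analysis on the plurality of pieces of space-time point information based on the clustering algorithm to obtain the clustering result corresponding to that next hypothesis quantity; b123. when the clustering result corresponding to the next hypothesis quantity is worse than the clustering result corresponding to the current hypothesis quantity, taking the current hypothesis quantity as the selected hypothesis quantity; b124. when the clustering result corresponding to the next hypothesis quantity is better than the clustering result corresponding to the current hypothesis quantity, taking the next hypothesis quantity as the current hypothesis quantity and repeating step b122.
- The method according to claim 3, wherein step b1 comprises the following steps: - for each of the plurality of hypothesis quantities, setting the number of cluster centers of the clustering algorithm to that hypothesis quantity, and performing cluster analysis on the plurality of pieces of space-time point information based on the clustering algorithm to obtain the clustering result corresponding to that hypothesis quantity; - selecting one hypothesis quantity according to the plurality of clustering results respectively corresponding to the plurality of hypothesis quantities.
- The method according to any one of claims 3 to 6, wherein one hypothesis quantity is selected according to the plurality of clustering results respectively corresponding to the plurality of hypothesis quantities based on at least one of the following: - the number of pieces of space-time point information contained in the classes of the clustering result corresponding to a hypothesis quantity; - the dispersion of the classes of the clustering result corresponding to a hypothesis quantity.
- The method according to any one of claims 1 to 7, wherein the method further comprises the following step: x. determining, according to the plurality of pieces of resident point information, the type of each piece of resident point information among the plurality of pieces of resident point information.
- The method according to claim 8, wherein step x comprises the following steps performed for each of the plurality of pieces of resident point information: - acquiring location attribute information and time attribute information of the resident point information; - determining the type of the resident point information according to the location attribute information and the time attribute information.
- The method according to any one of claims 1 to 9, wherein the space-time point information is a four-dimensional space vector.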
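- An apparatus for determining resident point information of a mobile user in a computer device, wherein the apparatus comprises the following means: means for acquiring a plurality of pieces of space-time point information of the mobile user, wherein the space-time point information indicates a spatial location of the mobile user and the time point information corresponding to when the mobile user is located at that spatial location; means for performing cluster analysis on the plurality of pieces of space-time point information based on a clustering algorithm to determine a plurality of pieces of resident point information of the mobile user.
- The apparatus according to claim 11, wherein the clustering algorithm requires the number of cluster centers to be set.
- The apparatus according to claim 12, wherein the means for determining the plurality of pieces of resident point information comprises the following means: means for, for each hypothesis quantity among all or some of a plurality of hypothesis quantities, setting the number of cluster centers of the clustering algorithm to that hypothesis quantity, performing cluster analysis on the plurality of pieces of space-time point information based on the clustering algorithm to obtain the clustering result corresponding to that hypothesis quantity, and selecting one hypothesis quantity according to the plurality of clustering results respectively corresponding to the hypothesis quantities; means for determining the plurality of pieces of resident point information of the mobile user according to the clustering result corresponding to the selected hypothesis quantity.
- The apparatus according to claim 13, wherein the means for selecting one hypothesis quantity comprises the following means: means for, for one hypothesis quantity among the plurality of hypothesis quantities whose corresponding clustering result has not yet been determined, setting the number of cluster centers of the clustering algorithm to that hypothesis quantity, and performing cluster analysis on the plurality of pieces of space-time point information based on the clustering algorithm to obtain the clustering result corresponding to that hypothesis quantity; means for taking that hypothesis quantity as the selected hypothesis quantity when the clustering result corresponding to that hypothesis quantity satisfies a first predetermined condition; means for, when the clustering result corresponding to that hypothesis quantity does not satisfy the first predetermined condition, triggering the means for obtaining, for one hypothesis quantity among the plurality of hypothesis quantities whose corresponding clustering result has not yet been determined, the clustering result corresponding to that hypothesis quantity to repeat its operation.
- The apparatus according to claim 13, wherein the plurality of hypothesis quantities are in increasing or decreasing order, and the means for selecting one hypothesis quantity comprises the following means: means for taking one of the plurality of hypothesis quantities as the current hypothesis quantity, setting the number of cluster centers of the clustering algorithm to the current hypothesis quantity, and performing cluster analysis on the plurality of pieces of space-time point information based on the clustering algorithm to obtain the clustering result corresponding to the current hypothesis quantity; means for setting the number of cluster centers of the clustering algorithm to the hypothesis quantity that follows the current hypothesis quantity, and performing cluster analysis on the plurality of pieces of space-time point information based on the clustering algorithm to obtain the clustering result corresponding to that next hypothesis quantity; means for taking the current hypothesis quantity as the selected hypothesis quantity when the clustering result corresponding to the next hypothesis quantity is worse than the clustering result corresponding to the current hypothesis quantity; means for, when the clustering result corresponding to the next hypothesis quantity is better than the clustering result corresponding to the current hypothesis quantity, taking the next hypothesis quantity as the current hypothesis quantity and triggering the means for obtaining the clustering result corresponding to the next hypothesis quantity to repeat its operation.
- The apparatus according to claim 13, wherein the means for selecting one hypothesis quantity comprises the following means: means for, for each of the plurality of hypothesis quantities, setting the number of cluster centers of the clustering algorithm to that hypothesis quantity, and performing cluster analysis on the plurality of pieces of space-time point information based on the clustering algorithm to obtain the clustering result corresponding to that hypothesis quantity; means for selecting one hypothesis quantity according to the plurality of clustering results respectively corresponding to the plurality of hypothesis quantities.
- The apparatus according to any one of claims 13 to 16, wherein one hypothesis quantity is selected according to the plurality of clustering results respectively corresponding to the plurality of hypothesis quantities based on at least one of the following: - the number of pieces of space-time point information contained in the classes of the clustering result corresponding to a hypothesis quantity; - the dispersion of the classes of the clustering result corresponding to a hypothesis quantity.
- The apparatus according to any one of claims 11 to 17, wherein the apparatus further comprises the following means: means for determining, according to the plurality of pieces of resident point information, the type of each piece of resident point information among the plurality of pieces of resident point information.
- The apparatus according to claim 18, wherein the means for determining the type of the resident point information comprises the following means, which operate on each of the plurality of pieces of resident point information: means for acquiring location attribute information and time attribute information of the resident point information; means for determining the type of the resident point information according to the location attribute information and the time attribute information.
- The apparatus according to any one of claims 11 to 19, wherein the space-time point information is a four-dimensional space vector.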
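- A non-volatile computer storage medium storing computer instructions, wherein, when the computer instructions are executed, the method according to any one of claims 1 to 10 is performed.
- A computer program product, wherein, when the computer program product is run, the method according to any one of claims 1 to 10 is performed.
- A computer device, comprising: one or more processors; a memory; and one or more programs stored in the memory, wherein, when the one or more programs are executed by the one or more processors, the method according to any one of claims 1 to 10 is performed.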
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201410443562.3 | 2014-09-02 | ||
CN201410443562.3A CN104252527B (zh) | 2014-09-02 | 2014-09-02 | 一种确定移动用户的常驻点信息的方法和装置 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2016033901A1 (zh) | 2016-03-10 |
Family
ID=52187417
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2014/093759 WO2016033901A1 (zh) | 2014-09-02 | 2014-12-12 | 一种确定移动用户的常驻点信息的方法和装置 |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN104252527B (zh) |
WO (1) | WO2016033901A1 (zh) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111797181A (zh) * | 2020-05-26 | 2020-10-20 | 北京城市象限科技有限公司 | 用户职住地的定位方法、装置、控制设备及存储介质 |
CN117992560A (zh) * | 2024-01-08 | 2024-05-07 | 国网湖北省电力有限公司电力科学研究院 | 一种基于poi信息的电动汽车常驻区域生成方法和系统 |
Families Citing this family (25)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105847310A (zh) * | 2015-01-13 | 2016-08-10 | 中国移动通信集团江苏有限公司 | 一种确定位置的方法及装置 |
CN106153031B (zh) * | 2015-04-13 | 2019-08-30 | 骑记(厦门)科技有限公司 | 运动轨迹表示方法和装置 |
CN104765873B (zh) * | 2015-04-24 | 2019-03-26 | 百度在线网络技术(北京)有限公司 | 用户相似度确定方法和装置 |
CN106294485B (zh) * | 2015-06-05 | 2019-11-01 | 华为技术有限公司 | 确定显著地点的方法及装置 |
CN105307121B (zh) * | 2015-10-16 | 2019-03-26 | 上海晶赞科技发展有限公司 | 一种信息处理方法及装置 |
CN105843943B (zh) * | 2016-04-08 | 2019-03-01 | 深圳广联赛讯有限公司 | 车辆常驻地分析方法 |
CN106897331B (zh) * | 2016-06-07 | 2020-09-11 | 阿里巴巴集团控股有限公司 | 用户关键位置数据获取方法及装置 |
CN106202236A (zh) * | 2016-06-28 | 2016-12-07 | 联想(北京)有限公司 | 一种用户位置预测方法及装置 |
CN106127487A (zh) * | 2016-08-26 | 2016-11-16 | 成都市硕达科技股份有限公司 | 一种防盗刷安全守护系统及其使用方法 |
CN110832918B (zh) | 2017-06-30 | 2021-07-09 | Oppo广东移动通信有限公司 | 用户位置识别方法、装置、存储介质及电子设备 |
CN107515890A (zh) * | 2017-07-04 | 2017-12-26 | 深圳市金立通信设备有限公司 | 一种识别常驻点的方法及终端 |
CN109428929A (zh) * | 2017-08-31 | 2019-03-05 | 阿里巴巴集团控股有限公司 | 目标对象的位置信息的确定方法、服务器及用户客户端 |
CN109543926B (zh) * | 2017-09-21 | 2023-05-02 | 阿里巴巴集团控股有限公司 | 一种任务核销方法、移动终端及服务器 |
CN108174350B (zh) * | 2017-11-30 | 2020-12-11 | 北京三快在线科技有限公司 | 一种定位方法和装置 |
CN108122012B (zh) * | 2017-12-28 | 2020-11-24 | 百度在线网络技术(北京)有限公司 | 常驻点中心点的确定方法、装置、设备及存储介质 |
CN108650632B (zh) * | 2018-04-28 | 2020-05-26 | 广州市交通规划研究院 | 一种基于职住对应关系和时空间核聚类的驻点判断方法 |
CN109672715A (zh) * | 2018-09-13 | 2019-04-23 | 深圳壹账通智能科技有限公司 | 用户常驻地判断方法、装置、设备及计算机可读存储介质 |
CN109934265B (zh) * | 2019-02-15 | 2021-06-11 | 同盾控股有限公司 | 一种常驻地址的确定方法和装置 |
CN111861526B (zh) * | 2019-04-30 | 2024-05-21 | 京东城市(南京)科技有限公司 | 一种分析对象来源的方法和装置 |
CN112218230B (zh) * | 2019-06-24 | 2023-03-24 | 中兴通讯股份有限公司 | 用户常驻位置的获取方法、装置以及计算机可读存储介质 |
CN110493706A (zh) * | 2019-06-27 | 2019-11-22 | 中国移动通信集团海南有限公司 | 移动用户的常驻地确定方法、装置和计算机设备 |
CN112394647B (zh) * | 2019-08-19 | 2024-04-19 | 中国移动通信有限公司研究院 | 家居设备的控制方法、装置、设备及存储介质 |
CN112364907A (zh) * | 2020-11-03 | 2021-02-12 | 北京红山信息科技研究院有限公司 | 待测用户常驻地普查方法、系统、服务器和存储介质 |
CN113256405B (zh) * | 2021-06-22 | 2021-10-12 | 平安科技(深圳)有限公司 | 欺诈用户集中区域的预测方法、装置、设备及存储介质 |
CN113688197A (zh) * | 2021-08-26 | 2021-11-23 | 沈阳美行科技有限公司 | 一种常驻点标签确定方法、装置、设备及存储介质 |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102629297A (zh) * | 2012-03-06 | 2012-08-08 | 北京建筑工程学院 | 一种基于行程识别的出行者活动规律分析方法 |
CN103116696A (zh) * | 2013-01-16 | 2013-05-22 | 上海美慧软件有限公司 | 基于稀疏采样的手机定位数据的人员常驻地点识别方法 |
CN103593361A (zh) * | 2012-08-14 | 2014-02-19 | 中国科学院沈阳自动化研究所 | 感应网络环境下移动时空轨迹分析方法 |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2011046113A1 (ja) * | 2009-10-14 | 2011-04-21 | 日本電気株式会社 | 行動類型抽出システム、装置、方法、プログラムを記憶した記録媒体 |
CN103218442A (zh) * | 2013-04-22 | 2013-07-24 | 中山大学 | 一种基于移动设备传感器数据的生活模式分析方法及系统 |
-
2014
- 2014-09-02 CN CN201410443562.3A patent/CN104252527B/zh active Active
- 2014-12-12 WO PCT/CN2014/093759 patent/WO2016033901A1/zh active Application Filing
Patent Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102629297A (zh) * | 2012-03-06 | 2012-08-08 | 北京建筑工程学院 | 一种基于行程识别的出行者活动规律分析方法 |
CN103593361A (zh) * | 2012-08-14 | 2014-02-19 | 中国科学院沈阳自动化研究所 | 感应网络环境下移动时空轨迹分析方法 |
CN103116696A (zh) * | 2013-01-16 | 2013-05-22 | 上海美慧软件有限公司 | 基于稀疏采样的手机定位数据的人员常驻地点识别方法 |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111797181A (zh) * | 2020-05-26 | 2020-10-20 | 北京城市象限科技有限公司 | 用户职住地的定位方法、装置、控制设备及存储介质 |
CN111797181B (zh) * | 2020-05-26 | 2023-09-05 | 北京城市象限科技有限公司 | 用户职住地的定位方法、装置、控制设备及存储介质 |
CN117992560A (zh) * | 2024-01-08 | 2024-05-07 | 国网湖北省电力有限公司电力科学研究院 | 一种基于poi信息的电动汽车常驻区域生成方法和系统 |
Also Published As
Publication number | Publication date |
---|---|
CN104252527B (zh) | 2018-04-20 |
CN104252527A (zh) | 2014-12-31 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2016033901A1 (zh) | 一种确定移动用户的常驻点信息的方法和装置 | |
US10915830B2 (en) | Multiscale method for predictive alerting | |
CN110115015A (zh) | 通过监测其行为检测未知IoT设备的系统和方法 | |
US20190228022A1 (en) | System for detecting and characterizing seasons | |
CN108292296B (zh) | 用于利用复发性模式创建时间序列数据的时段分布图的方法 | |
US10885461B2 (en) | Unsupervised method for classifying seasonal patterns | |
WO2015188324A1 (zh) | 移动终端位置预测方法及装置 | |
US10291463B2 (en) | Large-scale distributed correlation | |
Wang et al. | Dynamic poisson autoregression for influenza-like-illness case count prediction | |
US10019190B2 (en) | Real-time abnormal change detection in graphs | |
CN109104688B (zh) | 使用聚集技术生成无线网络接入点模型 | |
US20160308725A1 (en) | Integrated Community And Role Discovery In Enterprise Networks | |
US20170364818A1 (en) | Automatic condition monitoring and anomaly detection for predictive maintenance | |
US12001926B2 (en) | Systems and methods for detecting long term seasons | |
US20170249562A1 (en) | Supervised method for classifying seasonal patterns | |
Lu et al. | Robust occupancy inference with commodity WiFi | |
US20200125471A1 (en) | Systems And Methods For Forecasting Time Series With Variable Seasonality | |
WO2018040671A1 (zh) | 活动目标群组的分类方法和电子设备 | |
EP3494525B1 (en) | Realtime busyness for places | |
Jiang et al. | Predicting human mobility based on location data modeled by Markov chains | |
CN115329265A (zh) | 图码轨迹关联度确定方法、装置、设备及存储介质 | |
Yao et al. | Integrating AI into CCTV Systems: A Comprehensive Evaluation of Smart Video Surveillance in Community Space | |
US20150154279A1 (en) | Apparatus and method for building relation model based on resource management architecture | |
Yadav et al. | Providing occupancy as a service with databox | |
Bamis et al. | Exploiting human state information to improve gps sampling |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 14901068; Country of ref document: EP; Kind code of ref document: A1 |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | 122 | Ep: pct application non-entry in european phase | Ref document number: 14901068; Country of ref document: EP; Kind code of ref document: A1 |