WO2013186552A2 - Aggregated mobility profiling - Google Patents

Aggregated mobility profiling

Info

Publication number
WO2013186552A2
Authority
WO
WIPO (PCT)
Prior art keywords
wireless communications
mobile wireless
mobility data
communications device
mobility
Prior art date
Application number
PCT/GB2013/051534
Other languages
French (fr)
Other versions
WO2013186552A3 (en
Inventor
Abdelmalik Bachir
Kin Leung
Original Assignee
Imperial Innovations Limited
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Imperial Innovations Limited
Publication of WO2013186552A2
Publication of WO2013186552A3

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W64/00 Locating users or terminals or network equipment for network management purposes, e.g. mobility management
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W4/00 Services specially adapted for wireless communication networks; Facilities therefor
    • H04W4/02 Services making use of location information
    • H04W4/029 Location-based management or tracking services

Definitions

  • The invention relates to using devices to record, analyse and summarize mobility data, and particularly relates to the use of mobile phones (or WiFi-equipped devices) and small cells (or WiFi access points) to collect and aggregate the mobility data, and to the representation of aggregated mobility profiles for groups of customers.
  • Collecting mobility data is the first operation that needs to be carried out.
  • the collected data should be statistically significant. Its volume needs to be large and it should represent all categories of customers. Therefore, the technology used for this operation is crucially important.
  • Techniques used for mobility data collection in previous research include manual following and tracking of customers' movement, attachment of RFIDs or cameras to shopping trolleys, WiFi-equipped wristbands, etc. With these technologies, the collected mobility data may be limited in size, inaccurate, biased, low resolution, invasive of individual privacy, or may not represent the mobility profile of a wide set of all possible customers.
  • Another important factor for consideration is privacy and anonymity of customers reported in the collected mobility data. It is particularly important because the collected data shows the time, locations and the actual movement of the associated customer from one location to another. Typically, customers may not prefer their mobility to be tracked for this reason.
  • a system for collecting and characterising mobility data of a plurality of mobile wireless communications devices comprises at least a first and a second detector having a respective first and second detection range. Each of the detectors is arranged to detect entry and departure of a mobile wireless
  • the system further comprises an aggregator arranged to aggregate mobility data received from the detectors.
  • a system for collecting and characterising mobility data of at least one mobile wireless communications device comprises at least a first and a second detector having a respective first and second detection range. Each of the detectors is arranged to detect entry and departure of a mobile wireless communications device into the detection range and store associated mobility data.
  • the system further comprises a processor arranged to receive mobility data from the detectors and correlate the mobility data obtained in at least two detection ranges.
  • Figure 1 shows a diagram of an exemplary system for collecting mobility data.
  • Figure 2 is a flow chart of an example interaction between a mobile wireless communications device and a base station of figure 1.
  • Figure 3 shows a diagram of an exemplary system for collecting mobility data comprising more than one base station.
  • Figure 4 is a flow chart of an example interaction between a mobile wireless communications device and the base stations of figure 3.
  • Figure 5 is a flow chart of one embodiment for analysing mobility data.
  • Figure 6 shows a diagram of an exemplary network for collecting and analysing mobility data records.
  • Figure 7a shows an example tracking area where small cells are deployed.
  • Figure 7b represents an example spatial-mobility graph of customers.
  • Figure 7c represents a matrix representation of the graph of Figure 7b.
  • the approach provides an efficient representation of both spatial and temporal mobility patterns.
  • the spatial representation of mobility is based on graph models where each node in the graph represents a cell (which is adequately covered and served by a "base station" such as a femtocell or WiFi access point) of the given service area and a directed edge connecting two nodes in the graph indicates the possible movement of a customer from one of the nodes to another.
  • the weight of an edge in the graph corresponds to the probability for an arbitrary customer moving from one cell to another.
  • the mobility graph is augmented by three sets of parameters. Two of them characterize temporal mobility, namely the distribution of the amount of time a customer spends in each cell and the correlations between these times; the third characterizes the frequency of visits to each cell. Further still, aggregation of the data allows customer behaviour analysis without the need to store potentially sensitive underlying data.
  • mobility data is collected and processed continuously. Specifically, the spatial and temporal representation of the mobility patterns are constructed and updated by an iterative process using the collected data. Once the graph model parameters are updated by processing new mobility data, the latter can be discarded permanently. Since mobility data can be deleted immediately after processing and since the model parameters only show aggregated profiling information, the proposed method ensures privacy and anonymity of individual customers and their mobility patterns. Furthermore, disposing of the collected data has another important advantage of reducing cost and risk related to excessive data hoarding.
  • the invention provides a system combining a suitable technology to collect mobility data and algorithms to process and represent a mobility profile efficiently.
  • the approach is developed for service areas deployed with small cells, such as femtocells or WiFi access points, to enable efficient collection of mobility data from a large set of potential customers.
  • Appropriate networks will be well known to the skilled reader such that a detailed description is not required here.
  • femtocells are deployed at fixed positions throughout a service area (e.g., a large shopping mall).
  • the approximate location of a given customer can be determined when the customer's mobile phone can access a particular femtocell. That is, the detection of the mobile phone by the femtocell reveals the location of the associated customer. As a result, the corresponding femtocell can record the time when the customer enters its communication range (which enables the presence detection of the mobile phone and thus the start of the association) and when he/she leaves.
  • Such timing, mobile phone and femtocell (or location) identity information forms the mobility data that can be processed by the proposed techniques to yield the aggregated mobility profile for a large group of customers. It should be clear that the collected mobility data, although representing approximate customer locations, is sufficient for a range of applications (e.g., characterization of shopper movement patterns) where location accuracy is of the order of the communication range of the small cell technology in use.
  • Another example of small-cell networks is the deployment of WiFi access points throughout a service or tracking area where the communication range of the access points is on the order of a few tens of metres, comparable to that of the femtocells.
  • the first step is to collect the mobility data.
  • Figure 1 shows an example system for collecting mobility data.
  • the example mobility data collection system comprises a base station 102 having a detection range 104, a mobile wireless communications device 106, a detection zone 108 and a non-detection zone 110.
  • the base station 102 is a wireless communications device typically installed at a fixed location.
  • the base station 102 has detection zone 108 extending from the base station 102 to a detection range 104 of the base station 102. Signals can be received by the base station 102 from devices that emit signals inside the detection zone 108.
  • An example of one such device is the mobile wireless communications device 106.
  • the mobile wireless communications device 106 is in the non-detection zone 110 which lies outside the detection zone 108. Therefore, in figure 1 signals from the mobile wireless communications device 106 cannot be detected by the base station 102.
  • the signal emitted by the mobile wireless communications device 106 and detected by the base station 102 includes an identifier.
  • the identifier serves to identify one mobile wireless communications device 106 from another, and in one embodiment may be an International Mobile Subscriber Identity (IMSI).
  • the IMSI is stored inside the mobile wireless communications device 106. Although an IMSI is specifically mentioned, the skilled person would recognise that other identifiers could be used to distinguish one mobile wireless communications device 106 from another.
  • a suitable type of base station 102 is a femtocell.
  • Femtocells have a detection range 104 that is greatly reduced compared to many other types of base station 102, and can have detection ranges 104 as small as a few metres.
  • femtocells have the capability of detecting: (a) the presence of mobile wireless communications devices 106 when they come within the detection range 104, (b) activities performed by mobile wireless communications devices 106 while they are within the detection range 104 (i.e., inside the detection zone 108) such as telephone and data calls, and (c) departures of the mobile wireless communications devices 106 from a detection zone 108 to a non-detection zone 110.
  • An example of a typical femtocell used by Alcatel-Lucent is available at the internet page addressed as www.alcatel-lucent.com.
  • FIG. 2 shows the interaction between a mobile wireless communications device 106 and a base station 102.
  • the mobile wireless communications device 106 begins at step 200, where it is outside of the detection range 104 of the base station 102 and is therefore in the non-detection zone 110. At this point, signals emitted by the mobile wireless communications device 106 are not received by the base station 102.
  • the mobile wireless communications device 106 enters the detection range 104 of the base station 102 and is therefore now in the detection zone 108 of the base station 102. At this point, signals emitted by the mobile wireless communications device 106 are received by the base station 102.
  • the base station 102 "associates" with the mobile wireless communications device 106 and records the following information: "nodeID", the identity of the base station 102; "customerID", the identifier of the mobile wireless communications device 106; and "entryTime", the time when the base station 102 received a first signal from the mobile wireless communications device 106.
  • Algorithm 1 details the algorithm corresponding to the mobile wireless communications device 106 "associating" with the base station 102:
  • Algorithm 1 When a mobile wireless communications device "associates" with a base station j, create a mobility data record R as follows:
  • R.departureTime ← infinity
  • R.field found in the algorithm above corresponds to a data field of the data record R (i.e., customer ID, node ID, entry time and departure time).
  • the mobile wireless communications device 106 moves outside of the detection range 104, and therefore moves from the detection zone 108 to the non-detection zone 110. At this point, any signals emitted by the mobile wireless communications device 106 will not be received by the base station 102 since the mobile wireless communications device 106 is outside of the detection zone 108.
  • the base station 102, now no longer receiving signals from the mobile wireless communications device 106, "disassociates" with the mobile wireless communications device 106 by recording a "departureTime" as the time when the last signal was received from the mobile wireless communications device 106 at the base station 102.
  • Algorithm 2 below details the algorithm corresponding to the mobile wireless communications device 106 "disassociating" with the base station 102:
  • Algorithm 2 When a mobile wireless communications device disassociates with a base station j, perform the following:
  • the data record R comprising the "nodeID", the "customerID", the "entryTime" and the "departureTime" is stored in a memory, either at the base station 102 or at an external location to which the base station 102 has communicated the information.
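The association and disassociation steps above can be sketched as follows. This is a minimal illustration rather than the patent's own listing: the class name `MobilityRecord`, the function names and the example identifier are assumptions, and only the four record fields and the use of infinity for an open departure time come from the text.

```python
import math
from dataclasses import dataclass


@dataclass
class MobilityRecord:
    # Field names mirror the record R described in the text.
    customer_id: str
    node_id: int
    entry_time: float
    departure_time: float = math.inf  # stays open until disassociation


def associate(customer_id, node_id, now):
    """Sketch of Algorithm 1: create record R when a device associates with node j."""
    return MobilityRecord(customer_id, node_id, entry_time=now)


def disassociate(record, last_signal_time):
    """Sketch of Algorithm 2: close R with the time the last signal was received."""
    record.departure_time = last_signal_time
    return record


# Hypothetical device identifier and timestamps (seconds).
r = associate("imsi-001", node_id=3, now=100.0)
disassociate(r, last_signal_time=160.0)
print(r.departure_time - r.entry_time)  # time spent in the cell
```

Keeping the departure time at infinity until disassociation makes an open record easy to distinguish from a completed one.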
  • the system comprises first and second base stations (302a, 302b), including first and second detection ranges (304a, 304b) and first and second detection zones (308a, 308b).
  • the system further comprises a mobile wireless communications device 306, a non-detection zone 310 and an overlap zone 300.
  • Figure 4 shows a flow diagram of an example system comprising first and second base stations (302a, 302b) with respective first and second detection ranges 304a and 304b.
  • the mobile wireless communications device 306 may begin in the non-detection zone 310, located outside of the first and second detection ranges 304a and 304b of first and second base stations 302a and 302b.
  • the cells are distributed to cover the entire service area to allow
  • When the mobile wireless communications device 306 enters one of the detection ranges, for example the first detection range 304a of the first base station 302a at step 402, it is located in the first detection zone 308a. At this point, signals emitted by the mobile wireless communications device 306 are received by the first base station 302a. As with step 204 of figure 2, at step 404 the first base station 302a "associates" with the mobile wireless communications device 306 and records the "nodeID" of the first base station 302a, the "customerID" of the mobile wireless communications device 306 and the "entryTime".
  • the mobile wireless communications device 306 may enter the second detection range 304b of the second base station 302b, and therefore may simultaneously be in the first detection zone 308a and the second detection zone 308b, as shown at step 406 of figure 4. This point is referred to in figure 3 as the overlap zone 300. While in the overlap zone, the mobile wireless communications device 306 associates with the base station which is receiving the strongest signal from the mobile wireless communications device 306.
  • the mobile wireless communications device 306 continues to be associated with the first base station 302a, as shown at step 408.
  • in the case where a simple transition to a new detection zone takes place or where the strongest received signal is at another base station, for example the second base station 302b, the first base station 302a "disassociates" with the mobile wireless communications device 306 by recording a "departureTime" associated with the first base station 302a.
  • the second base station 302b "associates" with the mobile wireless communications device 306 and records the "nodeID" of the second base station 302b, the "customerID" of the mobile wireless communications device 306 and the "entryTime".
  • the process of "disassociation" of the current base station and subsequent “association” of a new base station with the mobile wireless communications device 306 repeats if the new base station receives a stronger signal from the mobile wireless
  • the "nodeID" for that base station, the "customerID", the "entryTime" and the "departureTime" are stored in a memory, either at the base station or at an external location to which the base station has communicated the information.
  • Each base station may have its own memory, or they may all link to a central memory.
  • the data record R is populated with data as a mobile wireless communications device moves around a network of base stations.
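The association rule for overlapping detection zones can be sketched as below. This is an illustrative simplification under stated assumptions: the function names, the dictionary-based record and the signal values are hypothetical, and the departure time is taken as the time of the reading at which the change is noticed, whereas the text records the time of the last received signal.

```python
import math


def strongest_station(signal_by_station):
    """Return the id of the station receiving the strongest signal, or None."""
    if not signal_by_station:
        return None
    return max(signal_by_station, key=signal_by_station.get)


def handover(current, open_record, signal_by_station, customer_id, now, store):
    """Apply the association rule once; return (current station, open record)."""
    best = strongest_station(signal_by_station)
    if best == current:
        return current, open_record          # stay associated
    if open_record is not None:              # "disassociate" from the old station
        open_record["departureTime"] = now
        store.append(open_record)
    if best is None:                         # device is in the non-detection zone
        return None, None
    new_record = {"customerID": customer_id, "nodeID": best,
                  "entryTime": now, "departureTime": math.inf}
    return best, new_record                  # "associate" with the new station


# Hypothetical received signal strengths (dBm) at three instants.
store = []
cur, rec = handover(None, None, {"302a": -60.0}, "c1", 0.0, store)
cur, rec = handover(cur, rec, {"302a": -65.0, "302b": -70.0}, "c1", 5.0, store)
cur, rec = handover(cur, rec, {"302a": -80.0, "302b": -55.0}, "c1", 9.0, store)
print(cur, len(store))
```

In the second reading the device is in the overlap zone but stays with the first station because that station still receives the stronger signal.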
  • the base stations are femtocells or WiFi access points
  • the location of each mobile wireless communications device is known to an accuracy of a few to a couple of tens of metres.
  • the fact that a mobile wireless communications device can be "associated" with a particular femtocell or access point is enough to determine the location of the mobile wireless communications device. Therefore, this method allows the location, time spent at said location, and information on transitions between cells of a mobile wireless communications device to be known as it moves through a network of femtocells and/or access points.

Analysing the Collected Mobility Data

  • the method captures the temporal characteristics of mobility by providing the probability distributions of sojourn times for each node of the graph where the sojourn time is the time duration in which an arbitrary customer stays associated with a given node.
  • the technique provides the correlation coefficient between the sojourn times of an arbitrary customer in any given pair of nodes in the graph.
  • the approach provides the relative frequency (between 0 and 1) of visits made by an arbitrary customer to a given node of the graph compared to the other nodes of the graph.
  • the graph model is established for a given deployment of small cells (which are also referred to as small base stations such as femtocells and/or WiFi access points) in a service area.
  • the service area is also referred to as the tracking area.
  • New nodes are added to the graph as additional cells are deployed.
  • the proposed method collects the mobility data as customers move from cell to cell.
  • the data is processed and aggregated into the edge weights (i.e., branching probabilities) in the graph model, the sojourn-time
  • the model parameters including edge weights, sojourn-time distributions and the correlation coefficients matrix elements, as well as the relative frequency distribution, which represent the aggregated mobility profile can be obtained and revised based on actual customer mobility data.
  • the latter can be deleted permanently from the system as a means to ensure privacy and anonymity of customers.
  • disposing of the collected mobility data has another important advantage of reducing costs and risks related to excessive data hoarding.
  • Figure 5 provides a summary flow diagram of one embodiment for analysing the data.
  • the communications device enters a tracking area containing at least one cell capable of "associating" with the first mobile wireless communications device when the first mobile wireless communications device enters the detection range of a cell.
  • each time the first mobile wireless communications device "associates" and subsequently "disassociates" with a cell, the mobility data for that cell ("nodeID", "entryTime" and "departureTime") and the "customerID" of the first mobile wireless communications device are sent to a database which may be located on a processing unit.
  • a transition between two cells for a given device having a customer ID can be identified by sequentially tracking and correlating departure time from a first cell with the temporally closest entry time to another cell in one simple implementation - even if there is spatial/temporal discontinuity between cells.
  • a specific delay threshold e.g., several seconds
  • the delay threshold for a given pair of neighbouring cells can be determined and calibrated based on actual measurements from the deployed network.
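The simple implementation described above, pairing each departure with the temporally closest later entry of the same device, might look like the sketch below. The function name, the dictionary-based record layout and the single global `delay_threshold` are assumptions; the text allows a threshold per pair of neighbouring cells, and it does not say what happens when the gap exceeds the threshold, so this sketch simply does not count such a pair as a direct transition.

```python
def find_transitions(records, delay_threshold):
    """Return (customerID, from_node, to_node) for each detected transition.

    records: iterable of dicts with keys customerID, nodeID, entryTime,
    departureTime (times in seconds).
    """
    by_customer = {}
    for r in records:
        by_customer.setdefault(r["customerID"], []).append(r)

    transitions = []
    for customer, recs in by_customer.items():
        # After sorting by entry time, the next record holds the temporally
        # closest entry following each departure.
        recs.sort(key=lambda r: r["entryTime"])
        for prev, nxt in zip(recs, recs[1:]):
            gap = nxt["entryTime"] - prev["departureTime"]
            if gap <= delay_threshold:   # assumption: larger gaps are ignored
                transitions.append((customer, prev["nodeID"], nxt["nodeID"]))
    return transitions


# Hypothetical records for one device.
records = [
    {"customerID": "c1", "nodeID": 1, "entryTime": 0.0, "departureTime": 10.0},
    {"customerID": "c1", "nodeID": 2, "entryTime": 11.0, "departureTime": 20.0},
    {"customerID": "c1", "nodeID": 5, "entryTime": 100.0, "departureTime": 110.0},
]
print(find_transitions(records, delay_threshold=5.0))
```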
  • the processing unit containing the database updates a field in the database corresponding to the "customerID" of the first mobile wireless communications device with the mobility data.
  • a second mobile wireless communications device enters the same tracking area containing at least one cell capable of "associating" with the second mobile wireless communications device when the second mobile wireless communications device enters the detection range of a cell.
  • at step 508, each time the second mobile wireless communications device "associates" and subsequently "disassociates" with a cell, the mobility data for that cell ("nodeID", "entryTime" and "departureTime") and the "customerID" of the second mobile wireless communications device are sent to a database which may be located on a processing unit.
  • the processing unit containing the database updates a field in the database corresponding to the "customerID" of the second mobile wireless communications device with the mobility data.
  • an analyser analyses the database fields corresponding to the "customerIDs" of the first and second mobile wireless communications devices.
  • the analysis includes determining probabilities, correlation coefficients or any other parameter.
  • one parameter is the "spatial mobility", i.e. the movement from one cell to another. This can be represented by a directed graph in which each node corresponds to one cell and an edge points from node A to another node B if a mobile wireless communications device can possibly move from A to B directly (i.e., cells A and B are expected to be neighbours of each other).
  • a mobile wireless communications device is considered to be located within the detection range of a cell (node) if the mobile wireless communications device is associated with that cell.
  • the data is updated for each transition to increase the probability value for the particular transition and decrease that for all other transitions, as set out below. This provides a simple and automated manner of continually or dynamically updating the data as more data is added.
  • Figure 7a shows an example tracking area comprising a plurality of cells 1 to 12.
  • Figure 7b shows an example spatial-mobility graph corresponding to the tracking area of figure 7a.
  • the spatial mobility of a mobile wireless communications device in the tracking area of figure 7a is modelled by a graph $G(V \cup W, E)$, where $V \cup W$ is the set of nodes, including the nodes 1 to 12.
  • Each node 1 to 12 in V represents a cell deployed in the tracking area of figure 7a.
  • the set W has two fictitious nodes a and b, which are used to represent the "beginning" and the "end" of the journey of customers; node a is called source and node b is called sink.
  • node a is represented by label 0 and node b is represented by label 13.
  • the journey of a given customer is deemed to have started when its presence is detected by any of the cells in the tracking area. Also, the journey of a given customer is deemed to have ended when no cell in the tracking area can detect the presence of that customer for certain amount of time.
  • Each edge e(i, j) in E shows the possibility of a customer moving directly from node i to node j.
  • Each edge e(i, j) has a weight between 0 and 1 that corresponds to the probability of an arbitrary mobile wireless communications device in node i moving to node j as its next movement.
  • When the node i is the fictitious node a, the edge (i, j) represents the probability that the journey of the customer starts at node j. Also, when the node j is the fictitious node b, the edge (i, j) represents the probability that the journey of the customer ends right after visiting node i.
  • the graph G is directional because the probability of moving from node i to node j is not necessarily equal to that of moving from node j to node i.
  • $K = |V|$.
  • Each element $m_{ij}$ is the probability for an arbitrary mobile wireless communications device in node i to move directly to node j.
  • node 0 is the fictitious cell (node) representing the source
  • node K+1 is the fictitious cell (node) representing the sink.
  • the entry of a customer into the sink node is regarded as the end of the customer's journey in the tracking area (service area).
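The role of the fictitious source and sink nodes can be illustrated with a short sketch that turns the ordered list of cells visited in one journey into the directed edges it traverses. The function name and the example journey are hypothetical; the labelling (source 0, sink K+1) follows the text.

```python
def journey_edges(visited_nodes, num_cells):
    """Return the directed edges (i, j) traversed in one journey.

    Node 0 is the fictitious source and node num_cells + 1 the fictitious sink.
    """
    source, sink = 0, num_cells + 1
    path = [source, *visited_nodes, sink]
    return list(zip(path, path[1:]))


# Hypothetical journey through the 12-cell tracking area of figure 7a.
print(journey_edges([1, 2, 6, 12], num_cells=12))
# [(0, 1), (1, 2), (2, 6), (6, 12), (12, 13)]
```

Each returned edge is one observation that can be fed to the update of the corresponding branching probability.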
  • the matrix M is stochastic, i.e., each element $m_{ij}$ lies between 0 and 1 and each row sum is equal to 1.
  • the matrix M is initialized in a way that ensures that all transitions between
  • Although figure 7b shows only edges from the source node to nodes deployed at the entrance of the tracking area, edges are also possible from the source node to any other node in the deployment area. This is to capture the fact that a given customer may not have been detected by nodes deployed at the entrance of the tracking area, for example because the customer had his/her mobile wireless communications device switched off at the time he/she entered the tracking area.
  • Although figure 7b shows only edges from the nodes deployed at the exit of the tracking area to the sink node, edges are also possible from any other node in the tracking area to the sink node. This is to capture the fact that a customer may not have been detected by nodes deployed at the exit of the tracking area.
  • Equation (1) uses exponential smoothing to iteratively update the values of the branching probabilities.
  • the value of $\alpha$ lies between 0 and 1, and is generally taken to be larger than 0.9.
  • the most appropriate value of $\alpha$ for a given tracking area can be determined experimentally.
  • all $m_{ii}$ values are set to 0 to represent the mobility from a node to itself.
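Since Equation (1) itself is not legible in this copy, the following is only a sketch of one standard exponential-smoothing update that matches the described behaviour: the probability of the observed transition is increased and those of all other transitions out of the same node are decreased, so that each row keeps summing to 1. The function name, the parameter name `alpha` for the smoothing parameter and the toy matrix are assumptions.

```python
def update_branching(m, i, j, alpha=0.95):
    """Exponentially smooth row i of matrix m after observing a move i -> j.

    Assumed form: m[i][k] <- alpha * m[i][k] for every k, then
    m[i][j] <- m[i][j] + (1 - alpha). Row sums are preserved.
    """
    for k in range(len(m[i])):
        m[i][k] *= alpha          # decay every transition out of node i
    m[i][j] += 1.0 - alpha        # reinforce the observed transition


# Toy 3-node matrix with hypothetical initial values (rows sum to 1).
m = [[0.0, 0.5, 0.5],
     [0.5, 0.0, 0.5],
     [0.5, 0.5, 0.0]]
update_branching(m, 0, 1, alpha=0.9)
print(m[0], sum(m[0]))
```

With a smoothing parameter close to 1, a single observation moves the row only slightly, which is consistent with the stated preference for values above 0.9.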
  • node means any node in the set V.
  • the distribution of the relative frequency of visits to each node is defined as the relative percentage of the number of associations/disassociations made by a wireless communication device with a given node compared to those made with the other nodes.
  • Let $F = (f_1, \ldots, f_K)$ be a vector that describes the relative frequency of visits to all nodes.
  • $F^{(n)}$ is the value of $F$ after processing n mobility data records.
  • R mobility data record
  • $f_k^{(n+1)} \leftarrow \beta f_k^{(n)} + 1 - \beta$, where $\beta$ is properly chosen between 0 and 1; else
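The relative-frequency update can be sketched as follows. The branch for the visited node follows the update shown above; the branch for all other nodes is cut off in this copy, so the sketch assumes the complementary smoothing step (multiplying by the same parameter), which keeps the frequencies summing to 1. The function name and the parameter name `beta` are hypothetical.

```python
def update_visit_frequency(f, node_id, beta=0.95):
    """Update the relative-frequency vector f (dict: node -> value) in place.

    Visited node:  f_k <- beta * f_k + 1 - beta
    Other nodes:   f_k <- beta * f_k        (assumed "else" branch)
    """
    for k in f:
        f[k] *= beta
    f[node_id] += 1.0 - beta


# Hypothetical starting frequencies for three cells.
f = {1: 0.5, 2: 0.3, 3: 0.2}
update_visit_frequency(f, node_id=3, beta=0.9)
print(f, sum(f.values()))
```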
  • the sojourn time is defined as the amount of time a mobile wireless communications device spends in any given node.
  • the probability distribution characterizing the sojourn time is chosen as a way to limit the number of model parameters in the representation of mobile wireless communications device mobility and to enable the deletion of mobility data after it is processed and aggregated into the model parameters.
  • the deletion of raw data from the system after processing can help preserve the privacy and anonymity of customers using mobile wireless
  • Let Umax be the maximum amount of time a mobile wireless communications device may spend at a particular node.
  • Let $P_j(l)$ be the probability that the sojourn time of an arbitrary mobile wireless communications device in node j is less than or equal to $l$, for each node j from
  • the values of $P_j(l)$ for all $l$ represent an approximate cumulative distribution function (CDF) for the sojourn time of a specific mobile wireless communications device in node j.
  • When the method receives a mobility data record associated with node j, it invokes the following Algorithm 4 to update $P_j(l)$ for all time intervals $l$.
  • the algorithm is run at the analyser each time the database is updated with a new mobility data record.
  • the updating of $P_j$ is performed iteratively and $P_j^{(n)}(l)$ is the value of $P_j(l)$ after processing n mobility data records for node j.
  • R mobility data record
  • $\lceil R.departureTime - R.entryTime \rceil$, where $\lceil x \rceil$ is the smallest integer that is larger than or equal to x.
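The body of Algorithm 4 is not legible in this copy, so the sketch below is an assumption about how an approximate sojourn-time CDF could be updated by exponential smoothing, not the patent's listing. It rounds the sojourn time up to an integer number of time intervals, as the ceiling operation in the text suggests, and then smooths each CDF value towards 1 or 0 depending on whether the visit lasted at most that many intervals. The function name, the list layout and the parameter name `gamma` are hypothetical.

```python
import math


def update_sojourn_cdf(p_j, entry_time, departure_time, gamma=0.95):
    """Update p_j in place, where p_j[l - 1] approximates P_j(l) for l = 1..len(p_j).

    Assumed rule: P_j(l) <- gamma * P_j(l) + (1 - gamma) * [s <= l],
    with s the sojourn time rounded up to a whole number of intervals.
    """
    s = math.ceil(departure_time - entry_time)
    for idx in range(len(p_j)):
        l = idx + 1
        indicator = 1.0 if s <= l else 0.0   # did this visit last at most l?
        p_j[idx] = gamma * p_j[idx] + (1.0 - gamma) * indicator


# Hypothetical CDF over four time intervals and one new record.
p = [0.0, 0.0, 0.0, 1.0]
update_sojourn_cdf(p, entry_time=0.0, departure_time=2.3, gamma=0.5)
print(p)  # [0.0, 0.0, 0.5, 1.0]
```

Because the indicator is non-decreasing in the interval index, a CDF that starts non-decreasing stays non-decreasing after every update, and the record itself is not needed afterwards.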
  • R(i) denotes the set of mobility data records of mobile wireless communications device i during its journey.
  • $R_N$ represents the mobility data record created and completed after a mobile wireless communications device disassociates with the last node and leaves the tracking (service) area.
  • the mobile wireless communications device is considered to have left the tracking area if no mobility record is created for it for a certain amount of time. This time also takes into account mobile wireless communications devices which are no longer detectable due to other causes, such as being switched off by customers, battery shortage, etc.
  • the index for the notation has been dropped for brevity.
  • the sojourn times of the mobile wireless communications device per visit to a node can be directly obtained from the set of records R by subtracting the value of the entryTime field from the value of the departureTime field of each mobility data record.
  • $S = \{S_1, \ldots, S_v, \ldots, S_N\}$
  • the calculation of the correlation coefficients involves the calculation of the covariance $\sigma_{jk}$ between the sojourn times of an arbitrary mobile wireless
  • ⁇ ( ⁇ +1) ⁇ ⁇ ( ⁇ ) + (] _ ⁇ ) _ ⁇ ( ⁇ +1) j (3 )
  • is a properly chosen parameter between 0 and 1 and
  • denotes the absolute
  • Equation (3) shows that the updating of $\sigma_j$ is performed iteratively and $\sigma_j^{(n)}$ is the value of $\sigma_j$ after processing n mobility data records in which the nodeID field is equal to j.
  • the sojourn times $S_v$ and $S_w$ extracted from $R_v$ and $R_w$ respectively are used to update the covariance $\sigma_{jk}$ between the sojourn times at nodes j and k, according to the following equation:
  • Equation (4) shows that the updating of $\sigma_{jk}$ is performed iteratively and $\sigma_{jk}^{(n)}$ is the value of $\sigma_{jk}$ after processing n pairs of mobility data records $(R_v, R_w)$ in which
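Equations (3) and (4) are not legible in this copy, so the following is not a reconstruction of them. It is a generic sketch of how a covariance and a correlation coefficient between sojourn times at two nodes can be maintained iteratively with exponential smoothing, so that each pair of records can be discarded after it is processed. The class name, the attribute names, the smoothing parameter `gamma` and the zero initial values are all assumptions.

```python
class PairStats:
    """Exponentially smoothed statistics for sojourn times at nodes j and k."""

    def __init__(self, gamma=0.95):
        self.gamma = gamma
        self.mean_j = self.mean_k = 0.0   # assumed starting point; biases early values
        self.var_j = self.var_k = self.cov = 0.0

    def update(self, s_j, s_k):
        """Fold one pair of sojourn times (same journey, nodes j and k) into the stats."""
        g = self.gamma
        self.mean_j = g * self.mean_j + (1 - g) * s_j
        self.mean_k = g * self.mean_k + (1 - g) * s_k
        d_j, d_k = s_j - self.mean_j, s_k - self.mean_k
        self.var_j = g * self.var_j + (1 - g) * d_j * d_j
        self.var_k = g * self.var_k + (1 - g) * d_k * d_k
        self.cov = g * self.cov + (1 - g) * d_j * d_k

    def correlation(self):
        """Correlation coefficient, or 0.0 while a variance is still zero."""
        denom = (self.var_j * self.var_k) ** 0.5
        return self.cov / denom if denom > 0 else 0.0


# Hypothetical sojourn-time pairs (seconds) from four journeys.
stats = PairStats(gamma=0.9)
for s_j, s_k in [(10.0, 25.0), (20.0, 35.0), (15.0, 40.0), (30.0, 70.0)]:
    stats.update(s_j, s_k)
print(stats.correlation())
```

Only five numbers are kept per pair of nodes, which is in line with the stated aim of storing aggregated parameters rather than individual records.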
  • the set R of mobility data records can be disposed of permanently. Hence mobility data is obtained and aggregated, allowing the identification of trends such as the correlation between residence in central locations and their respective timing, allowing customer behaviour patterns to be developed.
  • Because the set R of mobility data records can be disposed of permanently after all the pairs of mobility data have been processed, data linking individual mobile wireless communications devices to locations and times is not maintained. As a result, only the mobility of groups of mobile wireless communications devices is stored, and the privacy of individual customers of mobile wireless communications devices is therefore maintained. Furthermore, the disposal of mobility data records R after processing reduces the cost of data storage and minimises the security threats related to the stored data.
  • data can be aggregated by customer type, for example customers can be categorized based on data associated with their identification. Such data may be extracted, for example, from data available to a network provider and can allow categorization of a customer at a generic level, again without requiring privacy related data. Hence, in addition to an overall aggregated data set, customer-categorized aggregated sub-sets can also be developed and stored.
  • FIG. 6 shows an exemplary network for collecting and analysing mobility data records.
  • Femtocells 600 are located in a tracking area.
  • the cells may have a coverage that includes the entire tracking area, or may only include a portion of it.
  • the tracking area could be a supermarket, a shopping mall, or any other area where it might be desirable to track customer movement.
  • the cells 600 associate with mobile wireless communications devices 610, and
  • the mobile wireless communications devices may be mobile phones held by customers, for example. As such the described method tracks the movement of customers in a tracking area using mobile phones that they carry about their person.
  • Mobility data collected by the cells is transmitted, via the internet 604, to a processing unit which may be a gateway 606.
  • the gateway 606 collects presence information from the presence server, and creates and completes customer mobility data records. It uses these records to calculate the branching probabilities, the distributions of the sojourn times, and the correlation coefficients between the sojourn times at different cells.
  • the processing unit updates a database field with the mobility data for each mobile wireless communications device corresponding to the "customerID" and the "nodeID".
  • the gateway may also include the analyser which calculates the parameters to characterise mobility, as mentioned above.
  • the gateway may interact with the network of a service provider. In doing so, the gateway may request further information from the service provider related to a
  • the gateway may request the age of a customer using the specific mobile wireless communications device.
  • the gateway is able to calculate parameters for an individual age range of customers, for example.
  • other data related to the customer could be requested from the service provider to provide more detailed mobility parameters.
  • the disclosed approach can operate on a real-time or non-real-time basis.
  • the gateway 606 may ask, for example, the presence server 602 or other appropriate entity, to accumulate the data locally and transmit the accumulated data from time to time to reduce the overhead and frequency of interaction between the gateway and the presence server, thereby improving the overall scalability of the approach.
  • mobility data is stored in a database and each customer can be identified by a unique identity (ID) such as an IMSI, TMSI, IMEI, or MAC address.
  • the infrastructure, i.e., the deployed femtocells, makes it possible to communicate with customers. Therefore, a text message, for example, can be sent to customers to inform them about tracking and its options (anonymous or not).
  • customers can be offered the options of agreeing or disagreeing with the mobility tracking and aggregated profiling. They can then make their choices and communicate their decisions to the system.
  • mobility data of every individual customer is processed and aggregated in real time with previous mobility data, and is cleared right after that. Therefore, customers remain completely anonymous. Additionally, in this embodiment it may be the presence server instead of the gateway that communicates with a service provider to request further customer data.
  • customer data may include customer age, gender, residential area, etc. so that the data can be used to establish categories of aggregated mobility profiles for various customer types.
  • each node in the graph represents a cell, but can be extended so that each node in the graph represents a set of neighbouring cells.
  • sojourn time is the time a customer spends at a node per visit to the node
  • the definition can be extended to include repeated and separate visits to the same node during a given journey.
  • the sojourn time can also be defined as the cumulative time a customer spends at the same node in the whole journey.
  • the approach can also be extended to establish/use different graph models and
  • customer information may be needed from the service providers so that the approach can process and aggregate a given mobility data record into the corresponding graph model and distributions associated with the given customer, after which the private information is discarded and irretrievable.
  • additional cells can also be added dynamically and automatically and their values populated as the data set grows by simple extension of the data and without requiring a rewrite of the existing structure.

Abstract

A system for collecting and characterising mobility data of a plurality of mobile wireless communications devices is disclosed, the system comprising: at least a first and a second detector having a respective first and second detection range and each arranged to detect entry and departure of a mobile wireless communications device into the detection range and store associated mobility data; and an aggregator arranged to aggregate mobility data received from the detectors.

Description

Aggregated Mobility Profiling
Field
The invention relates to using devices to record, analyse and summarize mobility data, and particularly relates to the use of mobile phones (or Wifi equipped devices) and small cells (or Wifi access points) to collect and aggregate the mobility data, and the representation of aggregated mobility profiles for groups of customers.
Background
The automated collection and processing of data regarding individual behaviour presents significant technical challenges of accuracy, ease of analysis and data security. For example, understanding customer shopping behaviour is one of the tools business owners need in order to evolve and sell their products in a way that improves customer satisfaction and increases profit. In the past decades, researchers have focused on following customers, recording their behaviours, establishing models to classify them into groups and trying to figure out the motives for their purchase actions.
Among many parameters defining the behaviours of customers, mobility is the one that has attracted extensive research efforts. Study of customer mobility requires the accomplishment of two operations: (i) collecting mobility data, and (ii) analyzing the collected data to establish models that help business owners make appropriate decisions.
Collecting mobility data is the first operation that needs to be carried out. The collected data should be statistically significant. Its volume needs to be large and it should represent all categories of customers. Therefore, the technology used for this operation is crucially important. Techniques used for mobility data collection in previous research include: manual following and tracking of customers' movement, attachment of RFIDs or cameras to shopping trolleys, WiFi-equipped wristbands, etc. With these technologies, the collected mobility data may be limited in size, inaccurate, biased, low resolution, invasive of individual privacy or may not represent the mobility profile of a wide set of all possible customers.
Processing and analysing the collected mobility data to adequately characterize the aggregated mobility profile of a large number of customers is the second operation that needs to be performed. Most of the existing research in this direction has focused on using statistical techniques to extract the major mobility patterns and classifying customers accordingly. This can provide over-generalized results; conversely where customers with similar mobility patterns are classified in the same category, different mobility models have been established to represent each category. The representation of mobility has often been restricted to the path taken by customers during their shopping journeys. Although this provides some information about customer mobility, it can be inaccurate and the analysis available may be limited, and many techniques rely on trial and error to develop models.
Another important factor for consideration is the privacy and anonymity of customers reported in the collected mobility data. It is particularly important because the collected data shows the time, locations and the actual movement of the associated customer from one location to another. For this reason, customers may typically prefer that their mobility not be tracked.
Additionally, the technologies in use are often unable to collect mobility data from a wide variety of customers without disturbing their original behaviours, and require significant amounts of computation and mobility data to be saved and stored for subsequent processing when the established mobility models are updated or revised upon availability of new data. Again, storing such data can contribute to the breaching of customer privacy and anonymity, and incur excessive costs and risks related to data hoarding.
The invention is set out in the claims.
Embodiments of the invention will now be described. According to an embodiment, a system for collecting and characterising mobility data of a plurality of mobile wireless communications devices is disclosed. The system comprises at least a first and a second detector having a respective first and second detection range. Each of the detectors is arranged to detect entry and departure of a mobile wireless communications device into the detection range and store associated mobility data. The system further comprises an aggregator arranged to aggregate mobility data received from the detectors.
According to another embodiment, a system for collecting and characterising mobility data of at least one mobile wireless communications device is disclosed. The system comprises at least a first and a second detector having a respective first and second detection range. Each of the detectors is arranged to detect entry and departure of a mobile wireless communications device into the detection range and store associated mobility data. The system further comprises a processor arranged to receive mobility data from the detectors and correlate the mobility data obtained in at least two detection ranges.
Figure Listing
Figure 1 shows a diagram of an exemplary system for collecting mobility data.
Figure 2 is a flow chart of an example interaction between a mobile wireless communications device and a base station of figure 1.
Figure 3 shows a diagram of an exemplary system for collecting mobility data comprising more than one base station.
Figure 4 is a flow chart of an example interaction between a mobile wireless communications device and the base stations of figure 3.
Figure 5 is a flow chart of one embodiment for analysing mobility data.
Figure 6 shows a diagram of an exemplary network for collecting and analysing mobility data records.
Figure 7a shows an example tracking area where small cells are deployed.
Figure 7b represents an example spatial-mobility graph of customers.
Figure 7c represents a matrix representation of the graph of Figure 7b.
Detailed Description
In overview, the approach provides an efficient representation of both spatial and temporal mobility patterns. The spatial representation of mobility is based on graph models where each node in the graph represents a cell (which is adequately covered and served by a "base station" such as a femtocell or WiFi access point) of the given service area and a directed edge connecting two nodes in the graph indicates the possible movement of a customer from one of the nodes to another. The weight of an edge in the graph corresponds to the probability of an arbitrary customer moving from one cell to another. In a further development, the mobility graph is augmented by three sets of parameters. Two of them characterize temporal mobility, namely the distribution of the amount of time a customer spends in each cell and the correlations between these times, and the third characterizes the frequency of visits to each cell. Further still, aggregation of the data allows customer behaviour analysis without the need to store potentially sensitive underlying data.
As customers move through various cells of the service area, mobility data is collected and processed continuously. Specifically, the spatial and temporal representation of the mobility patterns are constructed and updated by an iterative process using the collected data. Once the graph model parameters are updated by processing new mobility data, the latter can be discarded permanently. Since mobility data can be deleted immediately after processing and since the model parameters only show aggregated profiling information, the proposed method ensures privacy and anonymity of individual customers and their mobility patterns. Furthermore, disposing of the collected data has another important advantage of reducing cost and risk related to excessive data hoarding.
To overcome the shortcomings of existing techniques, the invention provides a system combining a suitable technology to collect mobility data and algorithms to process and represent a mobility profile efficiently. In one embodiment the approach is developed for service areas deployed with small cells, such as femtocells or WiFi access points, to enable efficient collection of mobility data from a large set of potential customers. Appropriate networks will be well known to the skilled reader such that a detailed description is not required here. For example, in known networks such as those described at www.smallcellforum.org, incorporated herein by reference, femtocells are deployed at fixed positions throughout a service area (e.g., a large shopping mall). Due to the relatively short communication range of femtocells, the approximate location of a given customer can be determined when the customer's mobile phone can access a particular femtocell. That is, the detection of the mobile phone by the femtocell reveals the location of the associated customer. As a result, the corresponding femtocell can record the time when the customer enters its communication range (which enables the presence detection of the mobile phone and thus the start of the association) and when he/she leaves. Such timing, mobile phone and femtocell (or location) identity information form the mobility data that can be processed by the proposed techniques to yield the aggregated mobility profile for a large group of customers. It should be clear that the collected mobility data, although representing approximate customer locations, is sufficient for a range of applications (e.g., characterization of shopper movement patterns) where location accuracy is of the order of the communication range of the small cell technology in use.
Another example of small-cell networks is the deployment of WiFi access points throughout a service or tracking area where the communication range of the access points is on the order of a few tens of metres, comparable to that of the femtocells.
Collecting Mobility Data
Turning to one embodiment of the detailed implementation, the first step is to collect the mobility data. Figure 1 shows an example system for collecting mobility data. The example mobility data collection system comprises a base station 102 having a detection range 104, a mobile wireless communications device 106, a detection zone 108 and a non-detection zone 110. The base station 102 is a wireless communications device typically installed at a fixed location. The base station 102 has a detection zone 108 extending from the base station 102 to a detection range 104 of the base station 102. Signals can be received by the base station 102 from devices that emit signals inside the detection zone 108. An example of one such device is the mobile wireless communications device 106. In figure 1, the mobile wireless communications device 106 is in the non-detection zone 110 which lies outside the detection zone 108. Therefore, in figure 1 signals from the mobile wireless communications device 106 cannot be detected by the base station 102. The signal emitted by the mobile wireless communications device 106 and detected by the base station 102 includes an identifier. The identifier serves to distinguish one mobile wireless communications device 106 from another, and in one embodiment may be an International Mobile Subscriber Identity (IMSI). The IMSI is stored inside the mobile wireless communications device 106. Although an IMSI is specifically mentioned, the skilled person would recognise that other identifiers could be used to distinguish one mobile wireless communications device 106 from another.
As discussed above, in one embodiment a suitable type of base station 102 is a femtocell. Femtocells have a detection range 104 that is greatly reduced compared to many other types of base station 102, and can have detection ranges 104 as small as a few metres. Typically, femtocells have the capability of detecting: (a) the presence of mobile wireless communications devices 106 when they come within the detection range 104, (b) activities performed by mobile wireless communications devices 106 while they are within the detection range 104 (i.e., inside the detection zone 108) such as telephone and data calls, and (c) departures of the mobile wireless communications devices 106 from a detection zone 108 to a non-detection zone 110. An example of a typical femtocell used by Alcatel-Lucent is available at the internet page addressed as www.alcatel-lucent.com.
Figure 2 shows the interaction between a mobile wireless communications device 106 and a base station 102. The mobile wireless communications device 106 begins at step 200, where it is outside of the detection range 104 of the base station 102 and is therefore in the non-detection zone 110. At this point, signals emitted by the mobile wireless communications device 106 are not received by the base station 102.
At step 202, the mobile wireless communications device 106 enters the detection range 104 of the base station 102 and is therefore now in the detection zone 108 of the base station 102. At this point, signals emitted by the mobile wireless communications device 106 are received by the base station 102.
At step 204 the base station 102 "associates" with the mobile wireless communications device 106 and records the following information: the "nodeID", which is the identity of the base station 102; the "customerID", which is the identifier of the mobile wireless communications device 106; and the "entryTime", which is the time when the base station 102 received a first signal from the mobile wireless communications device 106.
Using this information, a weighted, directed graph can be developed. Algorithm 1 below details the algorithm corresponding to the mobile wireless communications device 106 "associating" with the base station 102:
Algorithm 1: When a mobile wireless communications device i "associates" with a base station j, create a mobility data record R as follows:
R.customerID = i
R.nodeID = j
R.entryTime = current time
R.departureTime = infinity
The R.field found in the algorithm above corresponds to a data field of the data record R (i.e., customer ID, node ID, entry time and departure time).
At step 206, the mobile wireless communications device 106 moves outside of the detection range 104, and therefore moves from the detection zone 108 to the non-detection zone 110. At this point, any signals emitted by the mobile wireless communications device 106 will not be received by the base station 102 since the mobile wireless communications device 106 is outside of the detection zone 108.
At step 208, the base station 102, now no longer receiving signals from the mobile wireless communications device 106, "disassociates" with the mobile wireless communications device 106 by recording a "departureTime" as the time when the last signal is received from the mobile wireless communications device 106 at the base station 102. Algorithm 2 below details the algorithm corresponding to the mobile wireless communications device 106 "disassociating" with the base station 102:
Algorithm 2: When a mobile wireless communications device i disassociates with a base station j, perform the following:
R = findRecord(i, j) where departureTime = infinity
R.nodeID = R.nodeID (unchanged)
R.customerID = R.customerID (unchanged)
R.entryTime = R.entryTime (unchanged)
R.departureTime = current time
The data record R comprising the "nodeID", the "customerID", the "entryTime" and the "departureTime" is stored in a memory, either at the base station 102 or at an external location to which the base station 102 has communicated the information.
Hence it will be seen that data relating to the entry and departure time for a given customer or equivalently device is collected automatically and efficiently allowing capture of behavioural data relating to sojourn time at the cell, as well as potentially linking movement with other cells.
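The association and disassociation steps of Algorithms 1 and 2 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation described in the application: the class names MobilityRecord and RecordStore, the in-memory list, and the example identifier "imsi-001" are hypothetical and introduced here only for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class MobilityRecord:
    customer_id: str
    node_id: int
    entry_time: float
    departure_time: float = math.inf  # infinity means "still associated"


class RecordStore:
    """Hypothetical in-memory store; the text only says records are kept 'in a memory'."""

    def __init__(self):
        self.records = []

    def associate(self, customer_id, node_id, now):
        # Algorithm 1: open a record whose departure time is infinity.
        record = MobilityRecord(customer_id, node_id, entry_time=now)
        self.records.append(record)
        return record

    def disassociate(self, customer_id, node_id, now):
        # Algorithm 2: find the open record for (customer, node) and close it.
        for record in self.records:
            if (record.customer_id == customer_id
                    and record.node_id == node_id
                    and math.isinf(record.departure_time)):
                record.departure_time = now
                return record
        raise LookupError("no open record for this device at this base station")


store = RecordStore()
store.associate("imsi-001", node_id=3, now=100.0)
closed = store.disassociate("imsi-001", node_id=3, now=160.0)
sojourn = closed.departure_time - closed.entry_time  # time spent at node 3
```

Once a record is closed, the difference between its departure and entry times gives the sojourn time at that cell for that visit.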
With reference to figure 3, according to another embodiment the system comprises first and second base stations (302a, 302b), including first and second detection ranges (304a, 304b) and first and second detection zones (308a, 308b). The system further comprises a mobile wireless communications device 306, a non-detection zone 310 and an overlap zone 300.
As discussed above, it is further desirable to capture data regarding transitions between cells. Figure 4 shows a flow diagram of an example system comprising first and second base stations (302a, 302b) with respective first and second detection ranges 304a and 304b. As with the system in figure 2, at step 400 the mobile wireless communications device 306 may begin in the non-detection zone 310, located outside of the first and second detection ranges 304a and 304b of first and second base stations 302a and 302b. Preferably the cells are distributed to cover the entire service area to allow establishment of an uninterrupted mobility pattern. Where there is clear delineation between detection zones, movement may be tracked by monitoring appropriate association and disassociation as set out below. However, the approach described can also provide reliable data where detection zones overlap.
When the mobile wireless communications device 306 enters one of the detection ranges, for example the first detection range 304a of the first base station 302a at step 402, it is located in the first detection zone 308a. At this point, signals emitted by the mobile wireless communications device 306 are received by the first base station 302a. As with step 204 of figure 2, at step 404 the first base station 302a "associates" with the mobile wireless communications device 306 and records the "nodeID" of the first base station 302a, the "customerID" of the mobile wireless communications device 306 and the "entryTime". As the mobile wireless communications device 306 moves around within the first detection zone 308a, it may enter the second detection range 304b of the second base station 302b, and therefore may simultaneously be in the first detection zone 308a and the second detection zone 308b, as shown at step 406 of figure 4. This region is referred to in figure 3 as the overlap zone 300. While in the overlap zone, the mobile wireless communications device 306 associates with the base station which is receiving the strongest signal from the mobile wireless communications device 306.
In the case where the strongest received signal is received at the first base station 302a, the mobile wireless communications device 306 continues to be associated with the first base station 302a, as shown at step 408. At step 410, in the case where a simple transition to a new detection zone takes place or where the strongest received signal is at another base station, for example the second base station 302b, the first base station 302a "disassociates" with the mobile wireless communications device 306 by recording a "departureTime" associated with the first base station 302a. (Note that, similar to existing cell handoffs in cellular wireless networks, elaborated algorithms such as the use of hysteresis can be used to determine the transition of a mobile device from the detection zone of one base station to a neighbouring one.) Simultaneously, the second base station 302b "associates" with the mobile wireless communications device 306 and records: the "nodeID" of the second base station 302b, the "customerID" of the mobile wireless communications device 306 and the "entryTime". Similarly, if a new transition or overlap zone is encountered, the process of "disassociation" of the current base station and subsequent "association" of a new base station with the mobile wireless communications device 306 repeats if the new base station receives a stronger signal from the mobile wireless communications device 306 than the current base station.
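The association rule for the overlap zone, including the hysteresis mentioned above, can be sketched as follows. This is a minimal illustration rather than the handoff procedure of any particular network: the function name choose_base_station, the use of signal strengths in dBm, and the 3 dB margin are assumptions introduced here.

```python
def choose_base_station(current, signal_dbm, hysteresis_db=3.0):
    """Return the base station the device should be associated with.

    current: id of the currently associated base station, or None.
    signal_dbm: dict mapping base-station id -> received signal strength (dBm).
    hysteresis_db: hypothetical margin a challenger must exceed before a handoff.
    """
    if not signal_dbm:
        return None  # no base station hears the device: non-detection zone
    strongest = max(signal_dbm, key=signal_dbm.get)
    if current is None or current not in signal_dbm:
        return strongest  # first association, or simple transition to a new zone
    if signal_dbm[strongest] >= signal_dbm[current] + hysteresis_db:
        return strongest  # challenger is clearly stronger: hand off
    return current  # otherwise stay with the current base station


stay = choose_base_station("302a", {"302a": -70.0, "302b": -69.0})
move = choose_base_station("302a", {"302a": -75.0, "302b": -60.0})
```

With the margin set to zero the rule reduces to plain "strongest signal wins"; a positive margin avoids repeated disassociation and association when two signals are nearly equal.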
With each disassociation of a base station with a mobile wireless communications device, the "nodeID" for that base station, the "customerID", the "entryTime" and the "departureTime" are stored in a memory, either at the base station or at an external location to which the base station has communicated the information. Each base station may have its own memory, or they may all link to a central memory.
It can be seen that, by using this method, the data record R is populated with data as a mobile wireless communications device moves around a network of base stations. In the case where the base stations are femtocells or WiFi access points, the location of each mobile wireless communications device is known to an accuracy of a few to a couple of tens of metres. The fact that a mobile wireless communications device can be "associated" with a particular femtocell or access point is enough to determine the location of the mobile wireless communications device. Therefore, this method allows the location, time spent at said location, and information on transitions between cells of a mobile wireless communications device to be known as it moves through a network of femtocells and/or access points.
Analysing the Collected Mobility Data
To complement the spatial mobility as reflected by the graph model, the method captures the temporal characteristics of mobility by providing the probability distributions of sojourn times for each node of the graph where the sojourn time is the time duration in which an arbitrary customer stays associated with a given node.
Furthermore, to capture the interdependence between the amount of time a customer stays in one node and that in a second node, the technique provides the correlation coefficient between the sojourn times of an arbitrary customer in any given pair of nodes in the graph. In addition, to capture the frequency of visits to each node, the approach provides the relative frequency (between 0 and 1) of visits made by an arbitrary customer to a given node of the graph compared to the other nodes of the graph.
The graph model is established for a given deployment of small cells (which are also referred to as small base stations such as femtocells and/or WiFi access points) in a service area. For the purpose of mobility profiling, the service area is also referred to as the tracking area. New nodes are added to the graph as additional cells are deployed. As discussed below, the proposed method collects the mobility data as customers move from cell to cell. The data is processed and aggregated into the edge weights (i.e., branching probabilities) in the graph model, the sojourn-time distributions and the correlation coefficients matrix for the sojourn times, as well as the relative frequency distribution of visits to each node. As a result, the model parameters including edge weights, sojourn-time distributions and the correlation coefficients matrix elements, as well as the relative frequency distribution, which represent the aggregated mobility profile, can be obtained and revised based on actual customer mobility data. However, after updating these model parameters by processing a piece of mobility data, the latter can be deleted permanently from the system as a means to ensure privacy and anonymity of customers. Furthermore, disposing of the collected mobility data has another important advantage of reducing costs and risks related to excessive data hoarding.
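The idea of updating aggregated parameters and then discarding the underlying records can be illustrated with an online estimate of the correlation between the sojourn times at two nodes. The sketch below uses a standard one-pass (Welford-style) update, which is an assumption made here for illustration and may differ from the update equations of the application; the class name RunningCovariance and the sample values are hypothetical.

```python
import math


class RunningCovariance:
    """One-pass mean, variance and covariance for pairs (x, y) of sojourn times."""

    def __init__(self):
        self.n = 0
        self.mean_x = 0.0
        self.mean_y = 0.0
        self.m2_x = 0.0  # sum of squared deviations of x
        self.m2_y = 0.0  # sum of squared deviations of y
        self.c_xy = 0.0  # sum of cross deviations

    def update(self, x, y):
        # Only these six aggregates are kept; the pair (x, y) can be discarded.
        self.n += 1
        dx = x - self.mean_x
        dy = y - self.mean_y
        self.mean_x += dx / self.n
        self.mean_y += dy / self.n
        self.m2_x += dx * (x - self.mean_x)
        self.m2_y += dy * (y - self.mean_y)
        self.c_xy += dx * (y - self.mean_y)

    def correlation(self):
        if self.m2_x == 0 or self.m2_y == 0:
            return float("nan")  # undefined until both series vary
        return self.c_xy / math.sqrt(self.m2_x * self.m2_y)


stats = RunningCovariance()
for s_j, s_k in [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]:  # illustrative sojourn times
    stats.update(s_j, s_k)
rho = stats.correlation()
```

Because each update needs only the current aggregates and the new pair of sojourn times, no individual record has to be retained after it has been processed.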
Turning to one detailed implementation, Figure 5 provides a summary flow diagram of one embodiment for analysing the data. At step 500, a first mobile wireless communications device enters a tracking area containing at least one cell capable of "associating" with the first mobile wireless communications device when the first mobile wireless communications device enters the detection range of a cell.
At step 502, each time the first mobile wireless communications device "associates" and subsequently "disassociates" with a cell, the mobility data for that cell ("nodeID", "entryTime" and "departureTime") and the "customerID" of the first mobile wireless communications device are sent to a database which may be located on a processing unit. In one simple implementation, a transition between two cells for a given device having a customer ID can be identified by sequentially tracking and correlating the departure time from a first cell with the temporally closest entry time to another cell, even if there is spatial/temporal discontinuity between cells. As an exemplary embodiment, if two mobility data records associated with the same customer ID reveal that the time gap between the departure time from one cell A and the entry time of a neighbouring cell B is less than a specific delay threshold (e.g., several seconds), then the transition from cell A to B for the said customer is considered to have taken place. Naturally, the delay threshold for a given pair of neighbouring cells can be determined and calibrated based on actual measurements from the deployed network.
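The delay-threshold rule described above can be sketched as follows. This is an illustrative sketch only: the function name find_transitions, the tuple layout of the records, and the 5-second default threshold are assumptions introduced here, and the check that cells A and B are actually neighbours is omitted for brevity.

```python
def find_transitions(records, max_gap=5.0):
    """Infer cell-to-cell transitions from closed mobility data records.

    records: iterable of (customer_id, node_id, entry_time, departure_time).
    max_gap: hypothetical delay threshold in seconds.
    Returns a list of (customer_id, from_node, to_node) tuples.
    """
    by_customer = {}
    for rec in records:
        by_customer.setdefault(rec[0], []).append(rec)

    transitions = []
    for customer_id, recs in by_customer.items():
        recs.sort(key=lambda r: r[2])  # order each customer's visits by entry time
        for prev, nxt in zip(recs, recs[1:]):
            gap = nxt[2] - prev[3]  # next entry time minus previous departure time
            if prev[1] != nxt[1] and gap < max_gap:
                transitions.append((customer_id, prev[1], nxt[1]))
    return transitions


example = [
    ("c1", 1, 0.0, 10.0),
    ("c1", 2, 12.0, 30.0),    # 2 s after leaving cell 1: counted as 1 -> 2
    ("c1", 5, 100.0, 120.0),  # 70 s after leaving cell 2: not counted
    ("c2", 2, 0.0, 5.0),
]
found = find_transitions(example)
```

In a deployment the threshold would be calibrated per pair of neighbouring cells, as the text notes, rather than fixed globally as in this sketch.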
In order to aggregate behaviour for multiple customers, at step 504 the processing unit containing the database updates a field in the database corresponding to the "customerID" of the first mobile wireless communications device with the mobility data.
At step 506, a second mobile wireless communications device enters the same tracking area containing at least one cell capable of "associating" with the second mobile wireless communications device when the second mobile wireless communications device enters the detection range of a cell.
In a similar manner to step 502, at step 508 each time the second mobile wireless communications device "associates" and subsequently "disassociates" with a cell, the mobility data for that cell ("nodeID", "entryTime" and "departureTime") and the "customerID" of the second mobile wireless communications device are sent to a database which may be located on a processing unit. In a similar manner to step 504, at step 510 the processing unit containing the database updates a field in the database corresponding to the "customerID" of the second mobile wireless communications device with the mobility data.
At step 512, and as discussed in more detail below, an analyser analyses the database fields corresponding to the "customerIDs" of the first and second mobile wireless communications devices. The analysis includes determining probabilities, correlation coefficients or any other parameter. For example, one parameter is the "spatial mobility", i.e. the movement from one cell to another. This can be represented by a directed graph in which each node corresponds to one cell and an edge points from node A to another node B if a mobile wireless communications device can possibly move from A to B directly (i.e., cells A and B are expected to be neighbours of each other). A mobile wireless communications device is considered to be located within the detection range of a cell (node) if the mobile wireless communications device is associated with that cell. In one embodiment, the data is updated for each transition to increase the probability value for the particular transition and decrease that for all other transitions, as set out below. This provides a simple and automated manner of continually or dynamically updating the data as more data is added.
Figure 7a shows an example tracking area comprising a plurality of cells 1 to 12. Figure 7b shows an example spatial-mobility graph corresponding to the tracking area of figure 7a. Referring now to figures 7a and 7b, the spatial mobility of a mobile wireless communications device in the tracking area of figure 7a is modelled by a graph G(V ∪ W, E), where V ∪ W is the set of nodes. Each node 1 to 12 in V represents a cell deployed in the tracking area of figure 7a. The set W has two fictitious nodes a and b, which are used to represent the "beginning" and the "end" of the journey of customers; node a is called the source and node b is called the sink. In figure 7b, node a is represented by label 0 and node b is represented by label 13. The journey of a given customer is deemed to have started when its presence is detected by any of the cells in the tracking area. Also, the journey of a given customer is deemed to have ended when no cell in the tracking area can detect the presence of that customer for a certain amount of time. Each edge e(i, j) in E shows the possibility of a customer moving directly from node i to node j. Each edge e(i, j) has a weight between 0 and 1 that corresponds to the probability of an arbitrary mobile wireless communications device in node i moving to node j as its next movement. When the node i is the fictitious node a, the weight of edge e(i, j) represents the probability that the journey of the customer starts at node j. Also, when the node j is the fictitious node b, the weight of edge e(i, j) represents the probability that the journey of the customer ends right after visiting node i. The graph G is directed because the probability of moving from node i to node j is not necessarily equal to that of moving from node j to node i.
Referring now to figure 7c, the edge weights in the graph G can be represented by a square matrix $M = [m_{ij}]$, $i, j = 0, \ldots, K+1$, where $K = |V|$. Each element $m_{ij}$ is the probability for an arbitrary mobile wireless communications device in node i to move directly to node j. Note that node 0 is the fictitious cell (node) representing the source and node K+1 is the fictitious cell (node) representing the sink. The entry of a customer into the sink node is regarded as the end of the customer's journey in the tracking area (service area). The matrix M is stochastic, i.e., each element $m_{ij}$ lies between 0 and 1 and each row sum is equal to 1. Using the mobility data collected above, all the elements of the matrix M are updated iteratively. Specifically, this method examines the database field associated with a specific "customerID" of a mobile wireless communications device. When it identifies a movement of the mobile wireless communications device from node i to node j, it updates the elements $m_{ik}$ for $k = 0, \ldots, K+1$ of the matrix M according to the following equation:

$$m_{ik}^{(n+1)} = \alpha\, m_{ik}^{(n)} + (1 - \alpha)\, \mathbf{1}_j(k) \qquad (1)$$

where $\mathbf{1}_j(k)$ is the indicator function equal to 1 if k = j and 0 elsewhere. That is, $\mathbf{1}_j(k) = 1$ for k = j indicates that the device moves from cell i to cell k, including the possibility of cell k being the source node 0 or the sink node K+1.
The matrix M is initialized in a way that ensures that all transitions between neighbouring nodes (adjacent cells) are equiprobable, i.e., for i = 0, ..., K, if node i has η neighbours, then $m_{ij} = 1/\eta$ if node j is a neighbour of node i and $m_{ij} = 0$ if node j is not a neighbour of node i. For i = K+1 and j = 0, ..., K, set $m_{ij} = 0$ and $m_{ii} = 1$. Note that, by convention, the source node (node 0) is initially a neighbour of all nodes representing cells deployed at the entrance of the tracking area. Note that although figure 7b shows only edges from the source node to nodes deployed at the entrance of the tracking area, edges are also possible from the source node to any other node in the deployment area. This is to capture the fact that a given customer may not have been detected by nodes deployed at the entrance of the tracking area. This may be because the customer had his/her mobile wireless communications device switched off at the time he/she entered the tracking area. Similarly, although figure 7b shows only edges from the nodes deployed at the exit of the tracking area to the sink node, edges are also possible from any other node in the tracking area to the sink node. This is to capture the fact that a customer may not have been detected by nodes deployed at the exit of the tracking area. This may be because the customer had his/her mobile wireless communications device switched off before he/she left the tracking area. According to the example provided in figure 7b, the entrance is also used to exit the tracking area; therefore, nodes 5 and 8 represent the cells deployed at the entrance and the exit of the tracking area at the same time. The updating of matrix M is performed according to equation (1), which shows how to calculate M at iteration (n+1) as a function of M at iteration (n): the row for node i obtained from the movements of the first (n+1) mobile wireless communications devices leaving node i is calculated from the row obtained from the movements of the first n such devices.
Hence the probability value for the detected transition is incremented, and all others decremented. Essentially, Equation (1) uses exponential smoothing to iteratively update the values of the branching probabilities. The value of α lies between 0 and 1, and is generally taken to be larger than 0.9. The most appropriate value of α for a given tracking area can be determined experimentally. By convention, all $m_{ii}$ values are set to 0 to represent the mobility from a node to itself.
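As an illustration of the initialisation and of the exponential-smoothing update of the branching probabilities, the following Python sketch stores M as a list of rows indexed 0 (source) to K+1 (sink). The function names, the `neighbours` dictionary format and the default value of α are assumptions chosen for the example; they do not come from the description.

```python
def init_transition_matrix(neighbours, K):
    """Build the (K+2) x (K+2) matrix M with equiprobable neighbour transitions.

    neighbours maps each node i in 0..K to the list of nodes reachable
    directly from i (values in 1..K+1, where K+1 is the sink).
    This input format is hypothetical.
    """
    size = K + 2
    M = [[0.0] * size for _ in range(size)]
    for i in range(K + 1):
        nbrs = neighbours.get(i, [])
        for j in nbrs:
            M[i][j] = 1.0 / len(nbrs)
    M[K + 1][K + 1] = 1.0  # sink row: the journey has ended
    return M


def update_transition(M, i, j, alpha=0.95):
    """Smooth row i towards the observed move from node i to node j."""
    for k in range(len(M[i])):
        indicator = 1.0 if k == j else 0.0
        M[i][k] = alpha * M[i][k] + (1.0 - alpha) * indicator
```

Because the smoothing scales the whole row by α and adds (1 - α) to a single entry, a row that sums to 1 before the update still sums to 1 afterwards, so M stays stochastic without renormalisation.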
Other example parameters that can be determined by the analyser include the probability distribution of the sojourn time of a specific mobile wireless communications device in any given node and the correlations between the sojourn times in any pair of nodes. These two parameters characterise the temporal aspects of mobile wireless communications device mobility. These parameters provide information about the amount of time mobile wireless communications devices spend in each node and reveal the degree of correlation for the time durations a mobile wireless communications device spends in any two nodes. Only nodes representing cells that are deployed in the tracking area are considered; the fictitious nodes are excluded. Therefore, in the rest of the description, the term node means any node in the set V. The distribution of the relative frequency of visits to each node is defined as the relative percentage of the number of associations/disassociations made by a wireless communication device with a given node compared to those made with the other nodes. Let $F = (f_1, \ldots, f_K)$ be a vector that describes the relative frequency of visits to all nodes. Each element $f_j$, j = 1, ..., K, describes the relative frequency of visits to node j. Therefore $0 \le f_j \le 1$ for j = 1, ..., K and $f_1 + \ldots + f_K = 1$. When the method receives a mobility data record associated with node j, it invokes the following Algorithm 3 to update $F = (f_1, \ldots, f_K)$. The algorithm is run at the analyser each time the database is updated with a new mobility data record. The updating of F is performed iteratively and $F^{(n)}$ is the value of F after processing n mobility data records. Initially, this method sets all values of F to 1/K to start with an equiprobable distribution, i.e. $f_j^{(0)} = 1/K$ for all j.
Algorithm 3
Data: R: mobility data record
j = R.nodeID
for k = 1, ..., K do
    if k = j then
        f_k^(n+1) = β f_k^(n) + 1 - β
    else
        f_k^(n+1) = β f_k^(n)
    end
end
where β is properly chosen between 0 and 1
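Algorithm 3 can be sketched in Python as follows. The vector F is held as a list in which position k-1 stores the relative frequency for node k; the function name and the default value of β are assumptions for the example.

```python
def update_visit_frequencies(F, j, beta=0.95):
    """Return the updated frequency vector after a record for node j (1-based).

    Every entry is scaled by beta; the entry of the visited node
    additionally receives the remaining weight (1 - beta).
    """
    return [
        beta * f + ((1.0 - beta) if k == j else 0.0)
        for k, f in enumerate(F, start=1)
    ]
```

Starting from the equiprobable vector with K = 3 and β = 0.9, a record for node 2 gives approximately (0.3, 0.4, 0.3), which still sums to 1.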
The sojourn time is defined as the amount of time a mobile wireless communications device spends in any given node. The probability distribution characterizing the sojourn time is chosen as a way to limit the number of model parameters in the representation of mobile wireless communications device mobility and to enable the deletion of mobility data after it is processed and aggregated into the model parameters. The deletion of raw data from the system after processing can help preserve the privacy and anonymity of customers using mobile wireless communications devices and avoid memory costs and risks associated with storage of a huge volume of data. For example, a technique similar to the one presented by K. K. Leung in "Power Control by Kalman Filter With Error Margin for Wireless IP Networks", in Proceeding of IEEE WCNC, Chicago, IL, September 2000 can be used to obtain the sojourn-time distribution for each node as follows.
Let $\tau_{max}$ be the maximum amount of time a mobile wireless communications device may spend at a particular node. For ease of computation, we divide $\tau_{max}$ into L time intervals of equal duration. Let the time intervals be indexed by l = 1, 2, ..., L so that for a given interval l, the sojourn time ranges from $\tau_{max}(l - 1)/L$ to $\tau_{max}\, l/L$.
Let $P_j(l)$ be the probability that the sojourn time of an arbitrary mobile wireless communications device in node j is less than or equal to $\tau_{max}\, l/L$, for each node j from 1 to K and every time interval l from 1 to L. For a given node j, the values of $P_j(l)$ for all l represent an approximate cumulative distribution function (CDF) for the sojourn time of a specific mobile wireless communications device in node j. When the method receives a mobility data record associated with node j, it invokes the following Algorithm 4 to update $P_j(l)$ for all time intervals l. The algorithm is run at the analyser each time the database is updated with a new mobility data record. The updating of $P_j$ is performed iteratively and $P_j^{(n)}(l)$ is the value of $P_j(l)$ after processing n mobility data records for node j. Initially, this method sets all values of $P_j(l)$ to 1, i.e. $P_j^{(0)}(l) = 1$ for all l and j.
Algorithm 4
Data: R: mobility data record
j = R.nodeID
τ = R.departureTime - R.entryTime
l = ⌈τ L / τ_max⌉, where ⌈x⌉ is the smallest integer that is larger than or equal to x
for k = 1, ..., l - 1 do
    P_j^(n+1)(k) = γ P_j^(n)(k)
end
for k = l, ..., L do
    P_j^(n+1)(k) = γ P_j^(n)(k) + 1 - γ
end
where γ is properly chosen between 0 and 1
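The following Python sketch shows one way the sojourn-time CDF update of Algorithm 4 could look. It assumes that entries for intervals shorter than the observed sojourn time are scaled by γ, and that the remaining entries are scaled by γ and then increased by (1 - γ), following the same exponential-smoothing pattern as Algorithm 3. The function name, the clamping of the interval index to the range 1..L and the default γ are further assumptions.

```python
import math


def update_sojourn_cdf(P_j, sojourn, tau_max, gamma=0.95):
    """Return the updated approximate CDF for one node.

    P_j[k-1] approximates the probability that the sojourn time is at most
    tau_max * k / L, where L = len(P_j).
    """
    L = len(P_j)
    # Index of the interval containing the observed sojourn time,
    # clamped to 1..L (clamping is an assumption for out-of-range values).
    l = min(L, max(1, math.ceil(sojourn * L / tau_max)))
    return [
        gamma * p + ((1.0 - gamma) if k >= l else 0.0)
        for k, p in enumerate(P_j, start=1)
    ]
```

With L = 4, a maximum of 40 time units and an observed sojourn time of 25, the observation falls in interval 3, so only the entries for intervals 1 and 2 are reduced.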
To capture the possible dependencies between the sojourn times of mobile wireless communications devices in two different nodes, the correlations between the sojourn times in these nodes are calculated. This calculation requires a consideration of the entire journey of each mobile wireless communications device from the entry point of the tracking area to the exit point of the tracking area. The sojourn times of mobile wireless communications device i in the sequence of nodes for the same journey are obtained from R(i), which denotes the set of mobility data records of mobile wireless communications device i during its journey. Specifically, for the mobile wireless communications device i, let $R = \{R_1, R_2, \ldots, R_N\}$ where $R_v$, v = 1, ..., N, represents the mobility data record created and completed after a mobile wireless communications device disassociates with the last node and leaves the tracking (service) area. The mobile wireless communications device is considered to have left the tracking area if no mobility record is created for it for a certain amount of time. This time also takes into account the mobile wireless communications devices which are no longer detectable due to other causes, such as being switched off by customers or battery shortage. The index i has been dropped from the notation for brevity.
The sojourn times of the mobile wireless communications device per visit to a node can be directly obtained from the set of records R by subtracting the value of the entryTime field from the value of the departureTime field of each mobility data record. We use S to represent these sojourn times. We have $S = \{S_1, \ldots, S_v, \ldots, S_N\}$ where $S_v = R_v$.departureTime - $R_v$.entryTime for v = 1, ..., N.
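A short Python sketch of how one device's records might be grouped into journeys and turned into sojourn times is given below. Records are represented as (nodeID, entryTime, departureTime) tuples sorted by entry time; this representation, the function names and the `timeout` parameter (standing in for the "certain amount of time" after which the device is considered to have left) are assumptions for illustration.

```python
def split_journeys(records, timeout):
    """Group time-ordered records of one device into separate journeys.

    A new journey starts when the gap between the previous departure and
    the next entry exceeds timeout.
    """
    journeys, current = [], []
    for rec in records:
        if current and rec[1] - current[-1][2] > timeout:
            journeys.append(current)
            current = []
        current.append(rec)
    if current:
        journeys.append(current)
    return journeys


def sojourn_times(journey):
    """Return the sojourn time of each visit: departure time minus entry time."""
    return [departure - entry for _, entry, departure in journey]
```

For example, three records with a long idle gap before the third are split into a two-visit journey and a one-visit journey.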
The calculation of the correlation coefficients involves the calculation of the covariance $\sigma_{jk}$ between the sojourn times of an arbitrary mobile wireless communications device for any pair of nodes j, k = 1, ..., K. The calculation of the covariance requires the calculation of the mean and the standard deviation of the sojourn times at each node, which can be calculated by examining every mobility data record of the set R. For every mobility data record $R_v$, v = 1, ..., N, the node identifier is obtained by accessing the nodeID field. Let j be the node visited during the visit in which $R_v$ was created and completed, i.e., j = $R_v$.nodeID. $S_v$ is used to update the value of the mean sojourn time $\mu_j$ at node j according to the following equation:

$$\mu_j^{(n+1)} = \delta\, \mu_j^{(n)} + (1 - \delta)\, S_v \qquad (2)$$

where δ is a properly chosen parameter between 0 and 1. Equation (2) shows that the updating of $\mu_j$ is performed iteratively and $\mu_j^{(n)}$ is the value of $\mu_j$ after processing n mobility data records in which the nodeID field is equal to j. Initially, all the mean sojourn time values are equal to 0, i.e., $\mu_j^{(0)} = 0$ for j = 1, ..., K.
The standard deviation $\sigma_j$ of the sojourn time at node j is calculated according to the following equation:

$$\sigma_j^{(n+1)} = \xi\, \sigma_j^{(n)} + (1 - \xi)\, \left| S_v - \mu_j^{(n+1)} \right| \qquad (3)$$

where ξ is a properly chosen parameter between 0 and 1 and |x| denotes the absolute value of x. Equation (3) shows that the updating of $\sigma_j$ is performed iteratively and $\sigma_j^{(n)}$ is the value of $\sigma_j$ after processing n mobility data records in which the nodeID field is equal to j. Initially, all the standard deviations of sojourn time values are equal to 0, i.e., $\sigma_j^{(0)} = 0$ for j = 1, ..., K.
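The iterative mean and deviation updates for one node can be sketched in Python as below. The sketch assumes that the deviation is smoothed towards the absolute difference between the new sojourn time and the freshly updated mean; the function name and default parameter values are illustrative assumptions.

```python
def update_mean_and_deviation(mu, sigma, s, delta=0.9, xi=0.9):
    """Return (new_mu, new_sigma) for one node after observing sojourn time s.

    mu and sigma are the current smoothed mean and deviation for the node.
    """
    new_mu = delta * mu + (1.0 - delta) * s
    # The deviation uses the already updated mean (assumption stated above).
    new_sigma = xi * sigma + (1.0 - xi) * abs(s - new_mu)
    return new_mu, new_sigma
```

Only two numbers per node are kept between updates, which is what allows the raw record to be discarded once it has been processed.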
The calculation of the correlation coefficient requires the consideration of all pairs of mobility data records ($R_v$, $R_w$) for v, w = 1, ..., N and v < w. Formally, let j = $R_v$.nodeID and k = $R_w$.nodeID for each pair of mobility data records ($R_v$, $R_w$). The sojourn times $S_v$ and $S_w$ extracted from $R_v$ and $R_w$ respectively are used to update the covariance $\sigma_{jk}$ between the sojourn times at nodes j and k, according to the following equation:
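$$\sigma_{jk}^{(n+1)} = \omega \left( \sigma_{jk}^{(n)} + \mu_j^{(n)} \mu_k^{(n)} \right) + (1 - \omega)\, S_v S_w - \mu_j^{(n+1)} \mu_k^{(n+1)} \qquad (4)$$

where ω is a smoothing parameter between 0 and 1. Equation (4) shows that the updating of $\sigma_{jk}$ is performed iteratively and $\sigma_{jk}^{(n)}$ is the value of $\sigma_{jk}$ after processing n pairs of mobility data records ($R_v$, $R_w$) in which $R_v$.nodeID = j and $R_w$.nodeID = k. Initially, all the covariance values are equal to 0, i.e., $\sigma_{jk}^{(0)} = 0$ for j, k = 1, ..., K. Finally, the value of the correlation coefficient $\rho_{jk}^{(n+1)}$ is obtained according to equation (5):

$$\rho_{jk}^{(n+1)} = \frac{\sigma_{jk}^{(n+1)}}{\sigma_j^{(n+1)}\, \sigma_k^{(n+1)}} \qquad (5)$$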
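A hedged Python sketch of the covariance and correlation step follows. It assumes one plausible reading of the covariance update: the smoothed quantity is the product moment of the two sojourn times (previous covariance plus the product of the previous means), from which the product of the updated means is then subtracted. The function names, the parameter name `omega` and the choice to return 0 when a deviation is zero are assumptions for the example.

```python
def update_covariance(cov, mu_j_old, mu_k_old, mu_j_new, mu_k_new,
                      s_v, s_w, omega=0.9):
    """Return the updated covariance between sojourn times at nodes j and k.

    The previous product moment (cov + mu_j_old * mu_k_old) is smoothed
    towards s_v * s_w, then the product of the new means is removed.
    """
    moment = omega * (cov + mu_j_old * mu_k_old) + (1.0 - omega) * s_v * s_w
    return moment - mu_j_new * mu_k_new


def correlation(cov, sigma_j, sigma_k):
    """Correlation coefficient: covariance over the product of deviations."""
    if sigma_j == 0 or sigma_k == 0:
        return 0.0  # assumption: treat as uncorrelated until deviations exist
    return cov / (sigma_j * sigma_k)
```

In use, the means would first be updated for the pair of records, and both the old and the new means passed to `update_covariance`.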
After the processing of all the pairs of mobility data records ($R_v$, $R_w$), v, w = 1, ..., N with v < w, the set R of mobility data records can be disposed of permanently. Hence mobility data is obtained and aggregated, allowing the identification of trends such as the correlation between residence in central locations and their respective timing, allowing customer behaviour patterns to be developed.
Since the set R of mobility data records can be disposed of permanently after all the pairs of mobility data have been processed, data linking individual mobile wireless communications devices to locations and times is not maintained. As a result, only the mobility of groups of mobile wireless communications devices is stored, and the privacy of individual customers of mobile wireless communications devices is therefore maintained. Furthermore, the disposal of mobility data records R after processing reduces the cost of data storage and minimises the security threats related to the stored data.
Yet further, data can be aggregated by customer type, for example customers can be categorized based on data associated with their identification. Such data may be extracted, for example, from data available to a network provider and can allow categorization of a customer at a generic level, again without requiring privacy related data. Hence, in addition to an overall aggregated data set, customer-categorized aggregated sub-sets can also be developed and stored.
Figure 6 shows an exemplary network for collecting and analysing mobility data records. Femtocells 600 are located in a tracking area. The cells may have a coverage that includes the entire tracking area, or may only include a portion of it. As an example, the tracking area could be a supermarket, a shopping mall, or any other area where it might be desirable to track customer movement.
The cells 600 "associate" and "disassociate" with mobile wireless communications devices 610, as described above in relation to figures 1-5. The mobile wireless communications devices may be mobile phones held by customers, for example. As such the described method tracks the movement of customers in a tracking area using mobile phones that they carry about their person.
Mobility data collected by the cells is transmitted, via the internet 604, to a processing unit which may be a gateway 606. The gateway 606 collects presence information from the presence server, and creates and completes customer mobility data records. It uses these records to calculate the branching probabilities, the distributions of the sojourn times, and the correlation coefficients between the sojourn times at different cells. As mentioned above, the processing unit updates a database field with the mobility data for each mobile wireless communications device corresponding to the "customerID" and the "nodeID". Furthermore, the gateway may also include the analyser which calculates the parameters to characterise mobility, as mentioned above.
The gateway may interact with the network of a service provider. In doing so, the gateway may request further information from the service provider related to a "customerID" of an individual mobile wireless communications device. For example, the gateway may request the age of a customer using the specific mobile wireless communications device. In this way, the gateway is able to calculate parameters for an individual age range of customers, for example. Other than the age range of customers, other data related to the customer could be requested from the service provider to provide more detailed mobility parameters.
In one embodiment, the disclosed approach can operate on a real-time or non-real-time basis. When the approach does not require real-time customer mobility data, the gateway 606 may ask, for example, the presence server 602 or other appropriate entity, to accumulate the data locally and transmit the accumulated data from time to time to reduce the overhead and frequency of interaction between the gateway and the presence server, thereby improving the overall scalability of the approach.
In the embodiment discussed, mobility data is stored in a database and each customer can be identified by a unique identity (ID) such as an IMSI, TMSI, IMEI or MAC address. This ID could make it possible to identify who the customer is and obtain further personal information, but the use of femtocells makes it possible to overcome these limitations and provides additional options for new services. Specifically, the infrastructure, i.e., the deployed femtocells, makes it possible to communicate with customers. Therefore, a text message for example can be sent to customers to inform them about tracking and its options (anonymous or not). Specifically, customers can be offered the option of agreeing or disagreeing with the mobility tracking and aggregated profiling. They can then make their choices and communicate their decisions to the system. In any case, mobility data of every individual customer is processed and aggregated in real time with previous mobility data, and is cleared right after that. Therefore, customers remain completely anonymous. Additionally, in this embodiment it may be the presence server instead of the gateway that communicates with a service provider to request further customer data. Such customer data may include customer age, gender, residential area, etc. so that the data can be used to establish categories of aggregated mobility profiles for various customer types.
It will be seen that the approach can be implemented in a network/location of any type or scale and using any appropriate type of wireless technology allowing the desired level of resolution and data exchange. Any type of ID can be relied on, which can be inherent to the technology adopted or an add-on.
Further, in the description above each node in the graph represents a cell, but the approach can be extended so that each node in the graph represents a set of neighbouring cells.
Although the described sojourn time is the time a customer spends at a node per visit to the node, the definition can be extended to include repeated and separate visits to the same node during a given journey. Thus, the sojourn time can also be defined as the cumulative time a customer spends at the same node in the whole journey. The approach can also be extended to establish/use different graph models and distributions for various customer categories which are defined according to customers' age, gender, residential area, etc. In this case, customer information may be needed from the service providers so that the approach can process and aggregate a given mobility data record into the corresponding graph model and distributions associated with the given customer, after which the private information is discarded and irretrievable.
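The per-category aggregation described above can be pictured with the following Python sketch, which keeps one visit-frequency vector for all customers and one per age band. The class name, the ten-year age bands, the "all" key and the use of the visit-frequency vector as the only model are assumptions chosen to keep the example short; the age value is used once to pick the category and is not stored.

```python
def age_band(age, width=10):
    """Map an age to a coarse band label such as '30-39' (hypothetical scheme)."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


class CategorisedVisitFrequencies:
    """Visit-frequency vectors per customer category, plus an overall vector."""

    def __init__(self, K, beta=0.95):
        self.K = K
        self.beta = beta
        self.models = {}  # category label -> frequency vector of length K

    def _update(self, key, node_id):
        F = self.models.setdefault(key, [1.0 / self.K] * self.K)
        self.models[key] = [
            self.beta * f + ((1.0 - self.beta) if k == node_id else 0.0)
            for k, f in enumerate(F, start=1)
        ]

    def add_record(self, node_id, age=None):
        self._update("all", node_id)
        if age is not None:
            # Only the band label is kept; the age itself is discarded.
            self._update(age_band(age), node_id)
```

The same pattern would extend to the transition matrix and sojourn-time distributions by storing one set of model parameters per category label.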
Further, because of the storage of aggregated data and update of all values associated with the mobility profiling each time data is collected, received and processed, additional cells can also be added dynamically and automatically and their values populated as the data set grows by simple extension of the data and without requiring a rewrite of the existing structure.
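One way to picture the dynamic addition of cells is to store the branching probabilities sparsely, keyed by node identifier, so that a new cell simply creates new entries. The Python sketch below does this with nested dictionaries; the storage layout, the function name and the rule that the first transition observed from a new node receives probability 1 are assumptions for illustration and are not taken from the description.

```python
def record_transition(M, i, j, alpha=0.95):
    """Update sparse branching probabilities after a move from node i to node j.

    M maps a node identifier to a dictionary {next node: probability}.
    Unknown nodes and edges are created on first use.
    """
    row = M.setdefault(i, {})
    if not row:
        row[j] = 1.0  # assumption: first observation from a newly added node
        return
    row.setdefault(j, 0.0)
    for k in row:
        indicator = 1.0 if k == j else 0.0
        row[k] = alpha * row[k] + (1.0 - alpha) * indicator
```

Existing rows are untouched when a new cell appears, and each row continues to sum to 1 after every update.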

Claims

1. A system for collecting and characterising mobility data of a plurality of mobile wireless communications devices, the system comprising:
at least a first and a second detector having a respective first and second detection range and each arranged to detect entry and departure of a mobile wireless communications device into the detection range and store associated mobility data; and
an aggregator arranged to aggregate mobility data received from the detectors.
2. The system of claim 1 wherein the mobility data comprises at least one of an entry time defined as the time when a mobile wireless communications device enters the detection range, a mobile wireless communications device identifier, a departure time defined as the time when a mobile wireless communications device departs from the detection range and a detector identifier.
3. The system of any preceding claim wherein the mobility data is used to construct a mobility profile.
4. The system of claim 3 wherein the mobility profile contains information regarding at least one of: the probability of a user of a mobile wireless communications device moving from one cell to another, the distribution of the amount of time a customer spends in each cell and the correlations between these times.
5. The system of any preceding claim wherein the aggregator aggregates mobility data in the form of coefficient values in a data set.
6. The system of any preceding claim wherein the aggregator updates coefficient values in a data set as additional mobility data is received.
7. The system of claim 5 or 6 wherein the coefficient value comprises at least one of a probability of transition between two detection ranges, a probability of time spent in a detection range and the correlation between times spent in two detection ranges.
8. The system of claim 5 or claim 6 wherein, after the aggregator has updated the mobility coefficients, the mobility data is deleted.
9. The system of claim 2 wherein the time spent in the detection range is calculated from the entry time and the departure time.
10. The system of any preceding claim wherein the detector is a femtocell or a WiFi access point base unit.
11. The system of claim 2 wherein the mobile wireless communications device identifier is one of an International Mobile Subscriber Identity (IMSI), a Temporary Mobile Subscriber Identity (TMSI) or an International Mobile Equipment Identity (IMEI).
12. A method for collecting and characterising mobility data of a plurality of mobile wireless communications devices, the method comprising:
detecting the entry and departure of a mobile wireless communications device into a first or second detection range of respective first and second detectors,
storing associated mobility data on the first or second detector, receiving the mobility data at an aggregator and aggregating the mobility data.
13. The method of claim 12 for use in the system of any one of claims 2 to 11.
14. A computer readable medium comprising instructions for performing the method of claims 12 or 13.
15. A computer processor arranged to operate under the instructions of the computer readable medium of claim 14.
16. A system for collecting and characterising mobility data of at least one mobile wireless communications device, the system comprising:
at least a first and a second detector having a respective first and second detection range and each arranged to detect entry and departure of a mobile wireless communications device into the detection range and store associated mobility data; and
a processor arranged to receive mobility data from the detectors and correlate the mobility data obtained in at least two detection ranges.
17. The system of claim 16 wherein the mobility data comprises at least one of an entry time defined as the time when a mobile wireless communications device enters the detection range, a mobile wireless communications device identifier, a departure time defined as the time when a mobile wireless communications device departs from the detection range and a detector identifier.
18. The system of claims 16 or 17 wherein the mobility data is used to construct a mobility profile.
19. The system of claim 18 wherein the mobility profile contains information regarding at least one of: the probability of a user of a mobile wireless communications device moving from one cell to another, the distribution of the amount of time a customer spends in each cell and the correlations between these times.
20. The system of either of claims 16 or 17 wherein the correlated mobility data includes at least one of a probability of transition between two detection ranges, a probability of time spent in a detection range and the correlation between times spent in two detection ranges.
21. The system of claim 20 wherein the time spent in a detection range is defined as the departure time minus the entry time.
22. The system of any of claims 16 to 21 wherein, when new mobility data is received from the detectors, the processor correlates the new mobility data between the two detection ranges.
23. The system of any of claims 16 to 22 wherein the processor correlates the mobility data for specific customer types.
24. The system of claim 23 wherein the customer type is determined from information related to the mobility data.
25. The system of claim 24 wherein the processor requests the information from a service provider.
26. The system of claim 24 wherein a device other than the processor requests the information from a service provider.
27. The system of any of claims 16 to 26 wherein the processor is located on one of the detectors.
28. A method for collecting and characterising mobility data of at least one mobile wireless communications device, the method comprising:
detecting the entry and departure of a mobile wireless communications device into a first or second detection range of respective first and second detectors,
storing associated mobility data on the first or second detector, receiving the mobility data at a processor, and
correlating the mobility data obtained in at least two detection ranges.
29. The method of claim 28 for use in the system of any one of claims 16 to 27.
30. A computer readable medium comprising instructions for performing the method of claims 28 or 29.
31. A computer processor arranged to operate under the instructions of the computer readable medium of claim 30.
32. A system, method, medium or processor substantially as described herein with reference to the accompanying drawings.
PCT/GB2013/051534 2012-06-14 2013-06-11 Aggregated mobility profiling WO2013186552A2 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
GB1210549.0 2012-06-14
GB201210549A GB201210549D0 (en) 2012-06-14 2012-06-14 Aggregated mobility profiling

Publications (2)

Publication Number Publication Date
WO2013186552A2 true WO2013186552A2 (en) 2013-12-19
WO2013186552A3 WO2013186552A3 (en) 2014-02-13

Family

ID=46605949

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/GB2013/051534 WO2013186552A2 (en) 2012-06-14 2013-06-11 Aggregated mobility profiling

Country Status (2)

Country Link
GB (1) GB201210549D0 (en)
WO (1) WO2013186552A2 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9848301B2 (en) 2015-11-20 2017-12-19 At&T Intellectual Property I, L.P. Facilitation of mobile device geolocation
US9998876B2 (en) 2016-07-27 2018-06-12 At&T Intellectual Property I, L.P. Inferring user equipment location data based on sector transition

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
K. K. LEUNG: "Power Control by Kalman Filter With Error Margin for Wireless IP Networks", PROCEEDING OF IEEE WCNC, September 2000 (2000-09-01)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9848301B2 (en) 2015-11-20 2017-12-19 At&T Intellectual Property I, L.P. Facilitation of mobile device geolocation
US10219115B2 (en) 2015-11-20 2019-02-26 At&T Intellectual Property I, L.P. Facilitation of mobile device geolocation
US9998876B2 (en) 2016-07-27 2018-06-12 At&T Intellectual Property I, L.P. Inferring user equipment location data based on sector transition
US10595164B2 (en) 2016-07-27 2020-03-17 At&T Intellectual Property I, L.P. Inferring user equipment location data based on sector transition

Also Published As

Publication number Publication date
WO2013186552A3 (en) 2014-02-13
GB201210549D0 (en) 2012-07-25

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 13731156

Country of ref document: EP

Kind code of ref document: A2

122 Ep: pct application non-entry in european phase

Ref document number: 13731156

Country of ref document: EP

Kind code of ref document: A2