WO2019111007A1

WO2019111007A1 - Personal data management

Info

Publication number: WO2019111007A1
Application number: PCT/GB2018/053549
Authority: WO
Inventors: Gordon Johnston Robertson Povey
Original assignee: Trisent Limited
Priority date: 2017-12-06
Filing date: 2018-12-06
Publication date: 2019-06-13
Also published as: GB2584042B; GB201720341D0; GB2584042A; GB2584042C; GB202011783D0

Abstract

A method of searching comprising performing a first search; creating a search query for a second search based at least on the results of the first search, wherein one of the first search and second search is a search performed on a store of collected personal data and the other of the first search and second search is a search performed on at least one store of publicly available data.

Description

Personal Data Management

Introduction

The present invention relates to a method and system for personal data management.

Background

Known applications are available that help a user store and organise photos, track fitness or log business miles, and a number of other applications and social media platforms also gather personal information about a user’s life. However, this personal data is scattered across disparate applications and databases. In reality, there are limited opportunities to manage our own personal data while maintaining privacy.

Collectively users hold much more personal data than all of the data available on the Internet, and yet there are no simple tools available to help us capture and manage this in a way to benefit ourselves.

One aspect of personal data is user activity data. Activity monitoring using a mobile device may rely on heavy usage of system resources, in particular, a battery of the mobile device.

There is a need for improved methods and systems for personal data management and activity monitoring.

Known search engines harvest personal data through tracking and use this to target advertising. Known 'non-tracking' search engines do not expose data but may not provide personalisation. There is a need to provide improved search methods that provide both privacy and personalisation.

Summary

According to a first aspect there is provided a method of monitoring user activity comprising:

obtaining first location data representative of a first location of a user device and determining a dwell area surrounding the first location;

obtaining second location data representative of a second location of the user;

determining that the second location is substantially inside the dwell area or substantially outside the dwell area based at least on the obtained second location data;

determining that the user is moving or stationary based on at least the second location being substantially outside the dwell area or based on at least the second location being substantially within the dwell area, respectively.

Obtaining location data may comprise performing one or more measurements using one or more sensors and/or selecting one or more location data sources. The one or more sensors and/or location data sources may form part of or be associated with the user device.

The one or more sensors may include at least one of: a GPS sensor; at least part of an RF transceiver for receiving network calls, messages or data, or associated circuitry; an accelerometer.

The one or more location data sources may include a processor and/or memory resource of the user device. The one or more location data sources may be one or more processes running on the processor and/or memory resource of the user device.

The user device may establish a list of available location data sources and select one or more of the available location data sources.

The dwell area may be defined by one or more parameters representative of a permitted distance. The dwell area may be any suitable shape, for example, a circle. The dwell area may be a circle centred on a location measurement and the parameter representative of a permitted distance may be a radius.

The method may further comprise measuring a time spent by the user device within the dwell area. The determining that the user is moving or stationary may be further based on a comparison of the measured time with a dwell time.
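The dwell area and dwell time test described above can be illustrated with a minimal sketch. It assumes a circular dwell area centred on the first location measurement; the `Fix` type, the function names, the 50 m radius and the 300 s dwell time are hypothetical choices made for illustration and are not taken from the description.

```python
import math
from dataclasses import dataclass

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius used by the haversine formula


@dataclass
class Fix:
    lat: float        # latitude in degrees
    lon: float        # longitude in degrees
    timestamp: float  # time of the measurement in seconds


def distance_m(a: Fix, b: Fix) -> float:
    """Great-circle distance between two fixes in metres (haversine formula)."""
    phi1, phi2 = math.radians(a.lat), math.radians(b.lat)
    dphi = phi2 - phi1
    dlmb = math.radians(b.lon - a.lon)
    h = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(h))


def classify(first: Fix, second: Fix, radius_m: float = 50.0, dwell_time_s: float = 300.0) -> str:
    """Return 'moving', 'stationary' or 'undecided' for the second fix."""
    if distance_m(first, second) > radius_m:
        return "moving"        # second location lies outside the dwell area
    if second.timestamp - first.timestamp >= dwell_time_s:
        return "stationary"    # inside the dwell area for at least the dwell time
    return "undecided"         # inside the dwell area, but not yet for long enough
```

In this sketch a second fix inside the dwell area is only reported as stationary once the dwell time has elapsed; before that the result is left undecided so that a caller can request a further measurement.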

Dwell area and/or dwell time may be selected to be fixed parameters. Dwell area and/or dwell time may be varied and/or adaptive. Dwell area and/or dwell time may be selected based on, for example, whether the user device is determined to be moving or stationary or known behaviour, patterns and/or trends of a user associated with the user device.

The first and/or second location data may comprise first and/or second location measurements and further data representative of the accuracy of the first and second location measurements.

The method may further comprise iteratively obtaining second location data representative of the second location until an accuracy condition is satisfied. Obtaining further second location data may comprise performing one or more further measurements of the second location.

The accuracy condition may be one of:

a) the second location measurement is more accurate than an accuracy threshold;

b) the accuracy of the second measurement allows a determination that the second location is substantially inside a dwell area or substantially outside the dwell area based on the obtained second location.
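Accuracy condition b) may be pictured as follows. This is a sketch only, assuming that the accuracy of a measurement is reported as an error radius in metres around the measured position and that the dwell area is a circle; the function names and the `max_attempts` limit are hypothetical.

```python
from typing import Callable, Optional, Tuple


def dwell_decision(distance_m: float, accuracy_m: float, radius_m: float) -> Optional[str]:
    """Return 'inside', 'outside', or None when the error circle straddles the boundary."""
    if distance_m + accuracy_m <= radius_m:
        return "inside"    # even the worst-case position is within the dwell area
    if distance_m - accuracy_m >= radius_m:
        return "outside"   # even the best-case position is beyond the dwell area
    return None            # the measurement is too coarse to decide


def measure_until_decided(
    measure: Callable[[], Tuple[float, float]],
    radius_m: float,
    max_attempts: int = 5,
) -> Optional[str]:
    """Repeat measurements until the accuracy condition is satisfied or attempts run out.

    `measure` is a hypothetical callable returning (distance from the dwell
    centre in metres, accuracy in metres) for a fresh second location measurement.
    """
    for _ in range(max_attempts):
        distance_m, accuracy_m = measure()
        decision = dwell_decision(distance_m, accuracy_m, radius_m)
        if decision is not None:
            return decision
    return None
```

A coarse measurement close to the boundary therefore triggers a further measurement, whereas a coarse measurement far from the boundary can be accepted as it stands.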

The method may further comprise determining a mode of movement or transport using obtained location data and/or further data.

The further data may be:

a) motion data provided by one or more motion sensors of or associated with the user device;

b) external contextual data associated with the first or the second location, obtained from at least one of: a memory resource of the user device; an external server; a data capturing device of or associated with the user device, for example, image capturing apparatus, audio capturing apparatus, video capturing apparatus.

Determining a mode of movement or transport may further comprise performing a confirmation process. The confirmation process may comprise calculating a speed of movement using location data and comparing the determined speed of movement with one or more typical values associated with the determined mode of movement or transport. Determining a mode of movement or transport may comprise selecting the most likely mode of movement or transport from a number of different modes of movement or transport. Determining a mode of movement or transport may further comprise calculating a probability or likelihood that movement represented by the location data is one of a number of modes of movement and transport. Selecting the most likely mode may be based on the calculated probability or likelihood.

Modes of transport may include:

walking, jogging, sprinting, cycling, skiing, air travel, travel by boat, rail travel.
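The selection of a most likely mode followed by a speed-based confirmation may be sketched as below. The speed ranges are illustrative values only and are not taken from the description; the likelihoods are assumed to be supplied by some earlier processing step, and falling back to the next most likely mode when confirmation fails is an assumption of this sketch.

```python
from typing import Dict, Optional, Tuple

# Hypothetical typical speed ranges in metres per second (illustrative values only).
TYPICAL_SPEED_MPS: Dict[str, Tuple[float, float]] = {
    "walking": (0.5, 2.0),
    "jogging": (2.0, 4.5),
    "cycling": (3.0, 12.0),
    "rail travel": (10.0, 90.0),
}


def most_likely_mode(likelihoods: Dict[str, float]) -> str:
    """Select the mode with the highest calculated likelihood."""
    return max(likelihoods, key=likelihoods.get)


def confirm_mode(mode: str, distance_m: float, elapsed_s: float) -> bool:
    """Confirmation process: compare the calculated speed with the typical range.

    Assumes elapsed_s is greater than zero.
    """
    speed = distance_m / elapsed_s
    low, high = TYPICAL_SPEED_MPS[mode]
    return low <= speed <= high


def determine_mode(likelihoods: Dict[str, float], distance_m: float, elapsed_s: float) -> Optional[str]:
    """Try candidate modes from most to least likely; return the first confirmed one."""
    for mode in sorted(likelihoods, key=likelihoods.get, reverse=True):
        if confirm_mode(mode, distance_m, elapsed_s):
            return mode
    return None
```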

Obtaining second location data may comprise requesting location data at a sampling frequency. The sampling frequency may be based on at least one of: determining that the user is substantially moving, determining that the user is substantially stationary, a determined mode of transport.

The sampling frequency and/or requested accuracy may be based on a status of the user device, for example, a remaining battery life. The sampling frequency and/or requested accuracy may be based on output from a processor or memory resource of the user device. For example, other active processes on the user device may provide an output. For example, another active process on the user device may already be requesting one or more location data. The sampling frequency and/or requested accuracy may be based on one or more user preferences.
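One possible way of combining the motion state and the remaining battery life into a sampling interval is sketched below. The base intervals, the low-battery threshold and the doubling rule are hypothetical values chosen for illustration.

```python
def sampling_interval_s(
    state: str,
    battery_fraction: float,
    base_moving_s: float = 30.0,
    base_stationary_s: float = 300.0,
    low_battery: float = 0.2,
) -> float:
    """Return the time in seconds to wait before the next location request."""
    # Sample more often while moving than while stationary.
    interval = base_moving_s if state == "moving" else base_stationary_s
    # Sample half as often when the remaining battery life is low.
    if battery_fraction < low_battery:
        interval *= 2
    return interval
```

A user preference could be honoured in the same way, for example by passing different base intervals.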

The sampling frequency and/or requested accuracy and/or selecting a location data source may be based on at least one of: a user preference, a system parameter, a property of the user device or a characteristic of a user associated with the user device or previously determined activities, trends or behaviour of a user associated with the user device. Obtaining first and/or second location data may comprise selecting a requested accuracy and requesting location data representing measurements with an accuracy at least equal to or more accurate than the requested accuracy.

Obtaining first and/or second location data at a requested accuracy may comprise iteratively requesting location data from one or more location data sources or iteratively performing location measurements until the returned location data is equal to or more accurate than the requested accuracy.

Obtaining first and/or second location data at a requested accuracy may comprise switching from a first sensor or source of location data offering location measurements at a first accuracy or first average accuracy to a second sensor or source of location data offering location data at a second accuracy or second average accuracy.

The requested accuracy may be in dependence on the user being substantially stationary or being substantially moving. The requested accuracy may be greater when a user is substantially stationary than when the user is substantially moving.
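Iteratively requesting location data and switching between sources until a requested accuracy is met may be sketched as follows. Accuracy is assumed to be expressed as an error radius in metres, so that a smaller value is more accurate; the target values, the ordering of the sources and the retry limit are hypothetical.

```python
from typing import Callable, List, Optional, Tuple

# (latitude, longitude, accuracy in metres; a smaller value is more accurate)
Reading = Tuple[float, float, float]


def requested_accuracy_m(state: str) -> float:
    """Hypothetical targets: a tighter accuracy is requested while stationary."""
    return 20.0 if state == "stationary" else 100.0


def obtain_fix(
    sources: List[Callable[[], Reading]],
    target_m: float,
    attempts_per_source: int = 3,
) -> Optional[Reading]:
    """Query each source in turn, retrying it, until a reading meets the target."""
    for source in sources:
        for _ in range(attempts_per_source):
            reading = source()
            if reading[2] <= target_m:
                return reading
    return None  # no source delivered the requested accuracy
```

Listing a low-power source first and a GPS source last means the more costly source is only used when the cheaper one cannot meet the requested accuracy.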

The method may further comprise storing the first and second location data and/or the further data on one or more memory resources. The one or more memory resources may be provided on at least one of a memory resource of the user device or an external server.

The method may further comprise adding noise to at least part of the location data and optionally to the further data. The method may further comprise performing an encryption process on at least part of the location data and/or further data. The method may further comprise adding noise on at least part of the location data and/or further data and then performing an encryption process.

The noise applied to the data may form part of a pseudo-noise sequence based on a private key shared only with the user.

The method may comprise grouping the location data and further data into a first data group with high sensitivity and a second data group with lower sensitivity. The method may comprise encrypting the first data group using a first encryption process, and optionally encrypting the second data group using a second encryption process, wherein the second encryption process is less secure than the first encryption process.

A pseudo-noise sequence based on a private key may be applied to data of the first data group and the second data group such that noiseless data of the first data group is recoverable only with the private key but noisy data of the first data group is recoverable without the private key. A pseudo-noise sequence may be applied before a separate encryption process.
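A keyed pseudo-noise sequence of the kind described above may be sketched as follows. This is an illustration only and not a vetted security design: it assumes that each noise value is derived from the private key and a sample index using HMAC-SHA256, and the function names and the maximum offset of 0.01 degrees are hypothetical.

```python
import hashlib
import hmac
import struct
from typing import Tuple


def _noise(key: bytes, index: int, axis: bytes, max_offset_deg: float) -> float:
    """Deterministic offset within +/- max_offset_deg derived from the key and sample index."""
    digest = hmac.new(key, axis + struct.pack(">Q", index), hashlib.sha256).digest()
    unit = int.from_bytes(digest[:8], "big") / 2**64  # approximately uniform in [0, 1]
    return (2.0 * unit - 1.0) * max_offset_deg


def add_noise(key: bytes, index: int, lat: float, lon: float,
              max_offset_deg: float = 0.01) -> Tuple[float, float]:
    """Return the noisy position for the sample with the given index."""
    return (lat + _noise(key, index, b"lat", max_offset_deg),
            lon + _noise(key, index, b"lon", max_offset_deg))


def remove_noise(key: bytes, index: int, lat: float, lon: float,
                 max_offset_deg: float = 0.01) -> Tuple[float, float]:
    """Recover the noiseless position; this only works with the same private key."""
    return (lat - _noise(key, index, b"lat", max_offset_deg),
            lon - _noise(key, index, b"lon", max_offset_deg))
```

A holder of the noisy data without the key sees only the approximate position, whereas the holder of the key can regenerate the same sequence and subtract it.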

According to a second aspect there is provided a method of anonymizing personal data comprising:

processing location data and associated contextual data of at least two users to determine a set of most frequently visited locations for each user of the at least two users;

performing one or more random swaps of locations between the sets of most frequently visited locations.
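The two steps of the second aspect may be sketched as below. Locations are represented by hypothetical string labels, and the number of locations kept per user and the number of swaps are illustrative parameters.

```python
import random
from collections import Counter
from typing import Dict, List, Optional


def top_locations(visits: List[str], n: int) -> List[str]:
    """Return the n most frequently visited locations for one user."""
    return [place for place, _ in Counter(visits).most_common(n)]


def swap_locations(
    top: Dict[str, List[str]],
    swaps: int,
    rng: Optional[random.Random] = None,
) -> Dict[str, List[str]]:
    """Randomly swap locations between users' sets; the input is not modified.

    Assumes at least two users, each with a non-empty set of locations.
    """
    rng = rng or random.Random()
    result = {user: list(places) for user, places in top.items()}
    users = list(result)
    for _ in range(swaps):
        a, b = rng.sample(users, 2)            # two distinct users
        i = rng.randrange(len(result[a]))      # a random location of each user
        j = rng.randrange(len(result[b]))
        result[a][i], result[b][j] = result[b][j], result[a][i]
    return result
```

The overall collection of locations is preserved by the swaps, while the association between any one user and any one location is weakened.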

According to a third aspect there is provided a method of obtaining a location of a user device using a location module of the mobile device comprising:

placing the location module into an inactive mode for a pre-determined time period;

activating the location module when an elapsed time is substantially equal to the pre-determined time period;

requesting motion data with the active location module from one or more motion sensors of the mobile device;

processing the motion data with the active location module to determine a motion state of the mobile device, wherein the motion state is either a moving state or a stationary state;

obtaining location data from one or more location data sources in response to determining the motion state is a moving state or obtaining old location data representative of a previously determined location in response to determining the motion state is a stationary state;

determining a current location using the obtained location data.

The method further comprises storing the current location. Obtaining location data from one or more location data sources further comprises performing a location measurement.
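A single wake-up of the location module in the third aspect may be sketched as follows. The motion test shown, which compares accelerometer magnitudes with 1 g, and its threshold are assumptions of this sketch; `read_motion` and `measure_location` are hypothetical callables standing in for the motion sensors and the location data sources.

```python
from typing import Callable, Optional, Sequence, Tuple

Location = Tuple[float, float]  # (latitude, longitude)


def is_moving(accel_magnitudes_g: Sequence[float], threshold_g: float = 0.05) -> bool:
    """Treat the device as moving if any sample deviates from 1 g by more than the threshold."""
    return any(abs(m - 1.0) > threshold_g for m in accel_magnitudes_g)


def wake_and_locate(
    read_motion: Callable[[], Sequence[float]],
    measure_location: Callable[[], Location],
    last_location: Optional[Location],
) -> Location:
    """Run once each time the inactive period of the location module has elapsed."""
    if last_location is None or is_moving(read_motion()):
        return measure_location()  # moving state: perform a fresh location measurement
    return last_location           # stationary state: reuse the previously determined location
```

The caller is expected to place the module in its inactive mode for the pre-determined time period, for example with a timer, before each call and to store the returned location afterwards.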

According to a fourth aspect there is provided a method comprising:

gathering user activity data associated with activity of a user about one or more locations over a time period;

retrieving contextual information associated with the one or more locations;

generating an ordered representation of user movement and/or activities over at least part of the time period based on the gathered user activity data and the retrieved contextual information.

Generating an ordered representation of user movement and/or activities may comprise processing the user activity data to establish movement elements and visit elements, wherein movement elements are representative of movement between at least two locations of the one or more locations and visit elements are representative of a user substantially remaining within a dwell area of a location of the one or more locations. The user activity data comprises user movement data representative of movement between at least two locations of the one or more locations and visit data representative of a user substantially remaining at a location of the one or more locations. Generating the ordered representation may comprise processing the movement data to generate the movement elements and processing the visit data to generate the visit elements.

The method may further comprise retrieving user activity data from one or more databases on one or more servers and/or a memory resource of a user device. The user activity may be associated with one or more sensors or data sources or location data sources of a user device.

The ordered representation may comprise an order based substantially on time ordering of the user activity data or a time ordering of the generated movement and visit elements. The contextual information may be retrieved from at least one of: an external server, a memory resource of the user device, a data capturing device of the user device, for example, a camera or a microphone.

The activity information may comprise a determined mode of transport between locations, travel time, journey start and end time.

Generating an ordered representation of user movement and/or activities may comprise processing the user activity data and the retrieved contextual information to generate an output in a natural language format.
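A time-ordered, natural language output built from visit elements and movement elements may be sketched as below. The `Visit` and `Movement` types, their fields and the sentence templates are hypothetical and are chosen only to illustrate the ordering and formatting steps.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Union


@dataclass
class Visit:
    start: datetime
    end: datetime
    place: str        # contextual information, for example a place name


@dataclass
class Movement:
    start: datetime
    end: datetime
    origin: str
    destination: str
    mode: str         # determined mode of transport


def diary(elements: List[Union[Visit, Movement]]) -> List[str]:
    """Order the elements by start time and render each as a diary entry."""
    entries = []
    for e in sorted(elements, key=lambda item: item.start):
        span = f"{e.start:%H:%M} to {e.end:%H:%M}"
        if isinstance(e, Visit):
            entries.append(f"{span}: at {e.place}.")
        else:
            entries.append(f"{span}: {e.mode} from {e.origin} to {e.destination}.")
    return entries
```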

Generating an ordered representation of user movement and/or activities may comprise processing the user activity data and the retrieved contextual information to generate an output in a geographical format.

Generating an ordered representation of user movement and/or activities may comprise processing the user activity data and the retrieved contextual information to generate an output in an album format.

Generating an ordered representation of user movement and/or activities may comprise processing the user activity data and the retrieved contextual information to generate a statistical representation of historical user activity. More recent user activity data may be presented in contrast to trends and/or statistics and/or behaviours established using older user activity.

The ordered representation may be displayed using one or more of the following views:

a) natural language output, for example, in the form of a diary, wherein user activity data and the retrieved contextual information are presented as entries of the diary;

b) as a geographical map, for example, wherein user activity data and the retrieved contextual information are presented as a map overlay;

c) in an album format, wherein user activity data and the retrieved contextual information are presented as entries of the album;

d) in a statistical format, wherein user activity data and the retrieved contextual information are presented as a statistical representation of historical user activity.

The method may further comprise selecting and/or switching between one or more different views. The views may be selectable by a user. The ordered representation may comprise visual link elements that allow a user to switch between different views or to display, at least in part, one or more different views.

The ordered representation may comprise user-editable elements.

The elements of the ordered representation may be selected based on user feedback and/or user preferences and/or determined behaviour of the user.

The ordered representation may be displayed on a display of a user device or other computing resource. The user activity data may be representative of user activity at a first spatial and/or temporal resolution and generating the ordered representation may comprise processing or re-processing the user activity data to generate entries representative of user activity at a second spatial and/or temporal resolution. The method may further comprise presenting the ordered representation at the second spatial and/or temporal resolution. The method may further comprise selecting the second spatial and/or temporal resolution.

According to a fifth aspect there is provided a method of organizing data comprising:

retrieving data associated with user activity about one or more locations, wherein the data comprises location data and further contextual data;

determining a first set of location data is representative of a journey of a user and grouping said first set of location data together with related contextual data into a first group;

determining a second set of location data is representative of a user substantially remaining at one or more locations and grouping said second set of location data together with related contextual data into a second group;

storing first and second groups in a database, thereby to provide filtering and/or retrieval of data based on their grouping.

According to a sixth aspect there is provided a method of searching comprising:

performing a first search;

creating a search query for a second search based on at least the results of the first search,

wherein one of the first search and second search is a search performed on a store of collected personal data and the other of the first search and second search is a search performed on at least one store of publicly available data.

Creating a search query for a second search may be based on the results of the first search. The other of the first search and second search may be a search performed on a store of publicly available data.

The store of collected personal data may be a database as provided by other aspects of the invention. The second search of publicly available data may be an internet or web search.

The search query may comprise keywords used and/or based on the search query for the first search and further keywords derived from one or more of the results of the first search.

The search query may be based on the results of the first search and/or keywords derived from the results of the first search.

The first search may be the search performed on the store of collected personal data and the second search may be the search performed on the at least one store of publicly available data.

The first search may return personal data and/or associated metadata and/or associated contextual data stored in the store of collected personal data. The search query for the second search may be created using at least part of the returned personal data and/or metadata and/or contextual data, the second search query thereby providing a personalised search of the store of publicly available data.
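The creation of a personalised second search query from the results of a first search on personal data may be sketched as follows. The record layout, the substring match used for the first search and the list of fields regarded as suitable for inclusion in the public query are all hypothetical.

```python
from typing import Dict, List

Record = Dict[str, str]  # hypothetical layout of one stored personal record

# Fields assumed, for this sketch, to be suitable for inclusion in a public query.
PUBLIC_FIELDS = ("place", "activity", "date")


def search_personal(store: List[Record], term: str) -> List[Record]:
    """First search: case-insensitive substring match over the personal store."""
    term = term.lower()
    return [r for r in store if any(term in v.lower() for v in r.values())]


def build_public_query(user_terms: str, results: List[Record]) -> str:
    """Second search query: the user's keywords plus keywords derived from the results."""
    keywords: List[str] = user_terms.split()
    for record in results:
        for field in PUBLIC_FIELDS:
            value = record.get(field)
            if value and value not in keywords:
                keywords.append(value)
    return " ".join(keywords)
```

Only the selected fields of the returned records reach the second query, so free-text notes held in the personal store are not passed to the public search.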

The personal data may be accessible only by a user. The personal data may relate to a user’s activity history. The search query for the second search may be further based on user input used for performing the first search, for example, wherein the user input comprises at least one of: a search query used for the first search and/or selection of one or more user input parameters.

The search query may comprise at least one of:

a first search query;

a search term representing a time, a location, a context and/or a journey.

The one or more user input parameters may comprise parameters representative of a selection of a user profile and/or selection of a user intent.

The method may further comprise: processing the results returned by the first search to determine at least part of the search query for the second search.

The method may further comprise:

identifying shared and/or overlapping information associated with at least two results from the first search and using at least part of the identified shared and/or overlapping information as part of the search query for the second search.
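Identifying information shared by at least two results of the first search may be sketched as below. The field names and the rule that a value counts as shared when it appears in two or more results are assumptions of this sketch.

```python
from collections import Counter
from typing import Dict, List, Sequence


def shared_terms(
    results: List[Dict[str, str]],
    fields: Sequence[str] = ("date", "place", "activity"),
) -> List[str]:
    """Return the values that appear in the same field of at least two results."""
    counts = Counter((f, r[f]) for r in results for f in fields if f in r)
    return [value for (_, value), n in counts.items() if n >= 2]
```

The returned values can then be used as part of the search query for the second search.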

The shared and/or overlapping information may comprise one or more of: times, dates, locations, contextual data, activity type. At least part of the store of collected data may be encrypted in accordance with a searchable encryption scheme such that the search of the store of collected personal data comprises a search of encrypted data.

The method may further comprise performing a decryption process on the results of the search of encrypted data and creating the search query using part of the decrypted search results.

The searchable encryption scheme may comprise a homomorphic encryption scheme. A first set of data in the store of personal data may be encrypted in accordance with a first encryption scheme, and a second set of data in the store of personal data may be encrypted in accordance with a second encryption scheme. The second encryption scheme may be less secure than the first encryption scheme. The first encryption scheme may be applied to images and/or video data and the second encryption scheme may be applied to location and/or indexing data. The second encryption scheme may be a homomorphic encryption scheme.

The created search query may be created such that it is non-attributable to the user.

The created search query may comprise substantially no sensitive data. The created search query may comprise only publicly available information. The created search query may comprise general and/or public information.

The method may further comprise obtaining and/or monitoring user selection data that is representative of a selection of one or more results of the second search by a user. The selection of results may be indicative of relevance to a user.

Performing the first search may comprise obtaining a first set of search results ordered using a first ranking model. Performing the second search may comprise obtaining a second set of search results ordered using a second ranking model.

The method may comprise ordering at least one of the first set of search results and/or the second set of search results.

The method may further comprise determining the model parameters for the first and/or second ranking model based on user selection data.

The method may further comprise monitoring and/or collecting user selection data representative of a selection of search results by one or more users and using said user selection data to update model parameters of one or more ranking models associated with the first and/or second searches. The user selection data may comprise at least one of: search result selection data representative of a user selecting a particular search result; search result visiting data representative of the time spent by a user browsing the search result.

The method may comprise providing collected user selection data as training data for determining model parameters of at least one of the first and second ranking models.

The method may comprise training the first of the ranking models using a first set of user selection training data and then training the second of the ranking models using a second set of user selection training data.

The method may comprise determining model parameters for one or more ranking models for each user and storing the determined model parameters for each user.
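Updating the parameters of a ranking model from user selection data may be illustrated with a minimal sketch. It assumes a linear ranking model over named result features and a pairwise perceptron-style update, neither of which is specified in the description; the feature names and the learning rate are hypothetical.

```python
from typing import Dict, List

Features = Dict[str, float]


def score(weights: Dict[str, float], features: Features) -> float:
    """Linear ranking model: weighted sum of the result's features."""
    return sum(weights.get(name, 0.0) * value for name, value in features.items())


def rank(weights: Dict[str, float], results: Dict[str, Features]) -> List[str]:
    """Order result identifiers from highest to lowest score."""
    return sorted(results, key=lambda rid: score(weights, results[rid]), reverse=True)


def update(weights: Dict[str, float], clicked: Features, skipped: Features,
           rate: float = 0.1) -> Dict[str, float]:
    """If the selected result does not outscore the skipped one, move the weights towards it."""
    new = dict(weights)
    if score(new, clicked) <= score(new, skipped):
        for name in set(clicked) | set(skipped):
            new[name] = new.get(name, 0.0) + rate * (clicked.get(name, 0.0) - skipped.get(name, 0.0))
    return new
```

Storing one weights dictionary per user, for example keyed by a user identifier, gives per-user model parameters of the kind described above.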

The method may further comprise performing the second search, wherein performing the second search comprises:

performing a search with each of a plurality of stores of publicly available data using the created search query and/or a further search query based on the search query;

obtaining a plurality of sets of secondary search results, wherein each set of secondary search results corresponds to a search performed on a respective store of publicly available data, and

performing a merging process on the received plurality of sets of search results thereby to produce a merged set of search results.

Creating the search query may comprise adapting the search query for the store of publicly available data to be searched.

The search query may comprise a Boolean search string.
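A Boolean search string may be assembled from keywords as sketched below. The `AND`, `OR` and `NOT` operators and the quoting of multi-word terms are assumptions of this sketch; the syntax accepted by a particular store of publicly available data may differ.

```python
from typing import Sequence


def boolean_query(all_of: Sequence[str], any_of: Sequence[str] = (),
                  none_of: Sequence[str] = ()) -> str:
    """Build a Boolean search string from required, optional and excluded terms."""
    def quote(term: str) -> str:
        # Multi-word terms are quoted so that they are treated as phrases.
        return f'"{term}"' if " " in term else term

    parts = [" AND ".join(quote(t) for t in all_of)]
    if any_of:
        parts.append("(" + " OR ".join(quote(t) for t in any_of) + ")")
    query = " AND ".join(p for p in parts if p)
    for term in none_of:
        query += f" NOT {quote(term)}"
    return query.strip()
```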

The search query may comprise a plurality of adapted search queries, each adapted search query being adapted for the corresponding store of publicly available data to be searched. The method may further comprise ordering the merged set of search results in accordance with a ranking model and updating the ranking model based on user selection data.

Performing the merging process may comprise at least one of:

performing a combining process on the plurality of sets of search results;

identifying identical, substantially similar and/or duplicated results and removing said identical, substantially similar and/or duplicated results;

filtering search results.
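The combining, de-duplication and filtering steps of the merging process may be sketched as follows. Results are assumed to be dictionaries with hypothetical `url` and `title` keys, duplicates are detected by a simple normalisation of the URL, and the sets are combined by concatenation in the order received; each of these is an assumption of this sketch.

```python
from typing import Callable, Dict, Iterable, List, Optional
from urllib.parse import urlsplit

Result = Dict[str, str]  # hypothetical layout: {"url": ..., "title": ...}


def normalise(url: str) -> str:
    """Reduce a URL to host and path so that trivially different copies compare equal."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    return host + parts.path.rstrip("/")


def merge(result_sets: Iterable[List[Result]],
          keep: Optional[Callable[[Result], bool]] = None) -> List[Result]:
    """Combine the result sets, drop duplicates and apply an optional filter."""
    merged: List[Result] = []
    seen = set()
    for results in result_sets:
        for result in results:
            key = normalise(result["url"])
            if key in seen:
                continue          # duplicate of an earlier result
            if keep is not None and not keep(result):
                continue          # removed by the filter
            seen.add(key)
            merged.append(result)
    return merged
```

The merged list could subsequently be re-ordered in accordance with a ranking model.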

The one or more stores of the publicly available data may comprise a plurality of search engines accessed via the internet.

The at least one store of collected personal data may be a database as provided by other aspects of the invention. The second search of publicly available data may be an internet or web search. The second search of publicly available data may be a search of any connected database or service that can be queried.

The first search and/or second search may be based on a selection and/or determination of a user profile and/or user intent such that the results returned by the first search and/or the second search are dependent on the selection and/or determination of user profile and/or user intent.

A part of the store of personal data that is searched may be selected based on the user profile and/or user intent.

User profile may comprise one or more of: a business profile, a home profile, a personal profile. User intent may comprise one or more of: research, leisure, browsing, shopping.

According to a seventh aspect, there is provided a system comprising:

a computing resource comprising a system and a memory resource in communication with a store of publicly available data and a store of collected personal data; wherein the computing resource is configured to perform a first search and to create a search query for a second search based at least on the results of the first search, wherein one of the first search and second search is a search performed on a store of collected personal data and the other of the first search and second search is a search performed on at least one store of publicly available data.

The system may further comprise a user device, for example, a mobile device, comprising a user input device and a display for at least one of:

receiving user input via the user input device;

displaying results of the second search, optionally, receiving user input in response to displaying results of the second search.

According to an eighth aspect, there is provided a computer program product comprising a computer readable medium storing instructions that are executable to perform a method comprising: performing a first search; creating a search query for a second search based at least on the results of the first search, wherein one of the first search and second search is a search performed on a store of collected personal data and the other of the first search and second search is a search performed on at least one store of publicly available data.

According to a ninth aspect there is provided a system configured to perform any of the above methods comprising:

a user device, for example, a mobile device, comprising a processor and a memory resource;

a computing resource comprising a system and a memory resource;

one or more external servers for storing one or more databases.

The system may comprise one or more databases configured for communication with a server, wherein the one or more databases are configured to store different types of data, wherein the different types of data are representative of activity type and/or a moving user and/or a stationary user.

According to a tenth aspect there is provided a computer program product configured to perform any of the above methods. Features in one aspect may be applied as features in any other aspect in any appropriate combination. For example, any one of apparatus, system, method and computer program product features may be applied as any other of apparatus, system, method and computer program product features.

Brief Description of the Drawings

Various embodiments will now be described by way of example only, and with reference to the accompanying drawings, of which:

Figure 1 is a schematic diagram of a user device, a system and external database according to an embodiment;

Figure 2 is a schematic diagram of a user device, a system and a browser according to an embodiment;

Figure 3 is an illustration of clustering of location data according to an embodiment;

Figure 4 is a flowchart showing a method of collecting location data according to an embodiment;

Figure 5 is a view of a time line of user activity according to an embodiment;

Figure 6 is a view of an album of user activity according to an embodiment;

Figure 7 shows an example of collecting news stories relevant to a user according to an embodiment, and

Figure 8 is a schematic diagram of a method of performing a search.

Detailed Description of the Drawings

Figure 1 is a schematic diagram showing a system 10 for gathering and storing information, for example, personal data associated with a user. The system 10 is connected with a device, for example, a mobile device 12 and one or more servers 14 configured to host one or more external databases. The system 10 may also be referred to as a personal context management system or PCMS. The system 10 may be provided on any suitable computer resource. The user device 12 has a processing resource and a memory resource. The system 10 has a processing resource and a memory resource. Access to services based on system 10 is provided through, for example, a host device 16. Host device 16 may run a personal web browser and/or provide access to data stored on system 10. Although shown as two separate devices, in some embodiments, host device 16 and user device 12 may be the same device. The mobile device 12 is connected to the system 10 using a wireless network, for example, using a mobile phone network or a wireless local area network. The connection between the mobile device 12 and the system 10 is enabled by a transmitter and/or receiver of the mobile device 12 configured to connect wirelessly.

Data transfer between system 10 and user device 12 and between system 10 and external database may be carried out in accordance with an application programming interface (API).

Mobile device 12 is configured to provide data representative of a location of the mobile device 12, also referred to as location data, to the system 10. The mobile device 12 has one or more sensors or devices configured to produce location data. These are also referred to as location or context data sources. The one or more sources may include: a GPS receiver, an assisted GPS receiver, an accelerometer, a compass.

As a further example, the one or more location data sources may include an RF receiver of the mobile phone 12 that is used to receive network traffic signals, for example, data, calls or SMS. Associated circuitry or processing resource may be provided to process this received data to produce location data. For example, at least one attribute of a received mobile network signal, for example, signal strength may be processed to determine location data. Other suitable network based techniques can be used to produce location data, for example, network triangulation, WiFi and Bluetooth signal.

Alternative sources of location data may be provided as part of the mobile device 12 or associated with the mobile device. Location data can be obtained using WiFi, Bluetooth, RFID, NFC, fixed IP address and other RF networks. For example, a user device may have different processes running on its processing resource that also involve retrieving location data using one or more sensors or alternative location data sources. The processor of the user device may obtain information about at least one of these processes, including recently obtained location data. The user device may be in communication with an operating system to establish a list of processes capable of providing location data. Location data may be retrieved from one or more of the location data sources by the system 10. A request is transmitted from the system 10 to the mobile device 12. Mobile device 12 receives the request for location data. In response, location data is transmitted from mobile device 12 to the system 10. Location data may be requested at periodic intervals. For example, the system 10 may transmit a request for location data from one of the one or more sources every 2 minutes. The frequency of requesting location data can be referred to as a sampling frequency.

The sampling frequency can be selected by a user or may be selected automatically dependent on a number of factors. For example, the sampling frequency may be based on one or more user preferences, or system parameters, or a property of the user device, or a characteristic of a user associated with the user device, or previously determined activities or trends of behaviour of the user. The sampling frequency may be varied or may be held constant.

The sampling frequency may be dependent on a reported status of one or more components of the mobile device 12. For example, an indication that battery life is running low or has fallen below a threshold may result in less frequent sampling to preserve battery life.
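As a minimal sketch of battery-dependent sampling, the function below lengthens the interval between location requests when the reported battery level falls below a threshold. The function name, the 20% threshold and the fourfold multiplier are illustrative assumptions and are not taken from the description.

```python
def sampling_interval_s(base_interval_s: float,
                        battery_fraction: float,
                        low_battery_threshold: float = 0.2,   # assumed value
                        low_battery_multiplier: float = 4.0   # assumed value
                        ) -> float:
    """Return the time between location requests, in seconds.

    battery_fraction is the reported battery level in the range 0..1.
    Below the threshold the interval is stretched, so sampling is less frequent.
    """
    if not 0.0 <= battery_fraction <= 1.0:
        raise ValueError("battery_fraction must be between 0 and 1")
    if battery_fraction < low_battery_threshold:
        return base_interval_s * low_battery_multiplier
    return base_interval_s
```

With a base interval of 120 seconds (the 2 minute example above), a device reporting 10% battery would be sampled every 480 seconds under these assumed parameters.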

A request for location data may also include a request for data representative of accuracy of the location data. Alternatively, location accuracy data may be provided together with the location data without a separate request being made. The location accuracy data may be dependent on which location data source is providing the location data. For example, a GPS sensor may have an inherent precision and network-based location techniques may have a limited precision.

In some embodiments, the location requests and location data collection are performed by the processing resource provided on the mobile device 12.

The connections between the system 10 and the one or more servers 14 hosting external databases, may be either a wired or wireless connection. In some embodiments, the device is a computing device connected to a wired network, and the device communicates with system 10 using the wired network. The external databases of the servers 14 can provide further data to the system 10. The further data includes contextual data related to one or more locations visited by a user. The further databases may be third party databases that host third party data, for example, news and/or photos.

Figure 2 shows a schematic diagram illustrating further embodiments of the system 10. In addition to the elements shown in Figure 1, Figure 2 shows additional elements that may be included. In particular, more data sources are shown. The one or more data sources include social media, location data, calendar and contact data, health data, smart home, smart devices, smart car, personal notes data, internet history, messages and emails, banking and insurance data, activities and calls. Also shown are external information requests provided via the General Data Protection Regulation (GDPR), third-party data (for example, news, weather and points of interest (POIs)) and other personal or shared devices.

Figure 2 also shows further detail of an embodiment of system 10. System 10 has two elements: an identity wallet configured to contain tokens on a blockchain and a distributed encrypted personal data store (PDS) that allows a user to store and manage their personal data. Figure 2 also shows further detail of the browser. The browser provides a user with access to personal data and an interface to a data marketplace.

Tokens collected in a user’s wallet can be exchanged for goods and services in the marketplace. Marketplace activity can include buying and selling of tokens. Tokens can be earned in exchange for a user providing resources for the purposes of storage, processing and blockchain mining, and by selling data. Tokens may be used to pay for services including storage, services and applications, for example, decentralized application services. Tokens can also be sold, and data can be bought in exchange for tokens. All data access and token transfers are authenticated and periodically recorded in a distributed ledger.

The wallet also contains sensitive personal information that is encrypted and may include, for example: name, gender, date of birth, identity proofs, address, email, bank details and encryption keys/hashes.

Figure 3 illustrates using retrieved location data to determine types of user activity associated with movement of the mobile device 12. Following a determination of type of user activity, other data can be categorised under the determined type of user activity. Location data can be used to categorise user activity as a Journey type or a Stay type. Once categorised, other data can be added to the journey or stay.

A Stay corresponds to a user being present in substantially the same area for a period of time. For example, during the day, if a user is present at their place of work, this will be registered as a Stay. A Journey corresponds to a user travelling from one location to another location, or from one Stay to another Stay. Journeys occur between adjacent Stays in different locations.

Classification of Stays is performed based on a Stay resolution. The Stay resolution is defined using parameters that represent a Stay area and a Stay time. The Stay area defines a boundary within which a user must remain; once the user moves beyond it, the activity is treated as either a Journey or a separate Stay. The Stay area may be represented by any suitable parameters that allow an area to be defined. For example, if the Stay area is a circle, the Stay area may be represented using a radius. The radius is an example of a permitted distance a user may move from the centre of the circle before the user leaves the Stay area. The Stay time, which may also be referred to as a dwell time, defines how long a user must remain in the Stay area before the activity is classified as a Stay.

An instance of a Stay can have attributes associated or attached with it. The attributes may include a start and/or end time for the instance and/or a time duration of the instance. The area of a specific Stay instance may be determined. This area defines the boundary of the Stay instance. The area of the Stay instance may be determined using the Stay resolution or representative parameters and other location data. For example, the area spanned by a Stay instance can be determined using a central location point and the Stay area. For example, if the Stay resolution defines all Stays as circular with a pre-selected radius, then a Stay instance is a circle centred on a central location point having a radius equal to the pre-selected radius.
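The circular Stay area described above can be sketched as follows. This is an illustrative example only: the `StayResolution` class, the function names and the use of the haversine great-circle formula to measure distance between latitude/longitude pairs are assumptions, not details given in the description.

```python
import math
from dataclasses import dataclass

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius, used by the haversine formula


@dataclass(frozen=True)
class StayResolution:
    radius_m: float      # Stay area, expressed as the radius of a circle
    dwell_time_s: float  # Stay time (dwell time)


def distance_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle (haversine) distance in metres between two points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def in_stay_area(centre: tuple, point: tuple, resolution: StayResolution) -> bool:
    """True if point (lat, lon) lies within the circular Stay area around centre."""
    return distance_m(centre[0], centre[1], point[0], point[1]) <= resolution.radius_m
```

For a resolution of 100 m, a point roughly 56 m north of the central location point is inside the Stay instance, while a point roughly 220 m north is outside it.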

As a first example, a first instance of a Stay, also referred to simply as a first Stay, is established. If a user travels outside the boundary of the first Stay, then a new Stay will be established once the user has remained within a second area for at least the defined dwell time. After staying in the second area, a first and second Stay are established and a Journey between the first and second Stays can be determined.

The time spent between two adjacent Stays, which are referred to as a start Stay and an end Stay, is defined as a Journey. Like a Stay, a Journey can have several attributes assigned to it. These include a departure location, a departure time, an arrival location and an arrival time.

The start Stay corresponds to a starting point of a Journey. The departure location and departure time for the Journey may be determined using attributes of the start Stay. For example, the departure location of the Journey could be the central location of the start Stay, and the departure time could be the end time of the start Stay.

The arrival location and arrival time for the Journey may be determined using attributes of the end Stay, if Journeys are established following determination of Stays. For example, the arrival location of the Journey could be the central location of the end Stay, and the arrival time could be the start time of the end Stay. Alternatively, if a Journey is established before the end Stay is determined, then attributes of the end Stay may be determined using attributes of the Journey.

The other attributes that can be attached to a Journey are travelling time, average speed and mode of transport. Mode of transport may be determined using further data. Journey data can also be combined with map data to map the Journey to an established route, for example, to follow a road or a railway line. This may be performed after a few location points are determined together with the travel method. Some sources of location data or other movement sensors of the device 12 can be used to provide movement data to determine mode of transport. For example, at least one of the following may be used: an accelerometer, an electronic compass, a gyroscope, a pressure sensor (for altitude change) or a pedometer. Mode of transport can include any suitable mode of moving including: by road, by rail, by sea, by air, any vehicle, car, train, ship, plane, walking, running, cycling, skiing.

In some embodiments, Stay resolution may be a pre-selected size for all Stay instances. In other embodiments, Stay resolution parameters including radius and dwell time can be adaptive. This provides zooming functionality that allows a user to zoom in and zoom out of collected data, including not just location data but attached contextual data. In other embodiments, Stay resolution parameters may be varied in real-time if required.

The stay resolution parameters may be adapted in dependence on determining whether or not the user is moving or stationary or based on previously collected behaviour, patterns and/or trends of the user.

Small radius Stays can be merged into larger radius ones and the same is true of Stay times. For example, if a radius of 50 m is instead expanded to 1 km, data relating to Stays and Journeys within individual buildings on a university campus might collapse to a single Stay at the university campus.

Figure 3 shows an example of zooming and clustering of user activity. Figure 3 shows six determined Stays numbered 1 to 6, as resolved at a first Stay resolution and two determined Stays (A and B), as resolved at a second Stay resolution. Five journeys are established between the Stays 1 to 6. Zooming is achieved by changing the Stay resolution, in this case, by selecting a different, larger, Stay area. In particular, the Stay area is increased by selecting a larger radius. A larger Stay area is chosen, such that Stays 1 to 4 are merged or collapsed into one larger Stay A. Similarly, Stays 5 and 6 are collapsed into one larger Stay B. The five Journeys are replaced by a single Journey between Stay A and Stay B. In some embodiments, zooming by changing Stay area and/or Stay time is possible.
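The collapse of Stays 1 to 6 into Stays A and B can be illustrated with the sketch below. It makes several simplifying assumptions that are not part of the description: positions are given in local planar coordinates in metres, Stays are merged greedily in time order, and a merged Stay keeps the central location point of its first member. The `Stay` class and `zoom_out` function are hypothetical names.

```python
import math
from dataclasses import dataclass


@dataclass
class Stay:
    x_m: float      # central location point, planar metres (assumed projection)
    y_m: float
    start_s: float  # start time of the Stay instance
    end_s: float    # end time of the Stay instance


def zoom_out(stays: list, radius_m: float) -> list:
    """Re-resolve Stays at a larger Stay area by merging time-adjacent neighbours."""
    merged = []
    for stay in sorted(stays, key=lambda s: s.start_s):
        if merged:
            last = merged[-1]
            if math.hypot(stay.x_m - last.x_m, stay.y_m - last.y_m) <= radius_m:
                # The Stay falls inside the enlarged area: extend the merged Stay.
                last.end_s = max(last.end_s, stay.end_s)
                continue
        # Copy so that the input list is left unchanged.
        merged.append(Stay(stay.x_m, stay.y_m, stay.start_s, stay.end_s))
    return merged


def journey_count(stays: list) -> int:
    """One Journey is assumed between each pair of adjacent Stays."""
    return max(len(stays) - 1, 0)
```

Applied to six Stays arranged as two groups several kilometres apart, a 50 m radius keeps six Stays and five Journeys, whereas a 1 km radius yields two Stays joined by a single Journey.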

Another aspect of Stays is that patterns of a user tend to emerge and trends can be detected. For example, often the same Stay will keep reoccurring at similar times or on certain days and these can be tagged as “Places”, which tend to be given names such as “Work” or “Home”. Based on previously collected data, newly collected location data can be processed and categorised without any user input.

Location data is processed with processing algorithms. These algorithms may accurately identify Stays and Journeys while minimising battery drain.

First it is considered how a Stay is determined. As described above, each Stay instance is determined in accordance with a Stay resolution including a dwell time and radius. For example, the radius may be set to 100 metres and the dwell time may be set to 3 minutes. If a user stays within a circle defined by the radius of 100m for at least 3 minutes then a Stay is established. If a user moves beyond the boundary of the circle then the user is considered to have left the Stay. Alternatively, if the user does not stay within the circle for at least 3 minutes, then the movement is considered as part of the previous Journey.
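The Stay test described above can be sketched as follows. The sketch assumes that location points are time-ordered and already projected to local planar coordinates in metres, and that the first point of a cluster serves as the central location point; neither assumption comes from the description, and `detect_stays` is a hypothetical name.

```python
import math


def detect_stays(points, radius_m=100.0, dwell_s=180.0):
    """Find Stays in time-ordered (t_s, x_m, y_m) location points.

    A Stay is reported as (x_m, y_m, start_s, end_s) when consecutive points
    remain within radius_m of the cluster's first point for at least dwell_s.
    Points that never satisfy the dwell time are treated as part of a Journey.
    """
    stays = []
    i = 0
    while i < len(points):
        t0, x0, y0 = points[i]
        j = i
        # Extend the cluster while the next point is still inside the circle.
        while (j + 1 < len(points)
               and math.hypot(points[j + 1][1] - x0, points[j + 1][2] - y0) <= radius_m):
            j += 1
        t_end = points[j][0]
        if t_end - t0 >= dwell_s:
            stays.append((x0, y0, t0, t_end))
        i = j + 1
    return stays
```

With the example values of a 100 m radius and a 3 minute dwell time, a sequence of points sampled every 2 minutes produces a Stay only where at least three consecutive samples fall inside the same circle.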

As described above, an application running on a processing resource of device 12 makes periodic requests for location data. For the following example, location data is requested every 2 minutes. A location request can also have an associated accuracy requirement which may be set by a user. In this example, the accuracy is set to 50 metres. The location request can therefore request more accurate location data from the device 12.

In a first example, when processing starts, it is assumed that the user is in a Stay. However, subsequent processing may indicate otherwise, for example, it may indicate that the user is taking part in a Journey. Each location data point collected may be clustered. Geometrical methods are used to determine if the cluster is contained within the Stay radius. For all the points that are inside the Stay area, it is considered that the user is in the Stay.

There are two possible explanations when a location point indicates that a user has left a Stay. Firstly, the user is moving and has vacated the Stay at the beginning of a Journey. Secondly, the new location point lies outside the Stay area, suggesting that a Journey is starting, but this is due to inaccuracies in the location data.

Thus it may be required to further process the raw location data to determine which of the possible conditions is correct. Outliers can be determined by requesting and evaluating location accuracy. The returned location accuracy may be representative of an estimated location accuracy. If the location accuracy reported is good, for example, the location accuracy indicates that the location data point is accurate to within a certain area, and substantially all of this area is outside the Stay area, then it is determined that the location data point is outside the Stay area. If the location accuracy area overlaps with the Stay area, then a more accurate location can be requested. A more accurate location can be repeatedly requested until it is determined whether or not the location is an outlier.
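The accuracy test above can be sketched by treating the reported accuracy as the radius of a circle around the reported point and comparing that circle with the Stay area. The planar coordinates, the function name and the returned labels are assumptions for illustration, and the sketch requires the accuracy circle to be wholly (rather than substantially) outside the Stay area before declaring the point outside.

```python
import math


def classify_point(stay_centre, stay_radius_m, point, accuracy_m):
    """Classify a reported location against a circular Stay area.

    stay_centre and point are (x_m, y_m) pairs in planar metres (assumed).
    Returns "outside" if the accuracy circle lies wholly outside the Stay area,
    "inside" if it lies wholly inside, and "refine" if the circles overlap,
    in which case a more accurate location would be requested.
    """
    d = math.hypot(point[0] - stay_centre[0], point[1] - stay_centre[1])
    if d - accuracy_m > stay_radius_m:
        return "outside"
    if d + accuracy_m <= stay_radius_m:
        return "inside"
    return "refine"
```

A caller could repeat the location request with a tighter accuracy requirement for as long as the result is "refine".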

The location continues to be monitored and if the monitored location returns to within the radius, then the previous one or more outlier location points that were outside the Stay area can be filtered and/or removed.

The accuracy requested for the location can be relaxed, once it is established that a user remains inside a Stay or using previously determined trends.

Other filters can be applied to the data to handle outliers. For example, location data outliers sometimes cause the appearance of Journeys at impossible speeds (such as several hundred miles per hour); a threshold can therefore be applied to remove data representing such a Journey.
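A speed threshold of this kind might look like the sketch below. The threshold of 150 m/s (roughly 335 miles per hour) is an arbitrary illustrative value, the first point is assumed to be trustworthy, and positions are assumed to be planar coordinates in metres.

```python
import math


def drop_impossible_points(points, max_speed_mps=150.0):
    """Remove (t_s, x_m, y_m) points that imply an implausible speed.

    Each point is compared with the last point that was kept; points with a
    non-increasing timestamp are also dropped.
    """
    kept = []
    for t, x, y in points:
        if kept:
            prev_t, prev_x, prev_y = kept[-1]
            dt = t - prev_t
            if dt <= 0 or math.hypot(x - prev_x, y - prev_y) / dt > max_speed_mps:
                continue  # outlier: would require travelling faster than the threshold
        kept.append((t, x, y))
    return kept
```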

A Journey may be established in several different ways. For example, stay resolution parameters may be used. For example, once the location data indicates that a user is at least twice the stay radius from the central location point of the Stay and after twice the Stay time, then the end of the Stay and start of the Journey can be confirmed. The multiple does not need to be two, and can be any chosen multiple.
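The multiple-based confirmation can be expressed as a simple predicate. In this sketch the elapsed time is interpreted as the time since the first location point outside the Stay area, which is an assumption; the description does not state the reference point for the time measurement.

```python
def journey_confirmed(distance_from_centre_m: float,
                      time_since_leaving_s: float,
                      stay_radius_m: float,
                      stay_time_s: float,
                      multiple: float = 2.0) -> bool:
    """True once the user is at least `multiple` Stay radii from the central
    location point and at least `multiple` Stay times have elapsed."""
    return (distance_from_centre_m >= multiple * stay_radius_m
            and time_since_leaving_s >= multiple * stay_time_s)
```

For a 100 m radius and a 3 minute Stay time, the default multiple confirms the Journey once the user is at least 200 m away and 6 minutes have elapsed.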

Once a Journey is established, a most probable time of departure from the Stay can be estimated to the nearest minute.

Once a Journey is established, regular location updates are obtained. This may be at the same rate as during the Stay or at a different rate. Accurate location data may not be as important for a Journey. Therefore, the accuracy of the location data required can be reduced, which may have benefits in battery usage. A new Stay is then established when the obtained location data points again cluster within a Stay area for longer than the Stay time, at which point the next Stay can be confirmed.

Different attributes can be assigned to a Journey. One is a Journey type. The Journey type is determined by a variety of inputs. Firstly, contextual data, provided by external servers 14, relating to the start and end points can be used to determine Journey type. For example, points of interest including railway stations, airports, ferry ports etc. can be used to allow a Journey type to be determined. Secondly, a calculation of Journey speed can be used to determine the Journey type. Both of these inputs may be combined. The device 12 operating system may also provide further inputs, for example, from movement data. For example, the device may be able to determine if the mode of transport is walking, running or cycling. On their own, these further inputs are not always accurate, but combined with other inputs the correct transport mode can be determined.

Confirmations of Journey type and Journey splits can also be performed. For example, if the Journey appears to be by boat, for example, the Stays were both at a port or riverside and the speed was consistent with a boat journey, then this can be confirmed using waypoints in the location data of the Journey, for example, by checking whether the waypoints are on water. Similar checks can be made for other modes of travel, for example, rail and air travel.

A weighted probability using the above inputs can be calculated to determine a most likely transport mode. The most likely transport mode can then be selected and assigned to the Journey. A measure of likelihood or probability of a first mode of transport may be calculated and compared to a measure of likelihood or probability of a second mode of transport.
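One way such a weighted combination could be computed is sketched below. The input names, the per-mode scores and the weights are all hypothetical; the description does not specify how the inputs are scored or weighted.

```python
def most_likely_mode(input_scores: dict, weights: dict):
    """Combine per-input mode scores into normalised probabilities.

    input_scores maps an input name to {mode: score in 0..1};
    weights maps the same input names to non-negative weights.
    Returns (best_mode, {mode: probability}).
    """
    totals = {}
    for name, scores in input_scores.items():
        w = weights.get(name, 0.0)
        for mode, score in scores.items():
            totals[mode] = totals.get(mode, 0.0) + w * score
    norm = sum(totals.values())
    if norm <= 0:
        raise ValueError("no positive evidence for any mode")
    probabilities = {mode: value / norm for mode, value in totals.items()}
    best = max(probabilities, key=probabilities.get)
    return best, probabilities
```

The returned probabilities also allow the likelihood of a first mode of transport to be compared directly with that of a second mode.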

Other processing algorithms are implemented that may improve battery usage and prolong battery life. During an established Stay, location requests may be turned off after a fixed period of time, for example, 30 minutes. The application may also be placed into the background of the device operating system tasks or placed into a sleep mode. The application and/or location requests can be reactivated using, for example on a device running iOS: a significant location change, a notification or a user restarting the application. Other operating systems, for example, Android, may have different but similar techniques for restarting an application.

The fixed period of time can be selected depending on past user behaviour. For example, a probability that the user has entered a Stay and will remain in the Stay for a long time can be calculated based on previous behaviour or trends and/or by looking at phone activity. Processing of location data, optionally together with movement data, can provide an indication of how long a Stay may last. For example, if the device appears to be completely stationary, using information from the location data and/or motion data, then it is more likely that the user will continue to be completely stationary.

The application can be placed into a sleep mode if there is a good probability that the user won’t move location for a long time. This can be determined based on previous behaviour and from looking at phone activity. If the phone appears to be completely stationary (using motion sensors) then a user is more likely to continue to be stationary, i.e. the user has put their phone down, and so the application shutdown can happen faster.

Another way of predicting how long a Stay may last, is to look at past behaviour data and to match current behaviour with past behaviour. For example, past behaviour may show that a user is likely to be stationary during working hours when the location matches their place of work.

There are several techniques that can be employed to save battery during an established Journey. A Journey is detected when adjacent location points have stopped clustering or when a significant location change has just activated the application. If the user is moving relatively fast then the requested location accuracy can be reduced because the distance travelled between location updates, for example, every two minutes, is much greater than the location accuracy distance. Once the location points start to cluster then the accuracy must be increased to accurately determine the location clustering and define the next Stay location. Often, when a low accuracy location point is requested, a higher accuracy one is actually returned.

Many of the battery saving techniques are dependent on the operating system location services available and their impact on battery. Increased knowledge about where someone lives, works, typical journeys and movement trends can help improve location services and assist battery management.

Aside from using in-built operating system functions of the device, a method that may improve battery performance is to only periodically check for location when the phone is likely to have moved. This method is illustrated in the flowchart of Figure 4. Figure 4 shows a first step 102 that checks whether motion is detected. This can be derived using movement data. If motion is detected, the process continues to step 104, which requests a new location that is then recorded, at step 108, as a new Journey location. After recording a new Journey location, the process sleeps for a dwell time.

If no motion is detected at the first step, then it is determined that a Stay is occurring. The Stay location can be recorded as still at the previously measured location but at a new updated time, thus avoiding a location request. The process then places the application to sleep, at step 110, for a set amount of time. This time may be equal to the dwell time of the Stay resolution.

Following the period of sleep, the system returns to step 102 to check again to determine if any motion has been detected.
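The loop of Figure 4 can be sketched as below. To keep the example testable, the motion check, the location request and the sleep call are passed in as functions; these parameter names, the log format and the fixed number of cycles are assumptions made for illustration.

```python
def monitor(motion_detected, request_location, sleep, dwell_s, cycles):
    """Run `cycles` iterations of the motion-gated location loop.

    motion_detected() -> bool, request_location() -> location, sleep(seconds).
    Returns a log of (kind, elapsed_s, location) entries.
    """
    log = []
    last_location = None
    elapsed_s = 0.0
    for _ in range(cycles):
        if motion_detected():                    # step 102: is motion detected?
            last_location = request_location()   # step 104: request a new location
            log.append(("journey", elapsed_s, last_location))  # step 108
        else:
            # Stay: reuse the previous location with an updated time,
            # so no location request is made.
            log.append(("stay", elapsed_s, last_location))
        sleep(dwell_s)                           # step 110: sleep
        elapsed_s += dwell_s
    return log
```

In this arrangement a location request is issued only on iterations where motion has been detected.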

Using the above process, location requests are only made at periodic intervals defined by the dwell time and only during a Journey. Once the device is stationary, all location requests should stop. The accuracy of the location requests may be varied (asking for an accurate location is more power hungry). When motion is detected, only a medium location accuracy is required to help establish that the Stay has been exited. During the Journey, location requests can be low accuracy, especially if travelling fast. Once no motion has been detected during a dwell time while in a Journey state, it is likely that a Stay has been entered and so a high accuracy location will be requested so that the precise location of the new Stay can be established. No further location request will occur until motion is again detected.

As a non-limiting example, good accuracy may be less than or equal to 10 metres, medium accuracy may be more than 10 m and less than or equal to 500 m and bad accuracy may be more than 500m.
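The example bands above map directly onto a small helper. The function name is hypothetical; the 10 m and 500 m boundaries are those of the non-limiting example.

```python
def accuracy_band(accuracy_m: float) -> str:
    """Classify a reported location accuracy (in metres) as good, medium or bad."""
    if accuracy_m < 0:
        raise ValueError("accuracy must be non-negative")
    if accuracy_m <= 10:
        return "good"
    if accuracy_m <= 500:
        return "medium"
    return "bad"
```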

Returning to Figure 1, external data, for example, contextual data, can be assigned and/or stored together with the determined Stays and Journeys. The data can be obtained by system 10 from the external servers 14. Further data can also be obtained from other applications and memory resources on the device 12. The data includes social media, photos, calendar, health data, location data, internet history, messaging, activities, calls, texts, contact data, personal notes data, emails, banking data, insurance data, smart home and smart car data. Once the information is gathered into its context of Stays and Journeys within the system 10, it can be appended with the other personal data using the time and location structure, with each data item being attached to either a Stay or a Journey. The personal data can also be enriched using external data sources such as library images, points of interest, mapping etc. Although the main source of data for the PCMS is a smartphone, there is no restriction on where the data must come from.

Databases may be set up to store the above information in Stay and Journey categories.

As discussed above, it is possible to zoom between collected Stay and Journey information. The Stay information is defined by the location radius and dwell time. As an example, if there is a Stay radius of 1 m and a dwell time of 5 seconds then very small Stays are created, together with the associated Journeys joining them, e.g. (Desk 3 min) walked 3 m (Waste paper bin 10 sec) walked 3 m (Desk 10 min). This level of detail may be too much information, so zooming out is provided. For example, a first zoom to 50 m and 1 minute might reveal: (Office 45 min) drive 4.8 km (Home 10 hr 37 min). A second zoom may have parameters of 100 miles and 12 hours, which would reveal only large periods of time spent away from home, such as going on holiday or a business trip, with the smaller Stays around home and work collapsing into one big Stay potentially lasting weeks.

Thus by varying these two parameters it is possible to reveal more or less detail from the collected data. Collected data may be collected at a first resolution, for example, equal to the accuracy of a sensor and the data processed or re-processed to produce Stays and Journeys at a second resolution.

Another aspect of zooming is the level of contextual data that is added to the Stays and Journeys. When the user zooms out there might be a lot of additional data tagged to an individual Stay and so the user may wish to reduce this level of detail. One way to do this is to layer the other data by type. For example, a first layer may be social media data and a second layer may be personal images. A user may be able to select different layers to be displayed at one time.

Even then it might be too cluttered and so the data layers can also be aggregated at different zoom levels. For example if the user zooms out they may find that they have 103 social media posts against a Stay. Therefore, instead of displaying all the context data, a summary may be provided. In the above example, the summary can state that there are 103 posts.
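Layer aggregation of this kind could be sketched as follows. The threshold of ten items per layer and the summary wording are assumptions; the description only states that a summary replaces the full context data when there is too much to display.

```python
from collections import Counter


def summarise_layers(items, max_items_per_layer=10):
    """Build a per-layer view of the context data attached to one Stay.

    items is a list of (layer, payload) pairs. Layers with more than
    max_items_per_layer entries are replaced by a short count summary.
    """
    counts = Counter(layer for layer, _ in items)
    view = {}
    for layer, n in counts.items():
        if n > max_items_per_layer:
            view[layer] = f"{n} items"
        else:
            view[layer] = [payload for item_layer, payload in items if item_layer == layer]
    return view
```

For the example above, a Stay with 103 social media posts would show only a count for that layer, while a layer holding two photos would still list them individually.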

In a map view it might be logical to automatically adjust the Stay radius with the zoom level. This ability to zoom can be compared to, for example, Google Earth, where a user may zoom in and out on the world and it can be decided what details are to be presented at any time by turning different layers, such as place names, on and off. The layers and levels of detail collapse into summary data or disappear as a user zooms out. In such an example, more detail would emerge as a user zooms in.

As discussed above communication may be carried out in accordance with an API. The API comprises security features for permissions. Blockchain technologies may be used to implement the API and/or access. This is illustrated in Figure 2, where an identity wallet is provided that stores tokens with transactions recorded using a blockchain ledger.

A technique for data aggregation is also implemented. This is based on the technique of Differential Privacy combined with pseudo-noise sequences.

In the system database, it is possible to encrypt each individual’s personal data. However, only fully attributable elements of their data (e.g. name, email, full address, etc.) need to be heavily encrypted using a private hash key that may be derived from their password. The other elements of their data that will be used for data aggregation may be encrypted using a different public or group hash key which can be used to access and decode this data. This part of the data will also be anonymised using differential privacy prior to encryption, whereby noise data is added to elements of the data. The difference between this and conventional differential privacy is that the noise applied to the data is deterministic, e.g. using a pseudo-noise (PN) sequence based on the user private hash key. The third parties receiving the noisy data can decrypt the data but cannot remove the noise, so data privacy is maintained and the data is non-attributable. However, the data owner can recover their data from the same database, noise free, since they have the hash key that generated the deterministic noise sequence and it can be subtracted to reveal the original noiseless data. It is possible to fully encrypt the database while still enabling it to be searchable. It is possible for the system to apply this; however, the processing overhead could be a burden. With the proposed method only the very sensitive data that would make attribution possible is heavily encrypted. Much lighter encryption can be applied to the less sensitive data, or indeed no encryption is an option, since the other data is noisy and so is not a true record. It can be seen noise free by the data owner but this can be done on the client side. Thus, the server need never be exposed to the encryption or PN sequence keys. In particular, encryption or PN sequence keys can be stored by a user in their identity wallet.
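The idea of deterministic, key-derived noise that only the data owner can subtract can be illustrated with the sketch below. It is not the described implementation: the use of HMAC-SHA256 as the pseudo-noise generator, the integer-valued data, the uniform noise amplitude and all function names are assumptions chosen so that the noise can be removed exactly. It also makes no claim to satisfy a formal differential privacy guarantee.

```python
import hashlib
import hmac


def pn_noise(key: bytes, index: int, amplitude: int) -> int:
    """Deterministic noise in [-amplitude, +amplitude] for position `index`.

    HMAC-SHA256 stands in for the pseudo-noise sequence generator (assumption).
    The small modulo bias is ignored in this sketch.
    """
    digest = hmac.new(key, index.to_bytes(8, "big"), hashlib.sha256).digest()
    return int.from_bytes(digest[:8], "big") % (2 * amplitude + 1) - amplitude


def add_noise(values, key: bytes, amplitude: int):
    """Add key-derived noise to a list of integer values."""
    return [v + pn_noise(key, i, amplitude) for i, v in enumerate(values)]


def remove_noise(noisy, key: bytes, amplitude: int):
    """Subtract the same key-derived noise to recover the original values."""
    return [v - pn_noise(key, i, amplitude) for i, v in enumerate(noisy)]
```

A holder of the key regenerates the same sequence and recovers the original values exactly; a party without the key sees only the noisy values. In this sketch the key could be derived on the client side, for example from a password hash, so that it never needs to be sent to the server.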

It should not be possible for a third party to determine if they have removed the noise since they have no way to determine that the data is noiseless or if the PN sequence used in an attempt to remove the noise is correct or not.

One useful feature of this type of arrangement is that the owner of data may decide to remove their account from the system (a copy of all of their data is made available should they wish to copy this). Once they have removed their account there is no need to delete any of their data other than perhaps their attributable account details, such as their name, email and full address, to be compliant with certain data regulations, for example, the General Data Protection Regulation (GDPR). Their anonymised data in the database does not need to be deleted as it is not personal data due to the differential privacy process. A soft account delete may also be implemented such that if a user returned to the system and had their last password, the old hash code used for the noise code can be regenerated and thus the noiseless data can be regenerated. This allows the account to be fully restored.

The key and hash can be stored in the user’s wallet. Without the wallet, a third party cannot access the personal data. In this case, deletion of an account or the ‘right to be forgotten’ corresponds to simply deleting a key and hash as this renders the personal data unrecoverable.

A good aspect of using noise signals is that the data becomes randomised and non-attributable, thus it has limited value to adversaries if it is hacked, as they tend to be looking for highly personal data. Therefore, light encryption can be applied to this data since even if it becomes hacked and decrypted it will have no real value for criminal activities. This also means it is easy for legitimate users to decrypt, and the noise data is very simple to remove if the code is known. ZeroDB type techniques allow lightly encrypted data to be queried by authorised users.

One of the real challenges of applying privacy to the collected data is that location data from the Stays and Journeys may be attributable to a certain user even when noise is added. For example, an issue may arise if noise is added to each location but the user is visiting the same locations repeatedly. For example, a sample group of 10 users may be at home for long periods during the evening. After an extended period of time of collecting location data for these 10 users, 10 distinct location areas may start to emerge during the evenings. Something similar may emerge during 9am-5pm on weekdays. Therefore, work locations may be determined. If this is used in addition to demographic sample data, then eventually (even with noise in the data) the probability of identifying an individual user is greatly increased. To avoid this, location patterns must be obfuscated in a way that does not affect the data in a statistically significant way.

This may be implemented by adding an additional layer of false location data or ‘lies’ that obfuscates the locations without destroying underlying collected statistics. The first process identifies the common places a user spends time (and hence could be identified by). As an example, the top 5 Stays for any user can be extracted and the process can ensure that consistent lies are made about these places. For example, random swaps of locations of different users may be implemented.
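One possible reading of the consistent swap is sketched below. It assumes each user's Stays are available as a list of place identifiers, extracts the most frequent places per user, and builds a fixed mapping that reports one user's top places as those of another user. The function names, the seed and the data layout are hypothetical. Because the mapping is derived once from a fixed seed, the same false location is reported every time.

```python
import random
from collections import Counter

def top_stays(stays, k=5):
    """Return the k most frequently visited place ids for one user."""
    return [place for place, _count in Counter(stays).most_common(k)]

def build_swap_table(user_stays, seed, k=5):
    """Map (user, true place) -> reported place by swapping top places between users."""
    rng = random.Random(seed)
    users = sorted(user_stays)
    donors = users[:]
    rng.shuffle(donors)  # a random permutation of users; a user may keep their own places
    table = {}
    for user, donor in zip(users, donors):
        own = top_stays(user_stays[user], k)
        other = top_stays(user_stays[donor], k)
        for true_place, reported_place in zip(own, other):
            table[(user, true_place)] = reported_place
    return table

def report(user, place, table):
    """Places outside a user's top-k list are reported unchanged."""
    return table.get((user, place), place)

user_stays = {
    "u1": ["A", "A", "B"],
    "u2": ["C", "C", "D"],
    "u3": ["E", "E", "F"],
}
table = build_swap_table(user_stays, seed=2018, k=2)
```

The set of reported places is the same as the set of true places, so place-level aggregates are not emptied out; whether other statistics are preserved depends on details the sketch does not model.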

Following processing and sorting of data, the data is presented to a user in the form of an ordered representation. This may be implemented by an application, for example, the personal data web browser, running on a processor of the device 16. At least six different display views are available: Diary, Album, Map, Search, News and Statistical. The application may be a personal data web browser that displays at least the six different views. The personal data web browser may also provide access to a digital marketplace.

Creating an ordered representation involves gathering user activity data associated with activity of a user about one or more locations over a time period, retrieving contextual information associated with the one or more locations, and generating an ordered representation of user movement and/or activities over at least part of the time period based on the gathered user activity data and the retrieved contextual information.

The ordered representation may be made up of different elements. These elements include: a Stay element that is related to a Stay; a Journey element that is related to a Journey; and Other elements that are related to neither a Stay nor a Journey. The Other elements may be related to user-defined data or to externally provided data, for example, an advertising element. The ordering of elements in the representation is typically based on a timing of the elements or underlying location data. However, it may be based on other characteristics such as a user preference.
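A minimal sketch of such an ordered representation is given below, assuming each element carries a kind, a start time and a label, and that ordering is by start time. The `Element` class and its field names are hypothetical and are used only to illustrate the structure.

```python
from dataclasses import dataclass
from datetime import datetime

@dataclass
class Element:
    kind: str            # "stay", "journey" or "other"
    start: datetime      # timing used for the default ordering
    label: str
    emphasised: bool = False

def ordered_representation(elements):
    """Order elements by their start time (the typical ordering described above)."""
    return sorted(elements, key=lambda e: e.start)

elements = [
    Element("journey", datetime(2018, 12, 6, 8, 30), "Home to office (Rail)"),
    Element("stay", datetime(2018, 12, 6, 7, 0), "Home"),
    Element("other", datetime(2018, 12, 6, 12, 0), "User note"),
    Element("stay", datetime(2018, 12, 6, 9, 15), "Office"),
]
timeline = ordered_representation(elements)
```

A different key function could be passed to `sorted` to order by another characteristic, such as a user preference.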

Certain elements may be emphasised relative to others based on a number of factors, such as user preference or if the particular activity differs from previously determined behaviour of the user.

The first view is a diary view. Figure 5 shows an example of the diary view. The diary view is configured to automatically produce human readable output. The Journeys and Stays information is retrieved by the presentation application and presented in human readable format. Figure 5 shows an example of a diary entry. The format of the diary is modifiable. The diary shown is a one-page per day format. Additional context information is added to the diary as an aide memoire.

Adding a lot of information to the diary may lead to a cluttered and hard to read display. Additional information can therefore be contained in a scrapbook, and the diary itself has small icon elements indicating that additional content is available in the scrapbook.

In addition the same data from the diary and album can be viewed in different ways via a map or in statistics and graphs. Thus the application has at least 4 different views - Diary, Album, Map and Stats.

The diary view presents a complete day of a diary as a series of Stays and Journeys. All of the content is completely self-writing although a user can add or change content. For example a user can change the name of a place, e.g. the diary may say a user is at Queen Street, Edinburgh, which is actually the user’s home, so the user can edit this place to ‘Home’ and add a picture of a house. Each time a user is back in this same location the diary will report the location as Home and will present the picture of the house. Notes can also be added to Stays.

If a Journey has the wrong mode - e.g. it indicates Road but the actual mode of transport was Rail, a user can edit this. The system will then note that a user corrected the transport mode and this is used as part of a self-learning algorithm.

Figure 6 shows an example of the scrapbook view. This view is a chronological record of items a user wishes to add or have automatically collected. The contents of the scrapbook can be any items a user creates or that are pulled from other data sources, for example other applications on the device. Scrapbook items can include: records in text, picture, video or audio form, URL bookmarks, social media posts, calendar entries, news feeds, health timelines, fitness data from external fitness tracking devices, calls/texts made or received, new contacts made.

Additional data can be added to any of the items. This may be automatic. For example, URL bookmarks can be enriched using location, time and/or related images.

In contrast to the diary view, the album, also referred to as a scrapbook view, is a compilation like a conventional scrapbook. This may be chronological. In order to avoid excess clutter in a user’s diary view, but still be able to access rich personal content the scrapbook view can be linked to in the diary view.

If against any Stay or Journey there is additional content a small paperclip icon on the Diary timeline indicates that there is content in the scrapbook for that time and place or journey. Clicking on the paperclip allows a user to access the scrapbook item(s) associated with that Stay or Journey or a user can jump to the Scrapbook view which will be indexed to that day in the current Diary view.

If a user chooses to add an item to the scrapbook from the Diary view this will pick the location and time from the Journey or Stay from which the user selected the add to scrapbook function. The time will default to the start time of that Stay or Journey, but can be edited. Scrapbook items are all given a date, time and location. The default time and location will be taken from the item, from a geo-tagged photo or post. If the item does not have a time the default time will be the current time. If it does not have a location the location will be set to the last Stay location closest to that time. It is possible to edit both the time/date and location. If a user clicks on a Scrapbook item then the user can receive an expanded view with all of the edit and navigation features. A user can navigate to the Diary for that time and place or cancel the expanded (pop-up) view using the cross icon.
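The default rules for a Scrapbook item's time and location can be sketched as follows. The sketch assumes Stays are available as (start time, location) pairs and interprets "the last Stay location closest to that time" as the Stay whose start time is nearest to the item's time; both the data layout and that interpretation are assumptions.

```python
from datetime import datetime

def resolve_time(item_time, now):
    """Use the item's own time if it has one, otherwise the current time."""
    return item_time if item_time is not None else now

def resolve_location(item_location, when, stays):
    """Use the item's own location if present, otherwise the Stay nearest in time.

    stays: list of (start_time, location) tuples (a hypothetical layout).
    """
    if item_location is not None:
        return item_location
    if not stays:
        return None
    nearest = min(stays, key=lambda s: abs((s[0] - when).total_seconds()))
    return nearest[1]

stays = [
    (datetime(2018, 12, 6, 7, 0), "Home"),
    (datetime(2018, 12, 6, 9, 15), "Office"),
]
now = datetime(2018, 12, 6, 10, 0)
when = resolve_time(None, now)                 # item has no time of its own
where = resolve_location(None, when, stays)    # item has no location of its own
```

Both resolved values remain editable by the user, as described above.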

The scrapbook is a chronological view with no pagination so everything is on a single scrollable screen of items with the most recent items at the top and oldest at the bottom. Each item placed in the Scrapbook has an associated location and time and the user interface allows a user to easily switch or flip between Diary and Scrapbook views. All of this content can be searched by keyword and can be filtered by category (e.g. only show the URL bookmarks, or just Facebook & Twitter etc.).

A search by location can also be undertaken. This search can be a keyword search or by using a point marked on the map view or sorted by distance from a named place or by time/date.

The current implementation uses external data feeds provided by other social media platforms. If any entries of a feed are changed or deleted then this will be reflected in the scrapbook. A copy may be generated and placed in the diary with a link to the original entry. A permanent copy is thus provided of what is posted on social media. The view may aggregate data from different data sources, for example, a user’s posts from different social media sites.

Although the diary/scrapbook is not automatically shared it is possible to share a scrapbook item using an external social media platform by clicking the ‘share’ option and the post created will show that it was generated by the present application.

The map view provides a zoomable map view with Stay markers and Journey routes marked. The Journey routes are colour coded according to the transport mode. Clicking on a Journey line or Stay marker provides a pop-up with the corresponding Journey or Stay from the diary. From the pop-up there is an option to navigate to the Diary view for that Stay/Journey or to the Scrapbook that will be indexed to that date/time. There is also a cross to cancel the pop-up.

The map view will default to the current day, but if navigated to from a Diary entry or Scrapbook item it will default to that particular day. Zooming out on the map will cause Stays to cluster and combine (as the radius and dwell times are increased) and hence fewer Journeys will be shown.

The map view should allow us to view a day, week, month or year - panning and zooming to automatically include all of the Stays within that time period suitably summarised. It will also be possible to set start and end dates.

The statistical or stats view provides a view of personal data in graphs, statistics and trends. The Stats view will allow users to analyse their data and extract summaries of activities enabling data to be compared and graphed. Aggregated data can be used to compare different users. For example, it can be established that one user spends 17% more time at work over one year compared to the average person in a user’s age group.

The statistical format may include presenting recent user activity data, for example, over the previous day, week or month in contrast to historical trends and/or statistics and/or behaviours established using older user activity. Further statistical measures based on a comparison of recent activity with historical activity may be produced.

All views include options for including advertising elements. These may be implemented as either interstitial and/or banner. The interstitial ads may appear when a user looks back more than a set number of days in their diary as the content is loaded. The banner ads will account for a certain percentage of the real estate in the scrapbook - e.g. 10%. The advert elements may be used as product bookmarks.

A user may be able to select advertising elements that appear in their scrapbook. On viewing a banner advert in the scrapbook view, a user may swipe left or right depending on whether they like it or not. The single banner advert left on the screen will be the one that will remain; however, if they swipe up or down this will scroll through the ones they said they liked and allow them to select their favourite for the scrapbook. After a period of time (say one month) the banner ads in the scrapbook will be fixed and can no longer be changed. Using this interface users will view many more banner ads than usual if they engage to change the ads in their scrapbook. The users can specify which ones they like and which they don’t, and optionally which is their favourite. This will allow us to set the user’s profile automatically, although this may be editable by a user. With permission of the user, additional profile data may be pulled from external social media sites. This concept is ‘product bookmarking’ so that users are essentially marking the ads/products they would like to stay in their scrapbook. Given that the historical ads do not change it is possible that the URL links may become broken and so there will be an automatic link repair system implemented that redirects broken links to an alternative page or ad. Note that historical ads, say from 10 years ago, in a user’s scrapbook will remain there.

Other applications of user data include gamification. There are two core threads to the ideas of gamification:

1) One is to centre a game/puzzle on a user’s data. How well does the user know themselves, when did they last do something, might include lifestyle data with targets (similar to activity trackers).

2) The other is to add competitiveness through sharing aspects of a user’s life.

A specific idea is the reassembling of a user’s day. Say events (Journeys & Stays) have the timestamp removed, but still have the duration, can a user reassemble their day/week etc. into the correct timeline. Once correct the times and days would appear. This is a bit like a jigsaw of a user’s life for them to complete.
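The day-reassembly puzzle can be sketched as below, assuming each event is a (start minute, duration in minutes, label) tuple. The puzzle pieces keep the duration and label but drop the timestamp, and an attempt is correct when the pieces are placed in chronological order. The function names and data layout are hypothetical.

```python
import random

def make_puzzle(events, seed=None):
    """Strip start times and shuffle, keeping only (duration, label) pieces."""
    pieces = [(duration, label) for _start, duration, label in events]
    random.Random(seed).shuffle(pieces)
    return pieces

def solution(events):
    """The pieces in their true chronological order."""
    return [(duration, label) for _start, duration, label in sorted(events)]

def check_attempt(events, attempt):
    """True when the user has reassembled the day into the correct timeline."""
    return list(attempt) == solution(events)

day = [
    (420, 90, "Stay: Home"),
    (510, 45, "Journey: Rail"),
    (555, 480, "Stay: Office"),
]
pieces = make_puzzle(day, seed=7)
```

Once `check_attempt` returns true, the application could reveal the original times and days alongside each piece.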

The same idea can be used for sharing a user’s diary information. If a user has an interesting day they may want to send the puzzle to a friend to see if they can complete this - or do a swap. This is a way of sharing what an interesting life/day they had with a social element where a user may then discuss it or indeed just use it to show-off.

With aggregated data, users in similar demographics can be compared and ordered in a league table according to one or more categories, for example, air miles. There is a lot of potential to base new applications on the present platform. Examples of such applications include: diary, scrapbook, journey log, personal centric news, health & fitness monitor, business productivity assistant, mileage tracker, life-work balance trends, personalised news feeds, carbon footprint tracker (based on Journeys - transport modes and distances, speeds, etc.), product assistant, etc.

A summary of a user activity may be sent to the user or another identified person. The summary may be sent at any suitable time interval, for example, first thing in the morning. The user can then read about what they did the day before together with local and externally provided content, for example, online news sources (local, national and international) can be integrated into the summary. Newspapers are regionalised so that the local news most relevant to a user is given. The summary of a user activity may be encrypted before sending.

By combining the concept of regionalised news with the diary related to a user, the summary is like a newspaper totally centred around the user. The user may read a specific paper or online news source every morning. The summary can act as a replacement to this and inform the user about what they did the day before together with local news from around where a user was yesterday, national news from the country the user is in and international news. This concept is illustrated in Figure 7. News sources can include online media sources, including social media content.

The content of such a summary can be monitored to examine what a user actually reads and how long a user spends reading it. More specific user behaviour may also be derived, for example, if a user expanded from the headline text, then scrolled to the bottom or only looked at one paragraph. The system can learn what a user prefers and then modify the content to a user’s preferences. With such a personalised self-learning newspaper the advertising content can become very well targeted.

The content does not need to be limited to written online newspaper format. For example, it could be turned into a personalised newsreel centred around a user that a user could watch instead of the evening news etc.

News can be presented to a user on a number of different levels. For example, a first level is news about a user based on stored personal data (e.g. what a user did yesterday). A second level relates to regional or national news that can be centred on a user’s current or very recent locations. A third level relates to international news that has the usual geographic or language bias.

To allow third parties to develop applications based on the system, a secure permissions based API is provided. Authorised applications gain permission from the users for their data to be used for certain purposes. Secure APIs will access encrypted data through authorisation and authentication via the decryption code.

The system users may be provided with a dashboard, via the personal data web browser, that shows which applications they have authorised to access their personal data, a description of what the application requires their data for, the date the permission was given, the date the data was last accessed by that app and a future date when the app’s permission will expire - this can be set to never expire. Other information can also be given about each app (e.g. memory, battery consumption, usage etc.) so the user understands the impact. Any user or external transactions (including token and data transactions) made through the browser will be recorded in the ledger which can be viewed by the user.
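The dashboard fields listed above can be modelled with a small record type. The sketch below is illustrative only: the `Permission` class, its field names and the example application are hypothetical, and an expiry of `None` is used to represent a permission that never expires.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

@dataclass
class Permission:
    app: str
    purpose: str                           # what the application requires the data for
    granted: date                          # date the permission was given
    last_accessed: Optional[date] = None   # date the data was last accessed by the app
    expires: Optional[date] = None         # None means the permission never expires

    def is_active(self, today: date) -> bool:
        return self.expires is None or today <= self.expires

def dashboard_rows(permissions, today):
    """Produce one display row per authorised application."""
    return [
        {
            "app": p.app,
            "purpose": p.purpose,
            "granted": p.granted.isoformat(),
            "expires": p.expires.isoformat() if p.expires else "never",
            "active": p.is_active(today),
        }
        for p in permissions
    ]

permissions = [
    Permission("MileageTracker", "business mileage log", date(2018, 1, 1),
               expires=date(2018, 12, 31)),
    Permission("NewsDigest", "regional news selection", date(2018, 3, 1)),
]
rows = dashboard_rows(permissions, today=date(2019, 1, 15))
```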

The personal data browser may also offer a user access to personalized search functionality that is based on data stored in system 10 and third party data stored in third party databases. In some embodiments, the personal search utilizes the database structure as organised by time and context.

For example, a search may be completed using both keywords and a search term representing a time and/or location and/or context of a stay and/or a journey, for example, location or journey type. These searches are allowed due to the structure of stored data as groups of time, context and location. Other searches may be possible using context or searching by data source, media type, activity, financial data, health data.

In some embodiments, the personalized search may comprise a primary and secondary search. A primary search is carried out on personal data using user input such as a keyword and/or other search item, and then a secondary search is carried out using an internet based search and/or a search on other publicly available information. The secondary search may use keywords from the primary search and/or keywords that are derived from the results of the primary search. By using a primary and secondary search, a single search query from a user can produce results from personal data and results from the Web that are highly personalised.

In some embodiments, the system can also monitor the results that are viewed (i.e. clicked on) to improve future ranking algorithms and keyword selection for the secondary search.

In some embodiments, the primary and secondary searches can be reversed, such that the primary search is a web-based search and the secondary search is a personalized search on personal data.

In some embodiments, the system may store different aspects of personal data as part of one or more personal profiles. The system may automatically assign certain personal data to different profiles using contextual information, for example, if activities relate to work, then data relating to those activities are stored as part of a business profile and, likewise, activities relating to leisure can be stored as part of a home profile.

In some embodiments, one or more user profiles or personas may be built automatically using behavioural data. A user is provided, via the personal data web browser, with options to view and edit their profile and also to build profiles that can be used for different purposes. The user may switch to a different profile at any time and the delivery of services will look different depending on which profile is selected.

Profiles may be used to further optimise searches. For example, a user can choose which profile and hence which part of their personal data is used for the primary or secondary search. For example, if a user wants search results optimized for business purposes or directed to be specific to their business, they can choose their business profile. Likewise, by selecting a home profile, a user will obtain results directed to their leisure data. If a user does not want any bias in their search they can choose an anonymous profile. Profiles and personalized searches can be used to optimize news delivery and news searches. For example, a secondary search can include searching for regional, national or international news using keywords from a primary search of the personal database. Keywords, locations, and personal data can all be used in the news search in order to present the most personally relevant/interesting news stories first.

Personal profiles can also be used to improve advertising to match customers to specific goods and services they are looking for. Shopping searches may be optimised by using this more complete profile or personal data to narrow the searches. If the goods or services being searched for do not match a user’s profile then the profile can be changed for the purposes of the search (or online shopping trip). As an example, a man may be searching for presents for his wife or children. In this case, the profile can be selected to match that requirement. The search for goods or services becomes more accurate and so sellers are more likely to use an affiliate or brokerage model that brings them qualified customers rather than them having to rely on advertising that requires many views to convert prospects into customers.

As described earlier, the personal web browser may provide access to data marketplaces. A problem is that there is limited willingness to sell highly personal data on known marketplaces. The present marketplace allows data to be traded, i.e. bought and sold. The marketplace allows access via highly aggregated data sets which are both depersonalised and have noise added to them to make the data non-attributable.

Aggregated data sets can be sold, however, the preferred way might be to sell answers to specific questions using the concept of ‘Safe Answers’. Instead of giving up datasets that can be processed to reveal the required information (and potentially misused, copied and resold), it is answers to specific questions that are sold. For example, ‘how many AB group males in Edinburgh are generally at home by 6pm?’ might be a question. The present system can use the personal data database to work out the answer which would be sold without revealing any underlying personal data.
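A ‘Safe Answer’ of this kind can be sketched as an aggregate count that is released instead of the underlying records. The sketch below assumes, for illustration only, that Laplace noise is added to the count and that answers for very small groups are withheld; the source does not specify either choice, and the record fields, `epsilon` and `min_group` are hypothetical parameters.

```python
import random

def safe_count(records, predicate, epsilon=1.0, min_group=10, rng=None):
    """Return a noisy count of matching records, or None if the group is too small.

    Only the final number leaves this function; no record is revealed.
    """
    rng = rng or random.Random()
    true_count = sum(1 for r in records if predicate(r))
    if true_count < min_group:
        return None  # withhold answers that could single out individuals
    # The difference of two exponential samples with rate epsilon follows a
    # Laplace distribution with scale 1/epsilon.
    noise = rng.expovariate(epsilon) - rng.expovariate(epsilon)
    return max(0, round(true_count + noise))

# Hypothetical records: 30 matching users and 20 non-matching users.
records = (
    [{"group": "AB", "sex": "M", "city": "Edinburgh", "home_by": 18}] * 30
    + [{"group": "C1", "sex": "F", "city": "Glasgow", "home_by": 20}] * 20
)

def question(r):
    """'How many AB group males in Edinburgh are generally at home by 6pm?'"""
    return (r["group"] == "AB" and r["sex"] == "M"
            and r["city"] == "Edinburgh" and r["home_by"] <= 18)

answer = safe_count(records, question, epsilon=1.0, rng=random.Random(0))
```

A smaller `epsilon` adds more noise to the answer; a very large `epsilon` returns a value that is effectively the exact count.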

A method of performing a personalised search comprising a primary and secondary search is described above. Figure 8 shows the method of performing a search in accordance with embodiments, in further detail. The personal search platform has two main parts called the Primary Search and the Secondary Search. The primary search may also be referred to as a first search and the secondary search may also be referred to as a second search.

In overview, Figure 8 shows a method of performing a primary search on a store of collected personal data, creating a search query for a secondary search based on at least the results of the primary search and performing the secondary search on at least one store of publicly available data. In the present embodiment, the store of personal data is a database of personal data, substantially as described above. In the present embodiment, the store of personal data is a database associated with a personal context management system as described above. In the present embodiment, the at least one store of publicly available data is a plurality of internet search engines.

The search shown in Figure 8 uses a two stage personal search using a store of user-side personal data. The user enters their chosen search terms then a search of the user-side personal data is run to reveal any personal information relevant to the search. This is referred to as the ‘primary search’. The results are processed and combined with other user input (for example the keywords of the primary search) to form multiple search strings used to conduct a federated search (i.e. using multiple databases and search resources) across the internet. This is referred to as the secondary search. The final processing ranks results according to the relevance to the search terms, the ranking being personalised for each user. The filtering and ranking of each search uses supervised machine learned ranking techniques that are automatically adapted to each user.

By providing a two-stage search in accordance with embodiments, a user may search for information and the personalised results may be based on what the user has requested (and ranked according to what has been learned about their relevance to that user). The architecture is designed so that no personally identifiable or attributable information is ever revealed to a 3rd party during the search.

The above search uses a user’s private data stored on a personal data store. However, the user retains ownership of these data. Control and full access to the consolidated private data is returned to the user through a user interface. This presents personal data to the owner with data being secured through homomorphic encryption that still allows it to be searched without decryption. In some embodiments, data access and transactions are recorded onto a private blockchain based ledger that is visible to the user. This may be stored as part of the personal data store.

Figure 8 shows the following steps and structural elements for performing the personalised search in accordance with embodiments:

• A (step 202) - Keywords provided by user as input.

• B (step 204) - Primary search conducted on user-side data based on the keywords.

• C (step 206) - Personal Context Management System (PCMS) database.

• D (step 208) - Search results filtering and ranking.

• E (step 210) - Output of top ranked results.

• F (step 212) - Common contexts are also output.

• G (step 214) - Detection of user selection (via followed links) from results.

• H (step 216) - Temporary training connection (users select from primary results).

• I (step 218) - Predefined persona terms.

• J (step 220) - Processing forms search strings from four inputs.

• K (step 222) - Federated search uses search strings sent to multiple on-line search services.

• L (step 224) - Search results filtering and ranking.

• M (step 226) - Output of top ranked results.

• N (step 228) - Supervised machine learning algorithms control primary and secondary ranking.

At step 202, a user provides user input for the primary search. In the present embodiment, the user input is in the form of keywords. In other embodiments, the user input can be any suitable search item, for example, an image or audio recording.

In some embodiments, the search query is in the form of search keywords representative of the search to be performed and one or more further search terms, for example, keywords, representative of contextual data (for example, a time, a location, a context and/or a journey).

In some embodiments, the user provides further user input, for example, in the form of a selection of one or more user parameters. In further detail, the user selects a user profile, as described above, for example, a personal profile or a business profile. The first and/or second search may be optimised based on the selection of a user profile. Furthermore, a user may select a user parameter representative of a user’s intent for the search session. User intent includes one or more of: research, leisure, browsing, shopping. In such embodiments, search results are tailored depending on the selection of user profile and/or user intent.

As a non-limiting example, a user who selects user intent to be shopping will receive results that are more relevant to shopping. In such an example, the search term used may be “mobile phone” and the results will be more directed towards retailers offering mobile phones for sale. If the user performs the same search but with different user intent, for example, with the user intent selected to be research, the search results will be more directed to websites containing factual information about mobile phones.

In the above described embodiment, user profile and/or user intent is selected by a user. However, it will be understood that user profile and/or user intent may also be determined based on user behaviour.

At step 204, the primary search is conducted on a store of the user’s personal data (shown at 206). The personal data is stored in an indexed and searchable format. For security the data in the database uses homomorphic encryption and is searchable in its encrypted form. The result of the primary search includes personal data and/or metadata and/or associated contextual data. The primary search therefore includes a search of encrypted data. As described above, the personal data is only accessible to a user (the owner of the data).

In the present embodiment, homomorphic encryption is used, however, it will be understood that other encryption schemes may be used.

In the present embodiment, the outputs of the primary search are provided from the data store in an unencrypted format. To provide the outputs in an unencrypted format, a decryption process is performed.

In the present embodiment, the processing of results is not encrypted as these are then deleted following the search. Therefore, the results from the primary search are decrypted using a user’s primary key and all processing following that will not be encrypted. However, any further additions to the personal database as a result of the search will be encrypted in accordance with the encryption scheme being used. In the present embodiment, the additions will be encrypted using homomorphic encryption.

It will be understood that in other embodiments, ranking and/or filtering and/or other processing steps may be performed on encrypted results.

At step 208, a filtering process is performed to remove any spurious results. At step 208, the results are also ordered in accordance with a ranking model. The ranking model is described in further detail below. The ranking may be considered to be an algorithm programmed with model parameters.
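The filtering and ranking step can be sketched with a simple linear ranking model, in which the model parameters are per-feature weights. The source does not specify the form of the ranking model, so the linear scoring, the feature names and the threshold below are assumptions made only for illustration.

```python
def score(features, weights):
    """Weighted sum of a result's features; unknown features contribute nothing."""
    return sum(weights.get(name, 0.0) * value for name, value in features.items())

def filter_and_rank(results, weights, min_score=0.0, top_n=10):
    """Drop spurious (low-scoring) results, then order the rest by descending score."""
    scored = [(score(r["features"], weights), r) for r in results]
    kept = [(s, r) for s, r in scored if s > min_score]
    kept.sort(key=lambda pair: pair[0], reverse=True)
    return [r for _s, r in kept[:top_n]]

# Hypothetical model parameters and primary search results.
weights = {"keyword_match": 1.0, "recency": 0.5}
results = [
    {"id": "a", "features": {"keyword_match": 0.9, "recency": 0.2}},
    {"id": "b", "features": {"keyword_match": 0.0, "recency": 0.0}},
    {"id": "c", "features": {"keyword_match": 0.6, "recency": 1.0}},
]
ranked = filter_and_rank(results, weights)
```

In a learned setting, the weights would be updated from the user's selections rather than fixed by hand as they are here.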

At step 212, the primary results are processed to determine one or more further inputs for the search query for the secondary search. In the present embodiment, the primary results are processed to determine any common contexts (e.g. geographic locations, dates, activity, etc.) found within the top-ranking results, and these common contexts may also form an output. In the present embodiment, the primary results are processed to identify shared and/or overlapping information associated with at least two results from the primary search, and at least part of the identified shared and/or overlapping information is used as part of the search query for the second search. The shared and/or overlapping information can be at least one of shared or overlapping times, dates, locations, contextual data, activity type.
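Identifying contexts shared by at least two primary results can be sketched as a simple count over (field, value) pairs. The sketch assumes each result is a flat dictionary of context fields with hashable values; the field names and sample results are hypothetical.

```python
from collections import Counter

def common_contexts(results, min_shared=2):
    """Return (field, value) pairs that appear in at least min_shared results."""
    counts = Counter()
    for result in results:
        # set() ensures each pair is counted once per result
        for pair in set(result.items()):
            counts[pair] += 1
    return sorted(pair for pair, n in counts.items() if n >= min_shared)

top_results = [
    {"location": "Edinburgh", "date": "2018-06-01", "activity": "meeting"},
    {"location": "Edinburgh", "date": "2018-06-02", "activity": "meeting"},
    {"location": "Glasgow", "date": "2018-06-02", "activity": "dinner"},
]
shared = common_contexts(top_results)
```

The values of the shared pairs are candidates for inclusion in the secondary search query, subject to the non-attributability constraints described elsewhere in this description.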

At step 218, persona data is determined from the personal data store. The persona data is user personalisation information that has been precomputed and can be viewed and edited by the user. Persona data can be processed and form part of the secondary search query.

Persona data forms part of a user persona. The user persona can be built automatically based on account data and personal data stored in the personal database. The persona can be updated as further information is added to the personal database. Different personas can be used by a single user based on intent or role. For example, a user can select a persona by intent (e.g. the user selects a shopping intent) or by role (e.g. separating a business/work persona from their leisure/home persona).

Users also have the ability to view their personas, edit their personas, or even create a custom persona. Custom personas allow a user to put themselves in a different search perspective, e.g. for considering how customers see things, or if shopping for a present for a partner.

At step 220, one or more secondary search queries are created. In the present embodiment, the Secondary Search terms are constructed from the User Keywords or other user input, primary search results, contexts, and persona data. It will be understood that in other embodiments, one or more of the above inputs may be used.

In the present embodiment, the secondary search query comprises keywords that are used for the primary search and further search terms derived from processing the results of the first search. The secondary search query also includes terms based on contexts and persona data.

The first search returns personal data and/or associated metadata and/or associated contextual data stored in the store of collected personal data and the search query for the second search is created using at least part of the returned personal data and/or metadata and/or contextual data. By providing a second search query as described above, the second search provides a personalised search of the store of publicly available data.

The second search query is constructed such that it is non-attributable to the user. For example, the data used in the search query is non-attributable. In some embodiments, the search query comprises substantially no personally identifiable information. In some embodiments, the created search query is created so that it comprises substantially no sensitive data and/or only publicly available information.

By providing a search in accordance with embodiments, privacy is protected because the search term cannot be attributed to a specific user by a third party. On the service side, all of the search terms are associated with a service provider and therefore cannot be attributed to a specific user. In some embodiments, no user identity information is exposed as part of the federated search. As a non-limiting example, the IP address of the user is not exposed.

In some embodiments, the created search query may comprise only general and/or only public information.
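As a non-limiting sketch of one way of keeping the created search query non-attributable, candidate terms may be filtered before the query is assembled. The function name `strip_identifying_terms`, the user-supplied list of private terms and the two deliberately crude patterns below are assumptions of this sketch; the embodiment does not specify how identifying terms are detected.

```python
import re
from typing import Iterable, List

# Crude illustrative patterns; a real filter would need to be far more thorough.
EMAIL = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")
PHONE = re.compile(r"^\+?[\d\s().-]{7,}$")  # also matches some numeric dates


def strip_identifying_terms(
    terms: Iterable[str], private_terms: Iterable[str]
) -> List[str]:
    """Drop terms that look personally identifying, keeping the original order."""
    private = {t.lower() for t in private_terms}
    kept = []
    for term in terms:
        cleaned = term.strip()
        if not cleaned:
            continue
        if cleaned.lower() in private:
            continue  # e.g. the user's own name or home address
        if EMAIL.match(cleaned) or PHONE.match(cleaned):
            continue
        kept.append(cleaned)
    return kept
```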

At step 220, these four sources of data are combined into an advanced search string. In some embodiments, a series of parallel Boolean search strings are likely to be used.
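The combination of the four sources of data into a Boolean search string may, purely by way of example, be sketched as follows. The grouping chosen here (keywords joined by AND, each further source forming an OR group) and the function names are assumptions of this sketch rather than a statement of the syntax used in the embodiment.

```python
from typing import Iterable, List


def quote(term: str) -> str:
    """Quote multi-word terms so they are treated as a phrase."""
    term = term.replace('"', "")
    return f'"{term}"' if " " in term else term


def or_group(terms: Iterable[str]) -> str:
    """Join de-duplicated terms with OR, adding brackets when needed."""
    unique = list(dict.fromkeys(t for t in terms if t))
    if not unique:
        return ""
    if len(unique) == 1:
        return quote(unique[0])
    return "(" + " OR ".join(quote(t) for t in unique) + ")"


def build_boolean_query(
    keywords: List[str],
    primary_terms: List[str],
    contexts: List[str],
    persona_terms: List[str],
) -> str:
    """Combine user keywords with terms from the three other sources."""
    parts = [quote(k) for k in keywords if k]
    for source in (primary_terms, contexts, persona_terms):
        group = or_group(source)
        if group:
            parts.append(group)
    return " AND ".join(parts)
```

For example, `build_boolean_query(["restaurant"], ["italian", "sea view"], ["evening"], [])` returns `restaurant AND (italian OR "sea view") AND evening`.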

The Primary Search uses keywords to generate personalisation terms. The personalisation terms and the keywords are then used to conduct a personalised Secondary Search on the internet using a federated search. The federated search includes a number of searches, each performed on a different search engine or publicly accessible database. The federated search may be combined such that each search is not attributable to the user.

At step 222, a plurality of searches is performed as part of a federated search. The search query formed at step 220 is transmitted to multiple on-line information sources such that a plurality of searches is performed. The results are then retrieved from the on-line information sources.

In further detail, based on the created search query, a plurality of search queries is generated. In the present embodiment, a search query for a particular source is adapted for that particular source. The search query is transmitted to a server of the information source using a publicly available API. A plurality of sets of secondary search results is then obtained in response to transmitting the search query to the plurality of information sources. Each set of secondary search results corresponds to a search performed on a respective store of publicly available data. The search query for each information store may be tailored or adapted using rules specific to that information source.
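The adaptation of the search query using rules specific to each information source may be illustrated by the following sketch. The `SourceRule` fields (a term limit, a joining operator and a flag for phrase quoting) and the source names are hypothetical; the rules used for any real information source or API are not specified here. The sketch only prepares the per-source query strings and does not transmit them.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class SourceRule:
    """Hypothetical per-source adaptation rule."""
    name: str
    max_terms: int
    joiner: str
    supports_quotes: bool


def adapt_query(terms: List[str], rule: SourceRule) -> str:
    """Adapt a list of search terms to the syntax accepted by one source."""
    adapted = []
    for term in terms[: rule.max_terms]:
        if " " in term and rule.supports_quotes:
            adapted.append(f'"{term}"')
        else:
            adapted.append(term)
    return rule.joiner.join(adapted)


def fan_out(terms: List[str], rules: List[SourceRule]) -> Dict[str, str]:
    """Produce one adapted query per information source."""
    return {rule.name: adapt_query(terms, rule) for rule in rules}
```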

In the present embodiment, the search query is a Boolean search string. For the federated search, parallel Boolean search strings are therefore used. It will be understood that other representations of the search query can be used. In the present embodiment, the search query is performed for each source substantially anonymously. For example, the search query may be performed as an incognito search. In some embodiments, the created search query includes keywords or other search terms that are interpreted by the information source to be a selection of one or more search options (for example, incognito or anonymous searching and/or performance of the search without taking into account properties, for example, location, of the server from which the query is transmitted).

At step 224, the sets of search results returned from each information source are merged. Merging includes combining sets of results. Merging also includes consolidating results, for example, identifying identical, substantially similar and/or duplicated results from different searches and removing these results from the final set of results. Merging also includes filtering spurious results.
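A minimal sketch of the merging step is given below, assuming that each result is a dictionary with `url` and `title` entries and that two results are duplicates when their URLs agree after normalisation. Both assumptions, together with the default test for a spurious result (an empty title), are illustrative only; detection of substantially similar results is not attempted in this sketch.

```python
from typing import Callable, Dict, Iterable, List, Tuple
from urllib.parse import urlsplit

Result = Dict[str, str]


def normalise_url(url: str) -> Tuple[str, str, str]:
    """Ignore scheme, a leading 'www.' and a trailing slash when comparing."""
    parts = urlsplit(url.strip())
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    return (host, parts.path.rstrip("/"), parts.query)


def merge_results(
    result_sets: Iterable[List[Result]],
    is_spurious: Callable[[Result], bool] = lambda r: not r.get("title"),
) -> List[Result]:
    """Combine result sets, filter spurious results and remove duplicates."""
    merged: List[Result] = []
    seen = set()
    for results in result_sets:
        for result in results:
            if is_spurious(result):
                continue
            key = normalise_url(result["url"])
            if key in seen:
                continue  # keep only the first occurrence
            seen.add(key)
            merged.append(result)
    return merged
```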

At step 224, the merged results are ordered in accordance with a ranking model (represented as step 228). Further details of the ranking model are described below.

At step 226, the final results are provided to a user. For example, the final results are then delivered and presented to the user.

At step 214, user selection data is obtained. The user selection data is representative of a selection of one or more results of the secondary search by a user. The user selection data may be indicative of the relevance of the search results to a user. The user selection data may include search result selection data representative of a user selecting a particular search result (i.e. clicking on a selected link) and/or search result visiting data representative of the time spent by a user browsing the search result (i.e. the time a user spends at the selected link). In the present embodiment, the user selection data is obtained via user input received via the user device.

At step 214, the user’s reaction to the presented results is monitored by observing which results are selected, by clicking or other user input, from the ranked results (i.e. which results are so called followed links). In some embodiments, further user selection data may also be collected and/or monitored, for example, the time a user spends visiting a selected link. In the present embodiment, the user selection data is monitored as users use the search. As described below, the monitored user selection data is used to modify and/or update model parameters for one or more ranking models. Previously collected user selection data may be used to train one or more ranking models.
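The user selection data may, for example, be captured in a structure such as the following sketch, which records which ranked result was followed and, where available, the time spent at the selected link. The class and field names are assumptions of this sketch, and timestamps are passed in explicitly (in seconds) so that the example does not depend on a clock.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SelectionEvent:
    """One followed link from a ranked list of results."""
    query_id: str
    result_url: str
    rank: int                      # 1-based position in the presented ranking
    clicked_at: float              # seconds
    returned_at: Optional[float] = None

    @property
    def dwell_seconds(self) -> Optional[float]:
        """Time spent at the selected link, if the return has been observed."""
        if self.returned_at is None:
            return None
        return max(0.0, self.returned_at - self.clicked_at)


class SelectionLog:
    """Collects selection events as users use the search."""

    def __init__(self) -> None:
        self.events: List[SelectionEvent] = []

    def record_click(self, query_id: str, result_url: str, rank: int, now: float) -> SelectionEvent:
        event = SelectionEvent(query_id, result_url, rank, now)
        self.events.append(event)
        return event

    def record_return(self, event: SelectionEvent, now: float) -> None:
        event.returned_at = now
```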

At steps 216 and 228, two ranking models are used. At step 216, the results from the primary search are ordered using a first ranking model. At step 228, the results from the secondary search (the merged results) are ordered using a second ranking model.

In some embodiments, the machine learning algorithms used to determine the model parameters are known machine learning algorithms. In the present embodiment, supervised machine learning is used to train the system automatically by monitoring the user’s responses to the ranked results.

Known machine learning techniques include those discussed in the paper “Ranking, Boosting, and Model Adaptation” by Qiang Wu, Chris J.C. Burges, Krysta M. Svore and Jianfeng Gao, Microsoft Research Technical Report MSR-TR-2008-109. For example, machine learning algorithms based on one or more of: neural networks, support vector machines, Bayesian networks and/or genetic algorithms may be used.

In some embodiments, each user has model parameters for a machine learning model that are trained independently just for them.

In the present embodiment, the first and second ranking models are models for ranking the results. The models have model parameters determined or learned using machine learning methods. In the present embodiment, the models are generated using machine learning algorithms. In the present embodiment, the model parameters are determined based at least on user selection data obtained at step 214. The model parameters may be determined on pre-determined training data that corresponds to user selection data.

As searches are performed, the method monitors user selection data representative of a selection of search results by one or more users and updates the model parameters based on the monitored data. In this way, the behaviour of the user (i.e. which links are selected by users) feeds back into the ranking model.

In the present embodiment, the ranking model of step 216 (for the primary search) and the ranking model of step 228 (for the secondary search) are trained independently. If required, the ranking algorithm on the Primary Search can be trained independently of the Secondary Search by allowing the User Selection to be temporarily applied to the Primary Results.

The Supervised Machine Learning algorithms {N} receive feedback based on how highly ranked the User Selection(s) are, and also potentially the time spent on the selected link(s). Future rankings are based on ‘tuning’ of the ranking parameters such that the overall scored performance for the entire dataset is maximised.
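The feedback described above may be illustrated by the following sketch, in which the scored performance is taken to be the mean reciprocal rank of the selected results and the ranking parameters are the weights of a linear scoring function. The simple pairwise, perceptron-style update is substituted here purely for illustration; the embodiment does not specify the learning algorithm, and the time spent on selected links is not used in this sketch.

```python
from typing import List, Sequence, Set, Tuple

Candidate = Tuple[str, Sequence[float]]  # (result id, feature vector)


def score(weights: Sequence[float], features: Sequence[float]) -> float:
    return sum(w * f for w, f in zip(weights, features))


def rank(weights: Sequence[float], candidates: List[Candidate]) -> List[Candidate]:
    """Order candidates by descending score."""
    return sorted(candidates, key=lambda c: score(weights, c[1]), reverse=True)


def mean_reciprocal_rank(ranked_ids: List[str], selected: Set[str]) -> float:
    """Higher when the user's selections appear nearer the top."""
    if not selected:
        return 0.0
    return sum(1.0 / (ranked_ids.index(s) + 1) for s in selected) / len(selected)


def update_weights(
    weights: Sequence[float],
    candidates: List[Candidate],
    selected: Set[str],
    lr: float = 0.25,
) -> List[float]:
    """Nudge weights whenever an unselected result out-ranks a selected one."""
    ranked = rank(weights, candidates)
    new = list(weights)
    for pos, (rid, feats) in enumerate(ranked):
        if rid not in selected:
            continue
        for other_id, other_feats in ranked[:pos]:
            if other_id in selected:
                continue
            for i in range(len(new)):
                new[i] += lr * (feats[i] - other_feats[i])
    return new
```

Applying `update_weights` repeatedly to the same feedback moves the selected result up the ranking; once no unselected result out-ranks a selected one, the weights stop changing.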

The Primary and Secondary machine learning ranking model parameters are adapted iteratively to improve scores for future rankings. When the parameters have converged then training can be limited to data subsets.

In the present embodiment, the ranking model used on the primary results is a machine learning ranking algorithm. The model uses user selection data as input data. The ranking model used on the secondary results is a machine learning ranking algorithm and uses selection data as input data.

It will be understood that supervised ML and machine-learned ranking (MLR) will use known algorithms that have been documented.

In some embodiments, the training of the machine model is performed for a large number of users and further refinement is performed based on monitored user selection parameters from a specific user. In such embodiments, an initial model is refined iteratively based on a single user’s behaviour. The personalised machine learning model parameters are then stored.
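As a non-limiting sketch, the refinement of an initial model for a specific user and the storing of the personalised parameters may be arranged as follows. The class name `PersonalisedRankers`, the representation of the parameters as a list of weights and the form of the update (a step along a supplied adjustment vector) are assumptions of this sketch.

```python
from typing import Dict, List, Sequence


class PersonalisedRankers:
    """Holds shared initial parameters plus per-user refinements."""

    def __init__(self, initial_weights: Sequence[float]) -> None:
        self._initial = list(initial_weights)        # trained over many users
        self._per_user: Dict[str, List[float]] = {}  # refined per user

    def weights_for(self, user_id: str) -> List[float]:
        """Return the user's own parameters, or the shared initial ones."""
        return list(self._per_user.get(user_id, self._initial))

    def refine(self, user_id: str, adjustment: Sequence[float], lr: float = 0.1) -> List[float]:
        """Apply one refinement step for a single user and store the result."""
        current = self.weights_for(user_id)
        updated = [w + lr * a for w, a in zip(current, adjustment)]
        self._per_user[user_id] = updated
        return updated
```

Refining the parameters of one user leaves the shared initial parameters, and those of every other user, unchanged.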

Personalisation of the search results is achieved using personal and persona data contained in the personal database. In addition, each user will have individual machine learning model parameters based on their reaction to the search results applied to the machine learned ranking. All personalisation data is held securely on the user-side and is encrypted.

By providing access to a richer personal dataset for each user, the search described herein may provide more relevant search results. The search also does not skew ranking towards sponsored content.

Known search engines (for example, DuckDuckGo) market themselves as a ‘non-tracking search engine’ and offer their customers a similar experience to Google, but without the privacy concerns. The present search method may offer customers a different type of privacy model, but one where data is not abused or sold.

In some embodiments, the present search eliminates advertising content from the search results and so the user experience is improved since it is both personalised and ad free.

As shown in Figure 8, in the present embodiment, the system for performing the search method has a user device (one user device for each user), a first computing resource, also referred to as a personal database server, for storing the personal database, and a second computing resource, also referred to as a search server. The search server is in communication with the personal database server and the user device, directly or via a network. The search server is also in communication with a plurality of information sources via the internet. The user device receives user input from a user at step 202 and provides the user input to the search server. The search server then performs processing steps including steps 207, 210, 212 and 220 to form secondary search queries. The search server then communicates the secondary search queries to the plurality of information sources via the internet and retrieves the sets of results from these sources. The search server then performs further processing steps on the sets of results, for example, steps 214 and 226. The final set of results (step 226) is sent from the search server to the user device and user selection parameters are monitored via user interaction with the user device (providing input for steps 228 and 216). The above system allows search requests for each user to originate from the search server rather than the personal computer of a user, which may reduce a risk of user identification. The user device has a display and a user input device, for example, a touchscreen or keyboard. The user device is configured to receive user input and display results of the second search. The user device is further configured to receive user input and/or user selection parameters in response to displaying results of the second search.

In some embodiments, a first set of data in the store of personal data is encrypted in accordance with a first encryption scheme, and a second set of data in the store of personal data is encrypted in accordance with a second encryption scheme. In some embodiments, the second encryption process is less secure than the first encryption process. The second, less secure, encryption scheme may be applied to blob data, such as images and/or video data. The first, more secure, encryption scheme may be applied to location and/or indexing data (which enables data to be attributed to a user).

It will be understood that different configurations of hardware may perform the described method of search. In some embodiments, the search server is provided at the same server as the personal database server. In some embodiments, more than one store of collected personal data is provided.

A skilled person will appreciate that variations of the described embodiments are possible without departing from the invention. Accordingly, the above description of the specific embodiments is made by way of example only and not for the purposes of limitation.

Claims

1. A method of searching comprising:

performing a first search;

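creating a search query for a second search based at least on the results of the first search, wherein one of the first search and second search is a search performed on a store of collected personal data and the other of the first search and second search is a search performed on at least one store of publicly available data.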

2. A method according to claim 1, wherein the first search is the search performed on the store of collected personal data and the second search is the search performed on the at least one store of publicly available data.

3. A method according to claim 2, wherein the first search returns personal data and/or associated metadata and/or associated contextual data stored in the store of collected personal data and the search query for the second search is created using at least part of the returned personal data and/or metadata and/or contextual data, the second search query thereby providing a personalised search of the store of publicly available data.

4. A method according to any preceding claim, wherein the personal data is accessible only by a user.

5. The method according to any preceding claim, wherein the search query for the second search is further based on user input used for performing the first search, optionally, wherein the user input comprises at least one of:

a search query used for the first search and/or selection of one or more user input parameters.

6. The method according to any preceding claim, further comprising:

processing the results returned by the first search to determine at least part of the search query for the second search.

7. The method according to any preceding claim, further comprising: identifying shared and/or overlapping information associated with at least two results from the first search and using at least part of the identified shared and/or overlapping information as part of the search query for the second search.

8. The method according to claim 7, wherein the shared and/or overlapping information comprises one or more of:

times, dates, locations, contextual data, activity type.

9. The method according to any preceding claim, wherein:

at least part of the store of collected data is encrypted in accordance with a searchable encryption scheme such that the search of the store of collected personal data comprises a search of encrypted data.

10. The method according to claim 9 wherein the method further comprises performing a decryption process on the results of the search of encrypted data and creating the search query using part of the decrypted search results.

11. The method according to any of claims 9 or 10, wherein the searchable encryption scheme comprises a homomorphic encryption scheme.

12. The method according to any of the preceding claims, wherein the created search query is created such that it is non-attributable to the user.

13. The method according to any preceding claim, wherein performing the first search comprises obtaining a first set of search results ordered using a first ranking model and/or performing the second search comprises obtaining a second set of search results ordered using a second ranking model.

14. The method according to claim 13 further comprising determining the model parameters for the first and/or second ranking model based on user selection data.

15. The method according to any preceding claim, further comprising monitoring and/or collecting user selection data representative of a selection of search results by one or more users and using said user selection data to update model parameters of one or more ranking models associated with the first and/or second searches.

16. The method according to claim 15 wherein the user selection data comprises at least one of:

search result selection data representative of a user selecting a particular search result;

search result visiting data representative of the time spent by a user browsing the search result.

17. The method according to any of the preceding claims further comprising determining model parameters for one or more ranking models for each user and storing the determined model parameters for each user.

18. The method according to any preceding claim further comprising performing the second search, wherein performing the second search comprises:

19. The method according to claim 18, wherein the search query comprises a plurality of adapted search queries, each adapted search query being adapted for the corresponding store of publicly available data to be searched.

20. The method according to any of claims 18 or 19, further comprising ordering the merged set of search results in accordance with a ranking model and updating the ranking model based on user selection data.

21. The method according to any of claims 18 to 20, wherein performing the merging process comprises at least one of:

performing a combining process on the plurality of sets of search results;

identifying identical, substantially similar and/or duplicated results and removing said identified, substantially similar and/or duplicated results;

filtering search results.

22. The method according to any preceding claim, wherein the one or more stores of the publicly available data comprises a search engine accessed via the internet.

23. The method according to any preceding claim, wherein the first search and/or second search is based on a selection and/or determination of a user profile and/or user intent such that the results returned by the first search and/or the second search are dependent on the selection and/or determination of user profile and/or user intent.

24. The method according to claim 23, wherein a part of the store of personal data that is searched is selected based on the user profile and/or user intent.

25. A system comprising:

a computing resource comprising a system and a memory resource in communication with a store of publicly available data and a store of collected personal data;

wherein the computing resource is configured to perform a first search and to create a search query for a second search based at least on the results of the first search, wherein one of the first search and second search is a search performed on a store of collected personal data and the other of the first search and second search is a search performed on at least one store of publicly available data.

26. A computer program product comprising a computer readable medium storing instructions that are executable to perform a method comprising:

performing a first search;

27. A method of monitoring user activity comprising: obtaining first location data representative of a first location of a user device and determining a dwell area surrounding the first location;

28. The method according to claim 27, wherein obtaining location data comprises performing one or more measurements using one or more sensors and/or selecting one or more location data sources.

29. The method according to any of claims 27 to 28, wherein the one or more sensors comprises at least one of: a GPS sensor; at least part of a RF transceiver for receiving network calls, messages or data or associated circuitry, an accelerometer.

30. The method according to any of claims 27 to 29 wherein the dwell area is defined by one or more parameters representative of a permitted distance, optionally, wherein the dwell area is a circle centred on a location measurement and the parameter representative of a permitted distance is a radius.

31. The method according to any of claims 27 to 30 wherein the method further comprises measuring a time spent by the user device within the dwell area and the determining that the user is moving or stationary is further based on a comparison of the measured time with a dwell time.

32. The method according to any of claims 27 to 31 wherein the first and/or second location data comprises first and/or second location measurements and further data representative of the accuracy of the first and second location measurements.

33. The method according to any of claims 27 to 32 further comprising iteratively obtaining second location data representative of the second location until an accuracy condition is satisfied.

34. The method according to claim 33, wherein the accuracy condition comprises one of:

a) the second location measurement is more accurate than an accuracy threshold;

35. The method according to any of claims 27 to 34, further comprising determining a mode of movement or transport using obtained location data and/or further data.

36. The method according to claim 35, wherein the further data comprises at least one of a) or b):

37. The method according to any of claims 35 or 36 wherein determining a mode of movement or transport further comprises performing a confirmation process, optionally wherein the confirmation process comprises calculating a speed of movement using location data and comparing the determined speed of movement with one or more typical values associated with the determined mode of movement or transport.

38. The method according to any of claims 35 to 37, wherein determining a mode of movement or transport comprises at least one of:

selecting the most likely mode of movement or transport from a number of different modes of movement or transport;

calculating a probability or likelihood that movement represented by the location data is one of a number of modes of movement and transport, optionally, wherein selecting the most likely mode is based on the calculated probability or likelihood.

39. The method according to any of claims 27 to 38, wherein obtaining second location data comprises requesting location data at a sampling frequency.

40. The method according to claim 39, wherein the sampling frequency is based on at least one of:

determining that the user is substantially moving;

determining that the user is substantially stationary;

a determined mode of transport;

a status of the user device, for example, a remaining battery life;

output from a processor or memory resource of the user device;

one or more user preferences.

41. The method according to any of claims 27 to 40, wherein obtaining first and/or second location data comprises selecting a requested accuracy and requesting location data representing measurements with an accuracy at least equal to or more accurate than the requested accuracy.

42. The method according to claim 41, wherein obtaining first and/or second location data at a requested accuracy comprises iteratively requesting location data from one or more location data sources or iteratively performing location measurements until the returned location data is equal to or more accurate than the requested accuracy.

43. The method according to any of claims 41 or 42, wherein obtaining first and/or second location data at a requested accuracy comprises switching from a first sensor or source of location data offering location measurements at a first accuracy or first average accuracy to a second sensor or source of location data offering location data at a second accuracy or second average accuracy.

44. The method according to any of claims 41 to 43, wherein the requested accuracy is in dependence on the user being substantially stationary or being substantially moving.

45. The method according to any of claims 27 to 44, wherein the method further comprises storing the first and second location data and/or the further data on one or more memory resources.

46. The method according to any of claims 27 to 45, wherein the method further comprises adding noise to at least part of the location data and optionally to the further data.

47. The method according to any of claims 27 to 46 further comprising performing an encryption process on at least part of the location data and/or further data.

48. The method according to any of claims 27 to 47, further comprising grouping the location data and further data into a first data group with high sensitivity and a second data group with lower sensitivity.

49. The method according to any of claims 27 to 48, further comprising encrypting the first data group using a first encryption process, and optionally encrypting the second data group using a second encryption process, wherein the second encryption process is less secure than the first encryption process.

50. A method of anonymizing personal data comprising:

51. A method of obtaining a location of a user device using a location module of the mobile device comprising:

requesting motion data with the active location module from one or more motion sensors of the mobile device; processing the motion data with the active location module to determine a motion state of the mobile device, wherein the motion state is either a moving state or a stationary state;

determining a current location using the obtained location data.

52. A method comprising:

53. The method according to claim 52, wherein generating an ordered representation of user movement and/or activities comprises processing the user activity data to establish movement elements and visit elements, wherein movement elements are representative of movement between at least two locations of the one or more locations and visit data representative of a user substantially remaining at a location dwell area of the one or more locations.

54. The method according to any of claims 52 to 53, wherein the user activity data comprises user movement data representative of movement between at least two locations of the one or more locations and visit data representative of a user substantially remaining at a location of the one or more locations.

55. The method according to any of claims 52 to 54, wherein the ordered representation comprises processing movement data to generate the movement elements and processing the user visiting data to generate the visit elements.

56. The method according to any of claims 52 to 55, wherein the ordered representation comprises an order based substantially on time ordering of the user activity data or a time ordering of the generated movement and visit elements.

57. The method according to any of claims 52 to 56, wherein the contextual information is retrieved from at least one of: an external server, a memory resource of the user device, a data capturing device of the user device, for example, a camera or a microphone.

58. The method according to any of claims 52 to 57, wherein generating an ordered representation of user movement and/or activities comprises at least one of:

processing the user activity data and the retrieved contextual information to generate an output in a natural language format;

processing the user activity data and the retrieved contextual information to generate an output in a geographical format;

processing the user activity data and the retrieved contextual information to generate an output in an album format;

processing the user activity data and the retrieved contextual information to generate a statistical representation of historical user activity.

59. The method according to any of claims 52 to 58, wherein the ordered representation is displayed using one or more of the following views:

a) natural language output, for example, in the form of a diary, wherein user activity data and the retrieved contextual information are presented as entries of the diary;

d) in a statistical format, wherein user activity data and the retrieved contextual information are presented as a statistical representation of historical user activity.

60. The method according to any of claims 52 to 59, wherein the user activity data is representative of user activity at a first spatial and/or temporal resolution and the ordered representation comprises processing or re-processing the user activity data to generate entries representative of user activity at a second spatial and/or temporal resolution.

61. The method according to any of claims 52 to 60, further comprising presenting the ordered representation at the second spatial and/or temporal resolution.

62. A method of organizing data comprising:

storing first and second groups in a database, thereby to provide filtering and/or retrieval of data based on their grouping.