WO2021090032A1

WO2021090032A1 - Data aggregation methods and systems

Info

Publication number: WO2021090032A1
Application number: PCT/GB2020/052834
Authority: WO
Inventors: Peter Ellen
Original assignee: Peter Ellen
Priority date: 2019-11-07
Filing date: 2020-11-09
Publication date: 2021-05-14
Also published as: GB201916230D0

Abstract

There is disclosed a computer-implemented method for aggregating data, the method including the steps of: (i) receiving a plurality of UID data sets, each UID data set including a unique identifier (UID); (ii) receiving publicly available data; (iii) processing the received publicly available data to assign data of the received publicly available data to the UID data sets, to assign data of the received publicly available data to appropriate UIDs, and including the assigned data of the received publicly available data in the UID data sets of the appropriate UIDs; (iv) receiving customer data sets from a client server; (v) processing the customer data sets, to identify co-occurrences between the customer data sets and the UID data sets; (vi) using the identified co-occurrences to assign respective UIDs to respective customer data sets; and (vii) including data from respective UID data sets into respective customer data sets, according to the identified co-occurrences, to generate updated customer data sets. Related methods, systems and computer program products are disclosed.

Description

DATA AGGREGATION METHODS AND SYSTEMS

BACKGROUND OF THE INVENTION

1. Field of the Invention

The field of the invention relates to data aggregation methods, systems and computer program products.

2. Technical Background

Achieving customer data unicity and intelligence is challenging. ‘Unicity’ is the fact of being or consisting of one; oneness.

Traditionally the unicity problem has been approached using a “single view of customer” technology which provides a centre point for each unique customer’s information. This data set and system is used by various business functions including management, analyst teams, customer service, operations and marketing, to ensure a consistent approach to customer experience. Long-standing technical solutions exist to solve the core of this problem within customer relationship management (CRM) functions of businesses and its deployment is particularly mature in sectors like banking and telecommunications. For example, technologies exist to create unification rules and to deploy “fuzzy logic”, to consider probable matches between two or more customer records. We call these methods “stitching” where one or more datasets are bound, then key values like frequency of purchase are subsequently calculated and aggregated across data silos. Even in traditional businesses, deployment projects are typically long and expensive, requiring complex logic and data plumbing between applications.
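The "fuzzy logic" matching described above can be illustrated with a minimal sketch. It assumes two records are compared on hypothetical `name` and `city` fields using Python's standard `difflib.SequenceMatcher`; the field names, the averaging rule and the 0.85 threshold are illustrative assumptions, not part of any particular CRM product.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Return a 0..1 similarity ratio between two normalised strings."""
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


def probable_match(rec_a: dict, rec_b: dict, threshold: float = 0.85) -> bool:
    """Treat two records as a probable match if name and city are similar enough."""
    name_score = similarity(rec_a["name"], rec_b["name"])
    city_score = similarity(rec_a["city"], rec_b["city"])
    # Assumption: a plain average of the two field scores is compared to the threshold.
    return (name_score + city_score) / 2 >= threshold


# Hypothetical records held in three separate data silos.
crm = {"name": "Jonathan Smith", "city": "London"}
web = {"name": "Jonathon Smith", "city": "london"}
other = {"name": "Maria Garcia", "city": "Madrid"}

crm_matches_web = probable_match(crm, web)
crm_matches_other = probable_match(crm, other)
```

Every unification rule of this kind has to be written and tuned per pair of silos, which is one reason such projects tend to be long and expensive.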

What is needed is an improved method of achieving customer data unicity.

3. Discussion of Related Art

EP3096258B1 and EP3096258A1 disclose that a system for anonymizing and aggregating protected information from a plurality of data sources includes a master index server coupled to a data repository. The master index server receives an anonymized record associated with an individual from a plurality of data hashing appliances. The system includes a cluster matching engine that applies a plurality of rules to hashed data elements of the received record for comparing hashed data elements of the record with hashed data elements of a plurality of clusters of anonymized records associated with different individuals stored in the data repository to determine whether the individual associated with the received record corresponds to an individual associated with one of the clusters of anonymized records. When a match is found, the cluster matching engine adds the received record to the cluster of anonymized records associated with that individual. EP3096258B1 and EP3096258A1 disclose prior art Figure 2.

EP2485430B1 and EP2485430A2 disclose that a private stream aggregation (PSA) system contributes a user's data to a data aggregator without compromising the user's privacy. The system can begin by determining (302) a private key for a local user in a set of users, wherein the sum of the private keys associated with the set of users and the data aggregator is equal to zero. The system also selects a set of data values associated with the local user. Then, the system encrypts individual data values in the set based in part on the private key to produce a set of encrypted data values, thereby allowing the data aggregator to decrypt an aggregate value across the set of users without decrypting individual data values associated with the set of users, and without interacting with the set of users while decrypting the aggregate value. The system also sends (308) the set of encrypted data values to the data aggregator.
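The sum-to-zero property of the private keys can be illustrated with a toy additive masking example. This is a simplified sketch under stated assumptions, not the encryption scheme of the cited publications: the modulus, the function names and the sample values are all hypothetical, and plain modular addition stands in for the actual encryption.

```python
import secrets

MODULUS = 2**61 - 1  # hypothetical modulus for the toy arithmetic


def make_keys(n_users):
    """Draw random user keys, plus an aggregator key so that all keys sum to zero mod MODULUS."""
    user_keys = [secrets.randbelow(MODULUS) for _ in range(n_users)]
    aggregator_key = (-sum(user_keys)) % MODULUS
    return user_keys, aggregator_key


def mask(value, key):
    """Hide a single user's value by adding that user's key (toy stand-in for encryption)."""
    return (value + key) % MODULUS


def aggregate(masked_values, aggregator_key):
    """Recover only the sum: the keys cancel because they add up to zero."""
    return (sum(masked_values) + aggregator_key) % MODULUS


values = [12, 7, 30]  # hypothetical per-user data values
user_keys, aggregator_key = make_keys(len(values))
masked = [mask(v, k) for v, k in zip(values, user_keys)]
total = aggregate(masked, aggregator_key)
```

In this sketch the aggregator obtains the total of the users' values without needing any individual user's key, which mirrors the property described for the cited PSA system.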
EP2485430B1 and EP2485430A2 disclose prior art Figure 3.

SUMMARY OF THE INVENTION

According to a first aspect of the invention, there is provided a computer-implemented method for aggregating data, the method including the steps of:

(i) receiving a plurality of UID data sets, each UID data set including a unique identifier (UID);

(ii) receiving publicly available data;

(iii) processing the received publicly available data to assign data of the received publicly available data to the UID data sets, to assign data of the received publicly available data to appropriate UIDs, and including the assigned data of the received publicly available data in the UID data sets of the appropriate UIDs;

(iv) receiving customer data sets from a client server;

(v) processing the customer data sets, to identify co-occurrences between the customer data sets and the UID data sets;

(vi) using the identified co-occurrences to assign respective UIDs to respective customer data sets; and

(vii) including data from respective UID data sets into respective customer data sets, according to the identified co-occurrences, to generate updated customer data sets.
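Steps (i) to (vii) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the claimed implementation: data sets are plain dictionaries, public data is assigned to a UID by exact name match, a co-occurrence is counted as an attribute holding the same value in both data sets, and the function names, field names and the `min_shared` threshold are all hypothetical.

```python
def assign_public_data(uid_sets, public_rows):
    """Step (iii): add each public row to the UID data set with the same name."""
    by_name = {uid_set["name"]: uid_set for uid_set in uid_sets}
    for row in public_rows:
        target = by_name.get(row["name"])
        if target is not None:
            target.update(row)


def shared_values(customer, uid_set, keys=("name", "city", "gender")):
    """Step (v): count attributes that hold the same value in both data sets."""
    return sum(1 for k in keys if k in customer and customer[k] == uid_set.get(k))


def update_customers(customers, uid_sets, min_shared=2):
    """Steps (vi) and (vii): assign the best-matching UID and include its data."""
    updated = []
    for customer in customers:
        best = max(uid_sets, key=lambda s: shared_values(customer, s))
        if shared_values(customer, best) >= min_shared:
            # Customer values take precedence; the UID and public data are added.
            updated.append({**best, **customer})
        else:
            updated.append(dict(customer))  # no sufficiently strong co-occurrence
    return updated


# Steps (i), (ii) and (iv): hypothetical received data.
uid_sets = [{"uid": "UID-001", "name": "Ada Example"},
            {"uid": "UID-002", "name": "Bob Sample"}]
public_rows = [{"name": "Ada Example", "city": "Leeds", "gender": "F"},
               {"name": "Bob Sample", "city": "York", "gender": "M"}]
customers = [{"name": "Ada Example", "city": "Leeds", "email": "ada@example.com"},
             {"name": "Unknown Person", "city": "Leeds"}]

assign_public_data(uid_sets, public_rows)
updated = update_customers(customers, uid_sets)
```

In this sketch the first customer data set gains a UID and a gender value that it did not previously hold, while the second is left unchanged because only one attribute co-occurs.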

An advantage is that updated customer data sets are generated, the updated customer data sets being more reliable and more complete than the previous customer data sets, because the updated customer data sets include data that has been obtained from a plurality of sources. An advantage is that the updated customer data sets have improved consistency and accuracy, compared to the previous customer data sets. An advantage is that the proprietary logic held on a client server is not required. An advantage is that the updated customer data sets being more reliable and more complete than the previous customer data sets is achieved more rapidly than would be expected using processing on the client server. An advantage is that updated customer data sets are generated at lower cost.

The method may be one wherein step (i) includes generating the plurality of UID data sets. The method may be one wherein step (i) includes storing the plurality of UID data sets.

The method may be one including the step of: (viii) storing the updated customer data sets.

The method may be one wherein step (iii) includes storing the UID data sets, including the assigned data of the received publicly available data in the UID data sets of the appropriate UIDs.

The method may be one wherein including data from respective UID data sets into respective customer data sets, according to the identified co-occurrences, comprises including data from respective individual UID data sets into respective individual customer data sets, according to the identified co-occurrences.

The method may be one wherein if two or more different customer data sets have been assigned an identical UID, then the two or more different customer data sets are unified into a single customer data set, including the identical UID. An advantage is that a unified single customer data set can be obtained, which is more reliable and more complete than the previous customer data sets. An advantage is that the updated customer data sets have improved consistency and accuracy, compared to the previous customer data sets. An advantage is that the proprietary logic held on a client server is not required. An advantage is that the updated customer data sets being more reliable and more complete than the previous customer data sets is achieved more rapidly than would be expected using processing on the client server. An advantage is that updated customer data sets are generated at lower cost.
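The unification of customer data sets that share an identical UID can be sketched as a group-and-merge operation. The sketch assumes each customer data set is a dictionary carrying a hypothetical `uid` key, and that when two data sets disagree on a field the first value seen is kept; that conflict rule is an assumption made for illustration.

```python
from collections import defaultdict


def unify_by_uid(customer_sets):
    """Merge all customer data sets that were assigned the same UID."""
    grouped = defaultdict(dict)
    for record in customer_sets:
        merged = grouped[record["uid"]]
        for key, value in record.items():
            # Assumption: the first value seen for a field wins on conflict.
            merged.setdefault(key, value)
    return list(grouped.values())


# Hypothetical customer data sets after UID assignment.
customer_sets = [
    {"uid": "UID-001", "email": "a@example.com"},
    {"uid": "UID-001", "phone": "0123"},
    {"uid": "UID-002", "email": "b@example.com"},
]
unified = unify_by_uid(customer_sets)
```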

The method may be one wherein a UID is generated for each member of some of the world’s population. An advantage is that many different customer datasets can be processed.

The method may be one wherein a UID is generated for each member of most of the world’s population. An advantage is that many different customer datasets can be processed.

The method may be one wherein a UID is generated for each member of all of the world’s population. An advantage is that many different customer datasets can be processed.

The method may be one wherein a UID is generated for each member of some of a territory’s population. An advantage is that many different customer datasets can be processed.

The method may be one wherein a UID is generated for each member of most of a territory’s population. An advantage is that many different customer datasets can be processed.

The method may be one wherein a UID is generated for each member of all of a territory’s population. An advantage is that many different customer datasets can be processed.

The method may be one wherein the UID data sets include standard code blocks.

The method may be one wherein the standard code blocks have a sequence.

The method may be one wherein the sequence includes co-occurrence data. An advantage is that the UID data sets include data which can be used to speed up execution of the method, and/or to improve the reliability of the updated customer data sets.

The method may be one wherein the standard code blocks include co-occurrence data. An advantage is that the UID data sets include data which can be used to speed up execution of the method, and/or to improve the reliability of the updated customer data sets.

The method may be one wherein the UID data sets include co-occurrence data. An advantage is that the UID data sets include data which can be used to speed up execution of the method, and/or to improve the reliability of the updated customer data sets.

The method may be one wherein the publicly available data includes person names.

The method may be one wherein the publicly available data includes a person’s gender.

The method may be one wherein the publicly available data includes a person’s city of residence.

The method may be one wherein the publicly available data includes publicly available data sets. An advantage is that large data sets can be used, to improve the reliability of the updated customer data sets.

The method may be one wherein the data from the publicly available data sets is assigned to the UIDs, and is then encoded into the UID data sets. An advantage is that the UID data sets include data which can be used to speed up execution of the method.

The method may be one wherein the data from the publicly available data sets is assigned to the UIDs, using distributions across the publicly available data sets, and is then encoded into the UID data sets.

The method may be one wherein the distributions are achieved using machine learning and intelligence including one or more, or all, of: co-occurrence matrices, convolutional analysis, and graph based node analysis.
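Of the techniques listed, a co-occurrence matrix is the simplest to illustrate. The following sketch counts how often pairs of attribute values appear together across records; the sparse `Counter` representation, the `field=value` labelling and the sample records are assumptions made for illustration only.

```python
from collections import Counter
from itertools import combinations


def co_occurrence_counts(records):
    """Count how often each pair of attribute values appears in the same record."""
    counts = Counter()
    for record in records:
        items = sorted(f"{field}={value}" for field, value in record.items())
        for a, b in combinations(items, 2):
            counts[(a, b)] += 1  # sparse form of a symmetric co-occurrence matrix
    return counts


# Hypothetical publicly available records.
public_records = [
    {"name": "Ada", "city": "Leeds"},
    {"name": "Ada", "city": "Leeds"},
    {"name": "Ada", "city": "York"},
]
counts = co_occurrence_counts(public_records)
```

Normalising such counts gives an empirical distribution over value pairs, which is one way the distributions across publicly available data sets could be estimated.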

The method may be one wherein identifying a co-occurrence includes satisfying a confidence threshold, or wherein identifying a co-occurrence includes satisfying a probability threshold.
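A confidence threshold of this kind can be sketched as a weighted agreement score compared against a cut-off. The weights, the threshold value and the field names below are hypothetical; the source does not specify how confidence is computed.

```python
# Hypothetical weights expressing how strongly each attribute indicates identity.
WEIGHTS = {"name": 5, "city": 3, "gender": 2}
THRESHOLD = 0.7  # hypothetical confidence threshold


def match_confidence(customer, uid_set, weights=WEIGHTS):
    """Return the weighted share of attributes on which the two data sets agree."""
    total = sum(weights.values())
    agreed = sum(w for field, w in weights.items()
                 if field in customer and customer[field] == uid_set.get(field))
    return agreed / total


def is_co_occurrence(customer, uid_set, threshold=THRESHOLD):
    """A co-occurrence is identified only if the confidence meets the threshold."""
    return match_confidence(customer, uid_set) >= threshold


uid_set = {"uid": "UID-001", "name": "Ada Example", "city": "Leeds", "gender": "F"}
strong = {"name": "Ada Example", "city": "Leeds"}  # agrees on name and city
weak = {"city": "Leeds", "gender": "F"}            # agrees on city and gender only

strong_ok = is_co_occurrence(strong, uid_set)
weak_ok = is_co_occurrence(weak, uid_set)
```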

The method may be one wherein the UID data sets are updated with data relating to identified co-occurrences. An advantage is that the UID data sets include data which can be used to speed up execution of the method, and/or to improve the reliability of further updated customer data sets.

The method may be one wherein the updated UID data sets are stored.

The method may be one wherein a customer data set is a customer data set of a bank.

The method may be one wherein a customer data set is a medical history customer data set. An advantage is that a clinician or analyst may use an updated customer data set to derive more powerful insights on a patient’s condition. An advantage is that a clinician or analyst may use an updated customer data set to derive a more complete patient analysis of a condition over time. An advantage is that a health care organization may use an updated customer data set to understand their patients’ medical care requirements across providers and different record keeping systems. An advantage is that unified transmission surveillance of entities for targeted disease control measures may be enabled.

The method may be one wherein a customer data set is an IOT device or things data set. An advantage is that a service provider can better understand their customer requirements in relation to IOT devices or things.

The method may be one wherein a customer data set includes a plurality of customer data sets.

The method may be one wherein a customer data set includes a plurality of customer data sets, in retail or financial services. An advantage is that a retail or financial services company may be able to unify and maintain a single view of a customer.

The method may be one wherein a customer data set includes a plurality of medical history customer data sets, e.g. genetic and biometric records, or to records for patient monitoring technologies for different products or substances. An advantage is that a clinician or analyst may use an updated customer data set to derive more powerful insights on a patient’s condition. An advantage is that a clinician or analyst may use an updated customer data set to derive a more complete patient analysis of a condition over time.

The method may be one wherein a customer data set includes a plurality of customer data sets, including two or more, or all, of: a mobile global positioning system (GPS) data aggregator set, a medical testing results data set, a contact tracing operations data set, and an exposure notification application data set. An advantage is that unified transmission surveillance of entities for targeted disease control measures may be enabled.

The method may be one wherein a customer data set includes a plurality of digital advertising data sets for single consumers. An advantage is that updated customer data sets allow media buyers to develop unique consumer focused attribution models by understanding the multi-channel responses of single consumers.

The method may be one wherein a customer data set includes a plurality of digital rights management systems collected data on consumption and/or use of content. An advantage is that the method can be used to associate the customer data sets with a UID for each rights owner in order to consolidate consumption data.

The method may be one further including delivering a maintenance service to clients which provides refreshed and iterated UIDs as new learnings become available and data sets evolve. An advantage is that clients receive further improved customer data sets.

The method may be one further including providing a “unicity” service which unifies disparate (e.g. customer) data sets.

The method may be one wherein the UID is dynamic and secure, encoding useful information to power consumers’ experiences anywhere.

The method may be one wherein the UIDs are generated with capacity for foreseeable population change.
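As a worked example of sizing an identifier space, the sketch below computes how many bits a UID needs in order to cover a population with headroom for growth. The population figure and the headroom factor are assumptions chosen for illustration; the source does not specify either.

```python
def uid_bits(population, headroom_factor=2):
    """Return the smallest number of bits that can index population * headroom_factor UIDs."""
    capacity_needed = population * headroom_factor
    # (n - 1).bit_length() equals ceil(log2(n)) for any integer n >= 1.
    return (capacity_needed - 1).bit_length()


WORLD_POPULATION = 8_000_000_000  # assumed round figure
bits = uid_bits(WORLD_POPULATION)  # room for twice the assumed population
capacity = 2 ** bits
```

Under these assumptions 34 bits suffice, since 2^34 is roughly 17.2 billion while 2^33 is only about 8.6 billion.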

The method may be one wherein UID generation replaces the need for an organisation’s proprietary logic and dramatically increases the speed to achieve unicity of disparate data silos.

The method may be one wherein the same learning process is used to identify co-occurrence between clients’ data sets and UIDs, as is used to assign publicly available data to the UIDs using distributions across the publicly available data sets.

The method may be one wherein the method is repeated to self-learn from observations that enable faster and more accurate unicity services and to understand the relationships between data sources.

The method may be one wherein clients can use an interface to configure data values and triggers, including direct, derived, bucketed and modelled information to be encoded from one or more data sources into the UID sequence.
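A client configuration of direct and bucketed values might look like the following sketch. The configuration format, the bucket edges, the code labels and the field names are all hypothetical; the source does not define how values are encoded into the UID sequence.

```python
from bisect import bisect_right

# Hypothetical client configuration: one bucketed value and one direct value.
CONFIG = {
    "purchases_per_year": {"kind": "bucketed",
                           "edges": [1, 5, 20],
                           "codes": ["P0", "P1", "P2", "P3"]},
    "city": {"kind": "direct"},
}


def encode_blocks(record, config=CONFIG):
    """Turn configured fields of a record into a list of code blocks."""
    blocks = []
    for field, rule in config.items():
        if field not in record:
            continue
        if rule["kind"] == "bucketed":
            # bisect_right picks the bucket: values below edges[0] fall in bucket 0.
            index = bisect_right(rule["edges"], record[field])
            blocks.append(rule["codes"][index])
        else:
            blocks.append(f"{field}:{record[field]}")
    return blocks


blocks = encode_blocks({"purchases_per_year": 7, "city": "Leeds"})
```

Bucketing keeps the encoded value coarse, so a destination technology receives "between 5 and 19 purchases per year" rather than the exact figure.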

The method may be one including mapping a received request into the UID code sequence.

The method may be one wherein the clients’ specified destination technologies then have exclusive access through micro services to encoded Unicity Intelligence and appropriate public data as a service.

The method may be one wherein the specified destination technologies hold encryption keys to decode their specified Unicity Intelligence.

The method may be one wherein as new data collection and interaction layers are added, they can be added easily by connecting to the Unicity Service.

The method may be one wherein any single appropriate data set can be probabilistically matched using a UID service, where the probability of the matches to a UID increases with use as the system and/or method understands greater levels of co-occurrence in data variables for each UID, with increasing use of the UID service.
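The idea that matching improves with use can be illustrated with a small sketch in which the service remembers the attribute values it has already associated with each UID. The class name, the scoring rule (the share of a record's attribute values already seen for that UID) and the sample data are assumptions; the score is a simple stand-in for a match probability, not a calibrated one.

```python
class UidService:
    """Toy UID service that learns attribute values per UID as it is used."""

    def __init__(self):
        self.known = {}  # uid -> set of (field, value) pairs observed so far

    def score(self, uid, record):
        """Share of the record's attribute values already associated with the UID."""
        pairs = set(record.items())
        if not pairs:
            return 0.0
        return len(pairs & self.known.get(uid, set())) / len(pairs)

    def learn(self, uid, record):
        """Store the record's attribute values against the UID after a match."""
        self.known.setdefault(uid, set()).update(record.items())


service = UidService()
service.learn("UID-001", {"name": "Ada Example", "city": "Leeds"})

later_record = {"name": "Ada Example", "email": "ada@example.com", "phone": "0123"}
score_before = service.score("UID-001", later_record)

# A further matched data set teaches the service the e-mail address.
service.learn("UID-001", {"name": "Ada Example", "email": "ada@example.com"})
score_after = service.score("UID-001", later_record)
```

In this sketch the same record scores higher against the UID after the service has processed one more matched data set, because more of its variables are now known to co-occur with that UID.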

According to a second aspect of the invention, there is provided a computer program product, the computer program product executable on a processor to:

(i) receive a plurality of UID data sets, each UID data set including a unique identifier (UID);

(ii) receive publicly available data;

(iii) process the received publicly available data to assign data of the received publicly available data to the UID data sets, to assign data of the received publicly available data to appropriate UIDs, and including the assigned data of the received publicly available data in the UID data sets of the appropriate UIDs;

(iv) receive customer data sets from a client server;

(v) process the customer data sets, to identify co-occurrences between the customer data sets and the UID data sets;

(vi) use the identified co-occurrences to assign respective UIDs to respective customer data sets; and

(vii) include data from respective UID data sets into respective customer data sets, according to the identified co-occurrences, to generate updated customer data sets.

The computer program product may be further executable on the processor to perform a method of any aspect of the first aspect of the invention.

According to a third aspect of the invention, there is provided a computer system, the computer system configured to:

(ii) receive publicly available data;

(iv) receive customer data sets from a client server;

(v) process the customer data sets, to identify co-occurrences between the customer data sets and the UID data sets;

The computer system may be further configured to perform a method of any aspect of the first aspect of the invention.

According to a fourth aspect of the invention, there is provided a method of deriving a unified medical data set, the method including the steps of:

(ii) receiving publicly available data;

(iv) receiving customer data sets, wherein the customer data sets include a plurality of medical data sets, for example genetic records, and biometric records;

(vii) including data from respective UID data sets into respective customer data sets, according to the identified co-occurrences, to generate updated customer data sets, wherein the plurality of medical data sets are unified into a unified medical data set, when a common UID is assigned to two or more of the plurality of medical data sets.

An advantage is that a clinician or analyst is able to unify previously disparate sets of patient data, where a common UID is assigned to two or more of those data sets by the method. An advantage is that the clinician or analyst is enabled to derive more powerful insights on a patient’s condition. An advantage is that patient safety is improved.

The method may be computer-implemented.

The method may be one including a method of any aspect of the first aspect of the invention.

The method may be one including the further step of a clinician or analyst deriving improved insight into a patient’s condition, using the unified medical data set.

According to a fifth aspect of the invention, there is provided a computer program product, the computer program product executable on a processor to perform a method of any aspect of the fourth aspect of the invention.

According to a sixth aspect of the invention, there is provided a computer system, the computer system configured to perform a method of any aspect of the fourth aspect of the invention.

According to a seventh aspect of the invention, there is provided a method of deriving a unified medical data set, the method including the steps of:

(ii) receiving publicly available data;

(iv) receiving customer data sets, wherein the customer data sets include a plurality of medical data sets, for example patient responses to a plurality of pharmaceutical products or substances;

(v) processing the customer data sets, to identify co-occurrences between the customer data sets and the UID data sets;

(vi) using the identified co-occurrences to assign respective UIDs to respective customer data sets; and

An advantage is that an analyst can use the method results of mapping the discrete data sets, relating to products or substances, to a single UID for a data subject or patient, for the purpose of deriving a more complete patient analysis of a condition over time. An advantage is that the patient analysis can include responses to multiple vaccines or therapies. An advantage is that patient safety is improved.

The method may be computer-implemented.

The method may be one including the further step of, using the unified medical data set, a clinician or analyst deriving a more complete patient analysis of a condition over time, for example including responses to multiple vaccines or therapies.

According to an eighth aspect of the invention, there is provided a computer program product, the computer program product executable on a processor to perform a method of any aspect of the seventh aspect of the invention.

According to a ninth aspect of the invention, there is provided a computer system, the computer system configured to perform a method of any aspect of the seventh aspect of the invention.

According to a tenth aspect of the invention, there is provided a method of deriving a unified medical data set, the method including the steps of:

(ii) receiving publicly available data;

(iv) receiving customer data sets, wherein the customer data sets include a plurality of medical data sets, for example from multiple care and service providers;

An advantage is that this can provide improved safety for patients, because their aggregated data is held in a unified medical data set, so that important medical data is not unavailable at an important time.

The method may be computer-implemented.

The method may be one including the further step of a health care organization understanding their patients’ medical care requirements across the multiple care and service providers, using the unified medical data set.

According to an eleventh aspect of the invention, there is provided a computer program product, the computer program product executable on a processor to perform a method of any aspect of the tenth aspect of the invention.

According to a twelfth aspect of the invention, there is provided a computer system, the computer system configured to perform a method of any aspect of the tenth aspect of the invention.

According to a thirteenth aspect of the invention, there is provided a method of deriving a unified disease control data set, the method including the steps of:

(ii) receiving publicly available data;

(iv) receiving customer data sets, wherein the customer data sets include a plurality of data sets, for example including two or more, or all, of: a mobile global positioning system (GPS) data aggregator set, a medical testing results data set, a contact tracing operations data set, and an exposure notification application data set;

(vii) including data from respective UID data sets into respective customer data sets, according to the identified co-occurrences, to generate updated customer data sets, wherein the plurality of data sets are unified into a unified disease control data set, when a common UID is assigned to two or more of the plurality of data sets.

An advantage is that unified transmission surveillance of entities for targeted disease control measures can be performed.

The method may be computer-implemented.

The method may be one including the further step of a disease control clinician or analyst obtaining unified transmission surveillance of entities for targeted disease control measures.

According to a fourteenth aspect of the invention, there is provided a computer program product, the computer program product executable on a processor to perform a method of any aspect of the thirteenth aspect of the invention.

According to a fifteenth aspect of the invention, there is provided a computer system, the computer system configured to perform a method of any aspect of the thirteenth aspect of the invention.

According to a sixteenth aspect of the invention, there is provided a method of deriving a unified digital advertising data set, the method including the steps of:

(ii) receiving publicly available data;

(iv) receiving customer data sets, wherein the customer data sets include a plurality of digital advertising data sets, for example data sets including responses for single consumers for advertising media interactions on multiple digital applications, distribution platforms or devices;

(vii) including data from respective UID data sets into respective customer data sets, according to the identified co-occurrences, to generate updated customer data sets, wherein the plurality of digital advertising data sets are unified into a unified digital advertising data set, when a common UID is assigned to two or more of the plurality of digital advertising data sets.

An advantage is that media buyers can develop unique consumer focused attribution models by understanding the multi-channel responses of single consumers.

The method may be computer-implemented.

The method may include the further step of media buyers developing unique consumer focused attribution models by understanding the multi-channel responses of single consumers.

According to a seventeenth aspect of the invention, there is provided a computer program product, the computer program product executable on a processor to perform a method of any aspect of the sixteenth aspect of the invention.

According to an eighteenth aspect of the invention, there is provided a computer system, the computer system configured to perform a method of any aspect of the sixteenth aspect of the invention.

According to a nineteenth aspect of the invention, there is provided a method of deriving a unified retail or financial services data set, the method including the steps of:

(ii) receiving publicly available data;

(iv) receiving customer data sets, wherein the customer data sets include a plurality of retail or financial services data sets, for example from service channels such as stores, branches, contact centers, apps and web channels;

(vii) including data from respective UID data sets into respective customer data sets, according to the identified co-occurrences, to generate updated customer data sets, wherein the plurality of retail or financial services data sets are unified into a unified retail or financial services data set, when a common UID is assigned to two or more of the plurality of retail or financial services data sets.

An advantage is that the retail or financial services company is able to use the method to assign a UID to discrete data sets from each channel and/or system, in order to unify and maintain a single view of a customer.

The method may be computer-implemented.

The method may include a method of any aspect of the first aspect of the invention.

The method may include the further step of a retail or financial services company using the method to maintain a single view of a customer, using the unified retail or financial data set.

According to a twentieth aspect of the invention, there is provided a computer program product, the computer program product executable on a processor to perform a method of any aspect of the nineteenth aspect of the invention.

According to a twenty-first aspect of the invention, there is provided a computer system, the computer system configured to perform a method of any aspect of the nineteenth aspect of the invention.

Aspects of the invention may be combined.

A computer program product may be embodied on a non-transitory storage medium.

BRIEF DESCRIPTION OF THE FIGURES

Aspects of the invention will now be described, by way of example(s), with reference to the following Figures, in which:

Figure 1 shows an example of a system for providing a unified view of customers.

Figure 2 is a block diagram of an environment in which a system for anonymizing and aggregating protected health information (PHI) may operate, according to prior art publications EP3096258B1 and EP3096258A1.

Figure 3 presents a flow chart illustrating a process performed by a data aggregator for determining an aggregate value from encrypted data provided by a set of participants, which is from prior art publications EP2485430B1 and EP2485430A2.

DETAILED DESCRIPTION

This Description includes a description of unifying customer experience for the internet of things (IOT) age. This Description includes a description of unifying customer experience in the age of connected things.

Whether they are being recognised by a local bartender or by a bank clerk, customers implicitly and explicitly attribute value to places where their requirements are not just recognised, but understood. To the sin of offering a vodka shot to a real ale drinker, a wheat beer drinker or a lager drinker, you can add failing to recognise a customer’s opt-in preferences and contacting them without consent.

For a multi-channel business, understanding of a consumer’s requirements isn’t achieved by verifying their identity. It requires a unicity of on-hand information so that appropriate experiences can be delivered anywhere the customer chooses to interact.

The proliferation of digitised consumer channels creates complex challenges for how people and technology support the information for each consumer. The problem includes two major challenges:

1. Establishing methods for consumers to recognise and verify their identity across multiple touch points so that they can access services whilst safeguarding their data security and privacy;

2. Establishing methods for organisations to understand consumers and provide relevant and consistent experiences for the purpose of satisfying demand.

The first challenge is addressed by Identity Verification technologies, which include evolutions of security-focused technologies such as web-based single sign-on. These technologies are researched by collaborative groups of technologists and are served by individual providers.

Here we are concerned with the second challenge, of customer understanding, which can only be consistently achieved by customer data unicity and intelligence. Here we consider:

• How organisations traditionally solved the problem;

• How technological and behavioural changes impact the viability of solutions available;

• How a novel service addresses current and future requirements for data unicity and intelligence.

The Unicity: A State of Oneness, and the Problem

Without a unified view of customers, businesses struggle to understand and consistently serve their true requirements, behaviours and preferences. This compromises many functions including commercial performance, privacy management, product management, procurement, operations and customer satisfaction.

Delivering a consistent experience is desirable.

What’s New

In the modern era of business, the unicity challenge is becoming significantly more difficult: a dramatic proliferation of connected “things” promises more complex consumer behaviour and more diverse data collection systems. The stitching process is increasingly costly and unmanageable as the complexity and dynamism of data accelerates. Recruiting and managing skilled data engineers to keep up is often impractical as complexity dramatically out-paces skills availability. As a result, organisations face significant problems with partially unified data resulting in unsatisfactory outcomes.

What is accelerating the complexity of stitching:

• Data collection capabilities continue to dramatically increase the volumes of data;

• Rapid deflation in unit storage and processing costs increases the availability of data;

• The proliferation of connected “things” creates a proliferation of disparate customer identification codes;

• Each system includes discrete logic around the creation and maintenance of identification mechanisms, such as cookies, account numbers or encrypted identification numbers/codes;

• IOT based data sources promise increased device sharing in households, transport and places of work; for example you might have discrete access to a mobile phone, and shared access to a self-driving vehicle;

• Consumers’ journeys are being dramatically changed and intermediated by interaction with numerous touch-points including those within the direct and indirect control of organisations.

Only top global companies have the engineering resources to tackle the problem, and often they disintermediate smaller businesses’ customer relationships at a cost to margin.

The Unicity Solution

In an example, our solution uses a novel approach and design to provide tightly integrated client services, enabling clients to leverage best in class technology.

In an example, there is provided a system for providing a unified view of customers. In an example, ingestion services provide public data as an input to the system. In an example, submission services provide client data as an input to the system. The data input to the system is received and processed by a data ingestion sub-system. The processed data is passed to a smart mapping engine for identity (ID) resolution, where it is processed to provide a resolved identity. The data, now including the resolved identity, is passed to a UID sequence management module, which processes it to provide sequenced data including a unique identifier (UID). There is provided system management, and user interfaces, for the system. The output from the UID sequence management module is made available for processing by client UID management provisioning. The output from the client UID management provisioning processing is made available for distributed service management. In an example, the distributed service management provides system output for client UID services. In an example, the distributed service management provides system output for client intelligence services. An example system for providing a unified view of customers is shown in Figure 1, in schematic form.
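
The data flow described above can be sketched as a chain of stages. The following Python sketch is illustrative only: the class and function names, the record fields and the use of a normalised name and city as the resolved identity are assumptions made for the sketch, and are not taken from this Description or from Figure 1.

```python
from dataclasses import dataclass
from itertools import count
from typing import Optional


@dataclass
class Record:
    source: str                       # "public" or "client"
    attributes: dict
    resolved_identity: Optional[str] = None
    uid: Optional[str] = None


def ingest(raw_rows, source):
    """Data ingestion stage: normalise field names and values."""
    return [
        Record(source, {k.strip().lower(): str(v).strip().lower() for k, v in row.items()})
        for row in raw_rows
    ]


def resolve_identity(records):
    """Stand-in for the smart mapping engine (assumed key: name + city)."""
    for r in records:
        r.resolved_identity = f"{r.attributes.get('name', '')}|{r.attributes.get('city', '')}"
    return records


class UidSequenceManager:
    """Stand-in for the UID sequence management module."""

    def __init__(self):
        self._next = count(1)
        self._by_identity = {}

    def assign(self, records):
        for r in records:
            if r.resolved_identity not in self._by_identity:
                self._by_identity[r.resolved_identity] = f"UID-{next(self._next):06d}"
            r.uid = self._by_identity[r.resolved_identity]
        return records


manager = UidSequenceManager()
public = manager.assign(resolve_identity(ingest(
    [{"Name": "Ada Lovelace", "City": "London"}], "public")))
client = manager.assign(resolve_identity(ingest(
    [{"name": " ada lovelace ", "city": "LONDON", "basket": 3}], "client")))
```

In this sketch the public record and the client record resolve to the same identity, so both receive the same UID. A real identity resolution stage would be probabilistic rather than an exact key comparison.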

In an example, there is provided a system and/or a method for assigning unique and immutable identifying code to discrete data sets where the code (e.g. a unique identifier) relates to a single data subject (e.g. a person) or entity.

In an example, a UID system and/or method differs from the traditional merging or de-duping of data sets commonly found in customer relationship data solutions. The traditional method involves deciding whether two or more data sets can be assigned the same identifier; if they can, the same identifier is compiled with the two or more data sets. However, in an example, in the case of a UID system and/or method, any single appropriate data set can be probabilistically matched using a UID service, where the probability of the matches to a UID increases with increasing use of the UID service, as the system and/or method understands greater levels of co-occurrence in data variables for each UID. Prior to this initial matching, the UIDs are pre-created and have already been assigned to entities and ingested data sets.
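
The difference from pairwise de-duping can be illustrated with a minimal Python sketch, in which a single data set is scored against pre-created UIDs and the score improves as the service observes more co-occurring values. The class name UidService, the scoring rule (the fraction of field and value pairs already seen for a UID) and the sample data are assumptions for illustration; the Description does not specify how the match probability is computed.

```python
from collections import Counter


class UidService:
    def __init__(self, uids):
        # UIDs are pre-created before any matching takes place.
        self.observed = {uid: Counter() for uid in uids}

    def learn(self, uid, data_set):
        """Record the (field, value) pairs seen together for this UID."""
        self.observed[uid].update(data_set.items())

    def score(self, uid, data_set):
        """Fraction of the data set's (field, value) pairs already seen for the UID."""
        if not data_set:
            return 0.0
        seen = self.observed[uid]
        return sum(1 for item in data_set.items() if item in seen) / len(data_set)

    def match(self, data_set):
        """Return the best-scoring (uid, score) pair for a single data set."""
        return max(
            ((uid, self.score(uid, data_set)) for uid in self.observed),
            key=lambda pair: pair[1],
        )


service = UidService(["UID-1", "UID-2"])
service.learn("UID-1", {"gender": "f", "city": "leeds"})

probe = {"gender": "f", "city": "leeds", "email": "a@example.com"}
before = service.match(probe)   # two of three pairs already seen for UID-1
service.learn("UID-1", probe)   # use of the service adds co-occurrence knowledge
after = service.match(probe)    # all three pairs now seen for UID-1
```

Note that no second customer data set is needed for the match: each data set is compared only with the UID data, which is what allows the approach to be applied to a single data set at source.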

In an example, the approach enables a decentralised approach to unifying client data by applying a UID to one or more client data sets at source.

Unicity Services

In an example, there is provided a “unicity” service which unifies disparate (e.g. customer) data sets, with exceptional consistency and accuracy, through the provision to clients of a smart and unique identifying code (e.g. a unique identifier (UID)). This UID is dynamic and secure, encoding useful information to power consumers’ experiences anywhere.

In an example, we adopt a top down approach to UID management, generating a UID for some of (e.g. most of, or all of) the world’s population, e.g. country by country, or territory by territory, with extensible capacity for foreseeable population change, then deliver the UID via a real-time querying service. This replaces the need for an organisation’s proprietary logic and dramatically increases the speed to achieve unicity of disparate data silos.
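
As a minimal sketch of the top down approach, UIDs can be generated ahead of use for each territory, with spare capacity for population change. The function name pregenerate_uids, the 25% headroom figure and the identifier format are assumptions for illustration only.

```python
import secrets


def pregenerate_uids(territory, population, headroom=0.25):
    """Create UIDs for a territory's population plus spare capacity."""
    capacity = int(population * (1 + headroom))
    uids = set()
    while len(uids) < capacity:
        # Random tokens; the set guarantees uniqueness within the territory.
        uids.add(f"{territory}-{secrets.token_hex(8)}")
    return uids


uids = pregenerate_uids("GB", population=1000)
```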

An example, or examples, of how this may work:

i. The UID construct includes a finite number of standard code blocks whose dynamic sequence conveys useful information, such as co-occurrence information, which the system can re-use; in an example the encoding method is analogous to sequencing methods used in genomics.

ii. The system consumes and assigns publicly available data (e.g. a person’s gender, a person’s city of residence) to the UIDs using distributions across the data sets, which are then encoded into the UID. Distributions are achieved using machine learning and intelligence including co-occurrence matrices, convolutional analysis and graph based node analysis, to distribute this data to appropriate UIDs.

iii. The system then performs a similar (e.g. the same) learning process to identify co-occurrence between clients’ data sets and UIDs, whereby upon achieving an acceptable confidence level, unicity is accepted and a UID is supplied by the system and assigned to data in the client data set, for example by being added to each customer data row.

iv. Within the processes above, useful co-occurrence information is encoded into the next iteration of the UID sequence, improving the performance (e.g. the speed, accuracy and consistency) of the future service, and increasing the amount of system learning.

v. The system continues to repeat the process to self-learn from observations that enable faster and more accurate unicity services and to understand the relationships between data sources.

vi. Continuing to encode unicity information from the previous steps into the sequence of the UID improves future co-occurrence analysis, and reflects relevant changes of information.

vii. After the first UID provision, the system delivers a maintenance service to clients which provides refreshed and iterated UIDs as new learnings become available and data sets evolve.
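
Steps iii and iv above can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the co-occurrence measure (the share of a row’s fields whose values also occur in a UID data set), the threshold value of 0.6, the function names and the sample data are all hypothetical, and simple equality of values stands in for the machine learning analysis mentioned in step ii.

```python
CONFIDENCE_THRESHOLD = 0.6  # assumed value; no threshold is given in the text


def co_occurrence(row, uid_data):
    """Share of the row's fields whose values also occur in the UID data set."""
    if not row:
        return 0.0
    shared = [k for k in row if k in uid_data and row[k] == uid_data[k]]
    return len(shared) / len(row)


def assign_uids(customer_rows, uid_data_sets, threshold=CONFIDENCE_THRESHOLD):
    updated = []
    for row in customer_rows:
        uid, confidence = max(
            ((u, co_occurrence(row, data)) for u, data in uid_data_sets.items()),
            key=lambda pair: pair[1],
        )
        if confidence >= threshold:
            # Step iii: unicity accepted; add the UID and its data to the row.
            updated.append({**uid_data_sets[uid], **row, "uid": uid})
            # Step iv: feed newly observed values back into the UID data set.
            uid_data_sets[uid].update(row)
        else:
            updated.append({**row, "uid": None})
    return updated


uid_data_sets = {
    "UID-1": {"name": "ada lovelace", "city": "london", "gender": "f"},
    "UID-2": {"name": "alan turing", "city": "wilmslow", "gender": "m"},
}
rows = [
    {"name": "ada lovelace", "city": "london", "email": "ada@example.com"},
    {"name": "grace hopper", "city": "new york", "email": "grace@example.com"},
]
result = assign_uids(rows, uid_data_sets)
```

In this sketch the first row shares two of its three values with UID-1 and is assigned that UID, while the second row falls below the threshold and is left without a UID until more co-occurrence information is available.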

Unicity Intelligence

Clients use this to maintain the best possible and permissible understanding of customer requirements within organisations, across systems and data sets. As a result they achieve better and more consistent analysis, improve service and deliver more relevant customer experiences.

An example, or examples, of how this may work:

This service enables clients to decode intelligence from the UID so that they can access first party and public data across disparate systems in order to improve customer experience.

i. Clients use our interfaces to configure data values and triggers including direct, derived, bucketed and modelled information to be encoded from one or more data sources into the UID sequence. This becomes the basis for their proprietary Unicity Intelligence.

ii. Upon receipt of these requests they are mapped into the UID code sequence.

iii. The clients’ specified destination technologies then have exclusive access through micro services to encoded Unicity Intelligence and appropriate public data as a service.

iv. Those same recipient technologies hold encryption keys to decode their specified Unicity Intelligence.

v. As a result, a bespoke data architecture which previously included multiple pathways and dependencies can be replaced by a real-time central point of Unicity.

vi. As new data collection and interaction layers are added, they can be added easily by connecting to the Unicity Service.

vii. As new or changed data and triggers are required they can be configured by clients and shared across any application.
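
The encoding of configured values into the UID sequence, and their decoding by key holders, can be illustrated with a minimal Python sketch. The sketch uses a keyed codebook built with HMAC as a stand-in for the encryption keys mentioned above; it is not an encryption scheme, and the BUCKETS configuration, the function names and the hyphen-separated sequence format are assumptions for illustration only.

```python
import hashlib
import hmac

# Assumed client configuration of bucketed values to be encoded.
BUCKETS = {
    "purchase_frequency": ["low", "medium", "high"],
    "opt_in": ["no", "yes"],
}


def block_code(key, field, bucket):
    """Derive a short code block for one (field, bucket) pair from the client key."""
    digest = hmac.new(key, f"{field}={bucket}".encode(), hashlib.sha256)
    return digest.hexdigest()[:8]


def encode(key, base_uid, values):
    """Append one code block per configured value to the base UID."""
    blocks = [block_code(key, field, values[field]) for field in sorted(values)]
    return "-".join([base_uid, *blocks])


def decode(key, uid_sequence):
    """Recover the configured values; only a holder of the key can rebuild the codebook."""
    codebook = {
        block_code(key, field, bucket): (field, bucket)
        for field, buckets in BUCKETS.items()
        for bucket in buckets
    }
    base_uid, *blocks = uid_sequence.split("-")
    return base_uid, dict(codebook[b] for b in blocks if b in codebook)


client_key = b"client-secret"  # hypothetical key held by the destination technology
sequence = encode(client_key, "UID000001",
                  {"purchase_frequency": "high", "opt_in": "yes"})
decoded = decode(client_key, sequence)
```

A recipient holding a different key rebuilds a different codebook and so recovers the base UID but none of the encoded values, which loosely mirrors the exclusive access described in steps iii and iv.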

Examples of uses of the system or method in the sphere of healthcare

1. One example is in the medical field where disconnected diagnostic data sources, originating from a single patient, for example genetic and biometric records, are assigned by the system to a single UID. As a result, a clinician or analyst is able to unify previously disparate sets of patient data, where a common UID is assigned to two or more of those data sets by the system. This enables the clinician or analyst to derive more powerful insights on a patient’s condition.

2. Another example is where patient monitoring technologies are deployed within pharmaceutical products and substances, in each instance providing separate data streams. An analyst can use the system or method to map the discrete data sets, relating to products or substances, to a single UID for a data subject or patient, for the purpose of deriving a more complete patient analysis of a condition over time. This patient analysis can include responses to multiple vaccines or therapies.

3. Another example is where a health care organization needs to unify disparate medical records from multiple care and service providers, where that data is assigned by the system or method to a UID for each patient, for example enabling the organization to understand their patients’ medical care requirements across providers and different record keeping systems.

4. For the process of disease control, health officials or agencies acquire data sets from discrete sources like mobile global positioning system (GPS) data aggregators, testing facilities, contact tracing operations and exposure notification apps. Using the UID system, disparate data sources can be assigned UIDs when the data subject from these sources is a single citizen. This enables unified transmission surveillance of entities for targeted disease control measures.

Example of uses of the system or method in the sphere of IOT devices

An example is where a service provider delivers its services to a single consumer through multiple internet-connected devices or things, e.g. IOT devices or things, each device or thing producing discrete data sets. The service provider can use the UID system or method to assign, to the discrete device or application data sets, a UID relating to a single entity or data subject, so the service provider can better understand their customer requirements.

Examples of uses of the system or method in the sphere of digital content and marketing

An example is where media buyers want to unify digital advertising data sets for single consumers in response to advertising media interactions on multiple digital applications, distribution platforms or devices. The media buyers can use the system or method to assign discrete data sets with a UID for each consumer, allowing the media buyers to develop unique consumer focused attribution models by understanding the multi-channel responses of single consumers.

An example is where multiple digital rights management systems collect data on consumption and use of content; each discrete system can use the UID system or method to associate those data sets to a UID for each rights owner in order to consolidate consumption data.

Example of uses of the system or method in the spheres of retail or financial services

An example is where a retail or financial services company wishes to build a single view of a customer across its service channels such as stores, branches, contact centers, apps and web channels. The retail or financial services company is able to use the service or method to assign a UID to discrete data sets from each channel and system, in order to unify and maintain a single view of a customer.
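
Once a UID has been assigned to the data sets from each channel, building the single view reduces to grouping by UID. The following Python sketch assumes that each record already carries hypothetical "uid" and "channel" fields; the function name and the sample data are illustrative only.

```python
from collections import defaultdict


def single_customer_view(channel_records):
    """Group channel records by UID, keeping each channel's data side by side."""
    view = defaultdict(dict)
    for record in channel_records:
        payload = {k: v for k, v in record.items() if k not in ("uid", "channel")}
        view[record["uid"]][record["channel"]] = payload
    return dict(view)


records = [
    {"uid": "UID-1", "channel": "store", "last_purchase": "2020-10-01"},
    {"uid": "UID-1", "channel": "app", "logins": 12},
    {"uid": "UID-2", "channel": "web", "basket_value": 40},
]
view = single_customer_view(records)
```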

What’s the Impact for our Clients

For the Business or Organisation

Traditionally clients’ data architectures involved complex diagrams of interconnected systems and flow charts that evolved with the organisation and included complex technical dependencies. When technology leadership prepares design principles, implementation requires immense effort and their governance is constantly stressed by proliferating executive ownership of systems. As new technologies are adopted they leave a legacy of dependencies which can be hard to resolve.

With our Unicity Platform these complex architectures can gradually be replaced by a spoke and hub service architecture where dependency relationships can be easily managed, and technology management can configure unified intelligence services to be supplied to all applications and systems. Organisationally this transforms road maps, dramatically cuts costs and ensures customer facing teams are able to develop unified experiences.

For their Customers

As customer transactions and relationships use increasing numbers of channels, IOT enabled devices, intermediaries and service providers, our clients are able to identify their customers and deliver a consistent experience in which customer requirements are well understood and better served.

Conclusion

With connected things forecast to proliferate to over 50bn by 2020, and customer experiences fracturing across channels, we advocate that legacy stitching methodologies that are currently straining to achieve unified customer views will experience accelerated failures. With on-hand unicity of understanding our clients have the opportunity to achieve unrivalled customer centricity and respond to rapidly changing habits, without requiring rip-and-replace technology programs.

Our Unicity Solution provides the de-facto standard for a customer centred strategy, a dramatically simplified architecture, and improving efficiency, so that any organisation can respond effectively to anyone’s requirements wherever they choose.

Note

It is to be understood that the above-referenced arrangements are only illustrative of the application for the principles of the present invention. Numerous modifications and alternative arrangements can be devised without departing from the spirit and scope of the present invention. While the present invention has been shown in the drawings and fully described above with particularity and detail in connection with what is presently deemed to be the most practical and preferred example(s) of the invention, it will be apparent to those of ordinary skill in the art that numerous modifications can be made without departing from the principles and concepts of the invention as set forth herein.

Claims

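1. A computer-implemented method for aggregating data, the method including the steps of:

(i) receiving a plurality of UID data sets, each UID data set including a unique identifier (UID);

(ii) receiving publicly available data;

(iii) processing the received publicly available data to assign data of the received publicly available data to the UID data sets, to assign data of the received publicly available data to appropriate UIDs, and including the assigned data of the received publicly available data in the UID data sets of the appropriate UIDs;

(iv) receiving customer data sets from a client server;

(v) processing the customer data sets, to identify co-occurrences between the customer data sets and the UID data sets;

(vi) using the identified co-occurrences to assign respective UIDs to respective customer data sets; and

(vii) including data from respective UID data sets into respective customer data sets, according to the identified co-occurrences, to generate updated customer data sets.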

2. The method of Claim 1, wherein step (i) includes generating the plurality of UID data sets.

3. The method of Claims 1 or 2, wherein step (i) includes storing the plurality of UID data sets.

4. The method of any previous Claim, including the step of: (viii) storing the updated customer data sets.

5. The method of any previous Claim, wherein step (iii) includes storing the UID data sets, including the assigned data of the received publicly available data in the UID data sets of the appropriate UIDs.

6. The method of any previous Claim, wherein including data from respective UID data sets into respective customer data sets, according to the identified co-occurrences, comprises including data from respective individual UID data sets into respective individual customer data sets, according to the identified co-occurrences.

7. The method of any previous Claim, wherein if two or more different customer data sets have been assigned an identical UID, then the two or more different customer data sets are unified into a single customer data set, including the identical UID.

8. The method of any previous Claim, wherein a UID is generated for each member of some of the world’s population.

9. The method of any previous Claim, wherein a UID is generated for each member of most of the world’s population.

10. The method of any previous Claim, wherein a UID is generated for each member of all of the world’s population.

11. The method of any previous Claim, wherein a UID is generated for each member of some of a territory’s population.

12. The method of any previous Claim, wherein a UID is generated for each member of most of a territory’s population.

13. The method of any previous Claim, wherein a UID is generated for each member of all of a territory’s population.

14. The method of any previous Claim, wherein the UID data sets include standard code blocks.

15. The method of Claim 14, wherein the standard code blocks have a sequence.

16. The method of Claim 15, wherein the sequence includes co-occurrence data.

17. The method of Claim 14, wherein the standard code blocks include co-occurrence data.

18. The method of any previous Claim, wherein the UID data sets include co-occurrence data.

19. The method of any previous Claim, wherein the publicly available data includes person names.

20. The method of any previous Claim, wherein the publicly available data includes a person’s gender.

21. The method of any previous Claim, wherein the publicly available data includes a person’s city of residence.

22. The method of any previous Claim, wherein the publicly available data includes publicly available data sets.

23. The method of Claim 22, wherein the data from the publicly available data sets is assigned to the UIDs, and is then encoded into the UID data sets.

24. The method of Claim 22, wherein the data from the publicly available data sets is assigned to the UIDs, using distributions across the publicly available data sets, and is then encoded into the UID data sets.

25. The method of Claim 24, wherein the distributions are achieved using machine learning and intelligence including one or more, or all, of: co-occurrence matrices, convolutional analysis, and graph based node analysis.

26. The method of any previous Claim, wherein identifying a co-occurrence includes satisfying a confidence threshold, or wherein identifying a co-occurrence includes satisfying a probability threshold.

27. The method of any previous Claim, wherein the UID data sets are updated with data relating to identified co-occurrences.

28. The method of Claim 27, wherein the updated UID data sets are stored.

29. The method of any previous Claim, wherein a customer data set is a customer data set of a bank.

30. The method of any of Claims 1 to 28, wherein a customer data set is a medical history customer data set.

31. The method of any of Claims 1 to 28, wherein a customer data set is an IOT device or things data set.

32. The method of any of Claims 1 to 28, wherein a customer data set includes a plurality of customer data sets.

33. The method of any of Claims 1 to 28, wherein a customer data set includes a plurality of customer data sets, in retail or financial services.

34. The method of any of Claims 1 to 28, wherein a customer data set includes a plurality of medical history customer data sets, e.g. genetic and biometric records, or to records for patient monitoring technologies for different products or substances.

35. The method of any of Claims 1 to 28, wherein a customer data set includes a plurality of customer data sets, including two or more, or all, of: a mobile global positioning system (GPS) data aggregator set, a medical testing results data set, a contact tracing operations data set, and an exposure notification application data set.

36. The method of any of Claims 1 to 28, wherein a customer data set includes a plurality of digital advertising data sets for single consumers.

37. The method of any of Claims 1 to 28, wherein a customer data set includes a plurality of digital rights management systems collected data on consumption and/or use of content.

38. The method of any previous Claim, further including delivering a maintenance service to clients which provides refreshed and iterated UIDs as new learnings become available and data sets evolve.

39. The method of any previous Claim, further including providing a “unicity” service which unifies disparate (e.g. customer) data sets.

40. The method of any previous Claim, wherein the UID is dynamic and secure, encoding useful information to power consumers’ experiences anywhere.

41. The method of any previous Claim, wherein the UIDs are generated with capacity for foreseeable population change.

42. The method of any previous Claim, wherein UID generation replaces the need for an organisation’s proprietary logic and dramatically increases the speed to achieve unicity of disparate data silos.

43. The method of any previous Claim, wherein the same learning process is used to identify co-occurrence between clients’ data sets and UIDs, as is used to assign publicly available data to the UIDs using distributions across the publicly available data sets.

44. The method of any previous Claim, wherein the method is repeated to self-learn from observations that enable faster and more accurate unicity services and to understand the relationships between data sources.

45. The method of any previous Claim, wherein clients can use an interface to configure data values and triggers, including direct, derived, bucketed and modelled information to be encoded from one or more data sources into the UID sequence.

46. The method of Claim 45, including mapping a received request into the UID code sequence.

47. The method of any previous Claim, wherein the clients’ specified destination technologies then have exclusive access through micro services to encoded Unicity Intelligence and appropriate public data as a service.

48. The method of Claim 47, wherein the specified destination technologies hold encryption keys to decode their specified Unicity Intelligence.

49. The method of any previous Claim, wherein as new data collection and interaction layers are added, they can be added easily by connecting to the Unicity Service.

50. The method of any previous Claim, wherein any single appropriate data set can be probabilistically matched using a UID service, where the probability of the matches to a UID increases with use as the system and/or method understands greater levels of co-occurrence in data variables for each UID, with increasing use of the UID service.

51. A computer program product, the computer program product executable on a processor to:

(ii) receive publicly available data;

(iv) receive customer data sets from a client server;

(v) process the customer data sets, to identify co-occurrences between the customer data sets and the UID data sets;

(vi) use the identified co-occurrences to assign respective UIDs to respective customer data sets; and

52. The computer program product of Claim 51, further executable on the processor to perform a method of any of Claims 1 to 50.

53. A computer system, the computer system configured to:

(ii) receive publicly available data;

(iv) receive customer data sets from a client server;

54. The computer system of Claim 53, the computer system further configured to perform a method of any of Claims 1 to 50.

55. A method of deriving a unified medical data set, the method including the steps of:

(ii) receiving publicly available data;

56. The method of Claim 55, wherein the method is computer-implemented.

57. The method of Claims 55 or 56, the method including a method of any of Claims 1 to 50.

58. The method of any of Claims 55 to 57, the method including the further step of a clinician or analyst deriving improved insight into a patient’s condition, using the unified medical data set.

59. A computer program product, the computer program product executable on a processor to perform a method of any of Claims 55 to 57.

60. A computer system, the computer system configured to perform a method of any of Claims 55 to 57.

61. A method of deriving a unified medical data set, the method including the steps of:

(i) receiving a plurality of UID data sets, each UID data set including a unique identifier (UID);

(ii) receiving publicly available data;

62. The method of Claim 61, wherein the method is computer-implemented.

63. The method of Claims 61 or 62, the method including a method of any of Claims 1 to 50.

64. The method of any of Claims 61 to 63, the method including the further step of, using the unified medical data set, a clinician or analyst deriving a more complete patient analysis of a condition over time, for example including responses to multiple vaccines or therapies.

65. A computer program product, the computer program product executable on a processor to perform a method of any of Claims 61 to 63.

66. A computer system, the computer system configured to perform a method of any of Claims 61 to 63.

67. A method of deriving a unified medical data set, the method including the steps of:

(ii) receiving publicly available data;

68. The method of Claim 67, wherein the method is computer-implemented.

69. The method of Claims 67 or 68, the method including a method of any of Claims 1 to 50.

70. The method of any of Claims 67 to 69, the method including the further step of a health care organization understanding their patients’ medical care requirements across the multiple care and service providers, using the unified medical data set.

71. A computer program product, the computer program product executable on a processor to perform a method of any of Claims 67 to 69.

72. A computer system, the computer system configured to perform a method of any of Claims 67 to 69.

73. A method of deriving a unified disease control data set, the method including the steps of:

(ii) receiving publicly available data;

74. The method of Claim 73, wherein the method is computer-implemented.

75. The method of Claims 73 or 74, the method including a method of any of Claims 1 to 50.

76. The method of any of Claims 73 to 75, the method including the further step of a disease control clinician or analyst obtaining unified transmission surveillance of entities for targeted disease control measures.

77. A computer program product, the computer program product executable on a processor to perform a method of any of Claims 73 to 75.

78. A computer system, the computer system configured to perform a method of any of Claims 73 to 75.

79. A method of deriving a unified digital advertising data set, the method including the steps of:

(ii) receiving publicly available data;

80. The method of Claim 79, wherein the method is computer-implemented.

81. The method of Claims 79 or 80, the method including a method of any of Claims 1 to 50.

82. The method of any of Claims 79 to 81, the method including the further step of media buyers developing unique consumer focused attribution models by understanding the multi-channel responses of single consumers.

83. A computer program product, the computer program product executable on a processor to perform a method of any of Claims 79 to 81.

84. A computer system, the computer system configured to perform a method of any of Claims 79 to 81.

85. A method of deriving a unified retail or financial services data set, the method including the steps of:

(ii) receiving publicly available data;

86. The method of Claim 85, wherein the method is computer-implemented.

87. The method of Claims 85 or 86, the method including a method of any of Claims 1 to 50.

88. The method of any of Claims 85 to 87, the method including the further step of a retail or financial services company using the method to maintain a single view of a customer, using the unified retail or financial data set.

89. A computer program product, the computer program product executable on a processor to perform a method of any of Claims 85 to 87.

90. A computer system, the computer system configured to perform a method of any of Claims 85 to 87.