CN114514547A - System and method for associating data sources with mobile devices using spatial and temporal analysis

Info

Publication number: CN114514547A
Application number: CN202080067820.6A
Authority: CN
Inventors: 蒂蒂·哈特泽尔; 马克·韦尔顿; 迈克尔·佩里; 斯蒂芬·斯科尔
Original assignee: Mober Technologies
Current assignee: Mober Technologies
Priority date: 2019-09-25
Filing date: 2020-09-24
Publication date: 2022-05-17
Also published as: WO2021061897A1

Abstract

Various embodiments of the present technology relate generally to data delivery. More particularly, some embodiments of the present technology relate to systems and methods for associating data sources with mobile devices using spatial and temporal analysis. Delivery of data supports a variety of services for and with mobile devices based on data stored in enterprise, business, and government databases that are not currently linked to personal mobile devices. Some embodiments allow advertisers to better determine with greater accuracy that their advertisements are targeted to relevant target audiences.

Description

System and method for associating data sources with mobile devices using spatial and temporal analysis

Cross reference to related applications

This application claims the benefit of U.S. patent application No. 16/583,185, filed on September 25, 2019, which is a continuation-in-part of U.S. application No. 15/046,394, filed on February 17, 2016, which is a continuation of U.S. application No. 14/731,281, filed on June 4, 2015 (now U.S. patent No. 9,439,033), which is a continuation of U.S. application No. 14/509,390, filed on October 8, 2014 (now U.S. patent No. 9,087,346), which claims priority to U.S. provisional application No. 61/888,950, filed in 2013, all of which are incorporated herein by reference in their entirety for all purposes.

Technical Field

Various embodiments of the present technology relate generally to data delivery. More particularly, some embodiments of the present technology relate to systems and methods for associating data sources with mobile devices using spatial and temporal analysis.

Summary of the invention

Some embodiments use location data records from websites, mobile advertising networks, mobile applications, and/or networks with sensors located in malls, airports, transportation terminals, hotels, offices, medical offices, elevators, etc. The location data may be used to construct location profiles that may be linked to a home address through a series of analytic processes. Once the mobile device is associated with the home address, any database containing the home address as a data element may be associated with the mobile device to build enhanced services that can be delivered to the mobile device, or to provide location and situational information for services that need to profile the mobile devices used in a given area.

In various embodiments, the system may also have the ability to group devices into a "social network" based on an analysis of overlapping location data, either at a single location input to the system or at multiple locations autonomously identified by the system. These social networks may be further analyzed using corresponding data elements in the linked databases to refine the social networks based on common characteristics found in the data.

Various embodiments may perform one or more of the following functions:

1. Identification of the individual or household associated with a mobile device is provided, which can be used to match back to any database that uses an address as a key element to identify data.

2. Identification of mobile devices is provided, where the system can be used with any type of unique mobile device identifier, such as UDIDs, Wi-Fi MAC addresses, Bluetooth IDs, browser cookies, or any other persistent or semi-persistent identifier. A semi-persistent identifier is an identifier that exists for a period of time before being changed, which may be a day, a week, a month, or more.

3. Identification of mobile devices is provided, wherein the system may be used with any type of mobile device on a satellite, cellular, or Wi-Fi network, using any type of service plan, including subscription, corporate, prepaid, etc.

4. Identification of mobile devices is provided, wherein the system provides cross-matching of multiple mobile device identifiers with a single anonymous identifier.

5. Identification of mobile devices is provided, where the system provides anonymization of data such that privacy is protected when the data is used for business purposes.

6. Providing an identification of the mobile device, wherein the system can be used with any mobile device data including the following elements: 1) a mobile device identifier, and 2) a geographic location tag, such as a latitude and longitude pair or other location coding system. Time/date stamps associated with mobile device data are desirable and may or may not be required to link the device to a database, but may be required for some applications and analysis to deliver different services.

7. Identification of the mobile device is provided, wherein the system works with any mobile device location data and takes into account variations in the accuracy of the mobile location data as a function of the data source.

8. Identification of the mobile device is provided, where the system can obtain real-time data as well as bulk data.

9. Identification of mobile devices is provided, wherein the system delivers linked data to commercial services, merchants, governments, and other customers in three ways: 1) in response to queries about individual devices, 2) in response to queries about locations or radii around locations, or 3) in response to queries about lists or groups of devices.

10. An identification of the mobile device is provided, wherein the system does not require any subscriber data from the mobile operator to link the device back to any database.

11. The identification of the mobile device is provided, wherein the system does not require any location data from the mobile operator.

12. Identification of mobile devices is provided, where the system can identify "social networks" of devices having a common interest based on location data, which can be linked back to a business database for analysis purposes.

13. Identification of mobile devices is provided, where the system can identify a "social network" based on a single selected location input to the system or based on multiple locations autonomously generated by system analysis.

Embodiments of the present technology also include computer-readable storage media containing a set of instructions to cause one or more processors to perform the methods, variations of the methods, and other operations described herein.

While multiple embodiments are disclosed, still other embodiments of the present technology will become apparent to those skilled in the art from the following detailed description, which shows and describes illustrative embodiments of the present technology. As will be realized, the technology is capable of modifications in various respects, all without departing from the scope of the present technology. Accordingly, the drawings and detailed description are to be regarded as illustrative in nature and not as restrictive.

Drawings

Embodiments of the present technology will be described and explained by use of the accompanying drawings, in which:

FIG. 1 illustrates an example of a network-based environment in which some embodiments of the present technology may be utilized.

FIG. 2 illustrates various components and interactions in accordance with one or more embodiments of the present technology.

FIG. 3 is a block diagram illustrating various data and partner components in accordance with various embodiments of the present technology.

FIG. 4 is a block diagram illustrating the use of anonymous requests by advertising network partners to obtain data from the system, in accordance with some embodiments of the present technology.

FIG. 5 is a flowchart illustrating an exemplary set of operations for associating a mobile device with a home address in accordance with one or more embodiments of the present technology.

FIG. 6 illustrates a graphical structure corresponding to a social link in a social network.

FIG. 7 is a flow diagram illustrating an embodiment of a method of generating a location social network.

FIG. 8 is a flow chart illustrating a method of detecting a relocation.

FIG. 9 is a table showing an example of detecting a relocation. The table shows two address changes.

FIG. 10 illustrates an example of a computer system that may utilize some embodiments of the present technology.

The drawings are not necessarily to scale. For example, the dimensions of some of the elements in the figures may be exaggerated or reduced to help improve the understanding of the embodiments of the present technology. Similarly, some components and/or operations may be separated into different blocks or combined into one block for the purpose of discussing some embodiments of the present technology. In addition, while the present technology is amenable to various modifications and alternative forms, specific embodiments have been shown by way of example in the drawings and will be described in detail below. However, it is not intended to limit the technology to the particular embodiments described. On the contrary, the technology is intended to cover all modifications, equivalents, and alternatives falling within the scope of the technology as defined by the appended claims.

Detailed Description

Various embodiments of the present technology relate generally to data delivery. More particularly, some embodiments of the present technology relate to systems and methods for associating data sources with mobile devices using spatial and temporal analysis. Some embodiments enable delivery of data to support a variety of services for and with mobile devices based on data stored in enterprise, commercial, and government databases that are not currently accurately linked to personal mobile devices. One application of this technique is to allow advertisers to better target their ads to relevant target audiences with greater accuracy. The technology uses location data records from mobile advertising networks, mobile applications, and hundreds of networks with sensors located in shopping malls, airports, transportation terminals, hotels, offices, medical offices, elevators, and the like. The location data may be used to construct a location profile, which may be linked to the home address through a series of analysis processes.

Once the mobile device is associated with the home address, any database containing the home address as a data element may be associated with the mobile device to build enhanced services that may be delivered to the mobile device, or to provide location and situational information for services that need to profile the mobile devices used in a given area. This information may also be used to build a "social network" that identifies individuals with common interests, associations, and social dynamics, providing additional insight into mobile users.

A large amount of data about each individual and family is stored in enterprise, retailer, government, and marketing databases. Such data may include any type of data collected today: demographic data, psychological data, behavioral data, purchasing data, interest data, criminal data, occupational data, enrollment data, survey data, medical data, and the like. Such data may be used for a variety of purposes including advertising, marketing, location research, public safety, healthcare, and the like. There are many techniques for capturing location data from a mobile device and building a historical location profile associated with the device.

The challenge is to link the mobile device to an individual or household so that the data in these existing databases (usually keyed by name and address) can be used to provide enhanced services to the user of the mobile device and to extend services for advertisers, merchants, and governments that utilize location data from the mobile device. Even if these business and government databases contain mobile phone numbers, they are still not easily linked to mobile devices for delivery of other services. Mobile applications and services can only access device ID keys, mobile data network ID keys, Wi-Fi network keys, Bluetooth IDs, cookies, and software-defined persistent and transient device identifiers, none of which are present in these databases.

Identifying the home address associated with the mobile device may be done by the mobile operator from their billing and provisioning database, but this information is not available to other service providers and government agencies. To provide enhanced services, these commercial and governmental agencies need an alternative solution that can accurately identify the home address of mobile devices to link to their data independent of the mobile operator's data or databases.

One of the main trends in marketing is social-based marketing through the use of social networks, which aims to reach like-minded consumers based on their common social interests and affiliations. Unfortunately, the ability to reach these audiences is controlled by large social networking companies that determine how advertisers contact and interact with these consumers. Mobile devices provide great reach for advertisers and can offer new ways to advertise to and interact with these consumers, independent of the social networks and interest groups that these large social networking companies make available. The ability to link these social networks and interest groups to business and marketing data about these consumers would be powerful, allowing richer analysis of these groups and enabling predictive modeling to find similar types of customers.

The challenge is to attempt to identify mobile devices in a social group or interest group. Mobile advertising networks, mobile applications, and mobile websites have billions of records associated with mobile transactions that can be mined to create social network "graphs" to connect these devices together, and thus individuals. Various embodiments of the present technology provide solutions to this challenge.

In the following description, for purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of embodiments of the present technology. It will be apparent, however, to one skilled in the art that embodiments of the present technology may be practiced without some of these specific details.

Furthermore, the techniques described herein may be embodied as dedicated hardware (e.g., circuitry), programmable circuitry that is suitably programmed in software and/or firmware, or a combination of dedicated and programmable circuitry. Thus, embodiments may include a machine-readable medium having stored thereon instructions, which may be used to program a computer (or other electronic devices) to perform a process. The machine-readable medium may include, but is not limited to, optical disks, compact disk read-only memories (CD-ROMs), magneto-optical disks, ROMs, Random Access Memories (RAMs), erasable programmable read-only memories (EPROMs), electrically erasable programmable read-only memories (EEPROMs), Application Specific Integrated Circuits (ASICs), magnetic or optical cards, flash memory, or other type of media/machine-readable medium suitable for storing electronic instructions.

Terminology

The following presents a simplified definition of terms, abbreviations, and phrases used throughout this application.

The terms "connected" or "coupled" and related terms are used in an operational sense and are not necessarily limited to a direct physical connection or coupling. Thus, for example, two devices may be coupled directly or via one or more intermediate media or devices. As another example, devices may be coupled in such a way that information may be passed between them without sharing any physical connection between each other. Based on the disclosure provided herein, one of ordinary skill in the art will appreciate that there are a variety of ways to connect or couple according to the foregoing definitions.

The phrases "in some embodiments," "according to some embodiments," "in an illustrated embodiment," "in other embodiments," and the like generally indicate that a particular feature, structure, or characteristic described in connection with the phrase is included in at least one embodiment of the present technology and may be included in more than one embodiment. Moreover, such phrases are not necessarily referring to the same embodiment or to different embodiments.

If the specification states that a component or feature "may," "can," "could," or "might" be included or have a characteristic, that particular component or feature is not required to be included or to have the characteristic.

The term "module" or "engine" refers broadly to a general or special purpose hardware, software, or firmware (or any combination thereof) component. Modules and engines are generally functional components that can generate useful data or other output using specified inputs. The modules or engines may or may not be self-contained. The modules or engines may be centralized or functionally distributed, depending upon the particular implementation or other considerations. A software program (also referred to as an "application") may include one or more modules and/or engines, or a module and/or engine may include one or more applications.

General Description

FIG. 1 is a block diagram of a network-based environment 100 in accordance with one or more embodiments of the present technology. As shown in FIG. 1, user devices 110A-110N may use network 115 to submit information to and obtain information from data delivery platform 120. User devices 110A-110N may interact with the data delivery platform 120 through an Application Programming Interface (API) running on the native operating system of the device (e.g., ANDROID™). Through the data delivery platform 120, delivery of custom data can be targeted to a mobile device user, for example by using spatial and temporal analysis to associate data sources with the mobile device. The content management platform 125 enables the delivery of data stored in the database 130 to support a variety of services for and with mobile devices based on data stored in enterprise, business, and government databases that are not currently accurately linked to personal mobile devices.

For example, the data delivery platform 120 may use location data records from websites, mobile advertising networks, mobile applications, and hundreds of networks with sensors located in malls, airports, transportation hubs, hotels, offices, medical offices, elevators, and so forth. This location data can be used to build a location profile, which can be linked to the home address through a series of analysis processes. Using this information, custom profiles can be built around the mobile device.

User devices 110A-110N may be any computing device capable of receiving user input and transmitting and/or receiving data via network 115. In one embodiment, the user devices 110A-110N may be any device having computer functionality, for example, as a Personal Digital Assistant (PDA), mobile phone, smart phone, wearable computing device (e.g., glasses, watch, etc.), tablet, or the like. The user devices 110A-110N may be configured to communicate via a network 115, and the network 115 may include any combination of local and/or wide area networks, using both wired and wireless communication systems. In one embodiment, the network 115 uses standard communication techniques and/or protocols. Thus, the network 115 may include links using technologies such as Ethernet, 802.11, Worldwide Interoperability for Microwave Access (WiMAX), 3G, 4G, CDMA, Digital Subscriber Line (DSL), and the like.

Similarly, networking protocols used on network 115 may include multiprotocol label switching (MPLS), transmission control protocol/internet protocol (TCP/IP), User Datagram Protocol (UDP), hypertext transfer protocol (HTTP), Simple Mail Transfer Protocol (SMTP), and File Transfer Protocol (FTP). Data exchanged over the network 115 may be represented using techniques and/or formats including hypertext markup language (HTML) or extensible markup language (XML). In addition, all or part of the link may be encrypted using conventional encryption techniques such as Secure Sockets Layer (SSL), Transport Layer Security (TLS), and internet protocol security (IPsec).

Various types of network communication mechanisms may be used to couple the components shown in FIG. 1 to network 115. These network communication mechanisms may communicate with other electronic devices by transmitting and receiving wireless signals over the network 115 using licensed, semi-licensed, or unlicensed spectrum. In some cases, network 115 may include multiple networks, even multiple heterogeneous networks, such as one or more border networks, voice networks, broadband networks, service provider networks, Internet Service Provider (ISP) networks, and/or Public Switched Telephone Networks (PSTN), interconnected via operational gateways to facilitate communication between the various networks. The network 115 may also include third party communication networks, such as a Global System for Mobile (GSM) mobile communication network, a code/time division multiple access (CDMA/TDMA) mobile communication network, a third or fourth generation (3G/4G) mobile communication network (e.g., General Packet Radio Service (GPRS/EGPRS), Enhanced Data rates for GSM Evolution (EDGE), Universal Mobile Telecommunications System (UMTS), or Long Term Evolution (LTE) network), or other communication networks.

FIG. 2 illustrates various components and interactions in accordance with one or more embodiments of the present technology. The system may associate data in merchant, business, and government databases with mobile device data from a variety of vendors, including mobile advertising networks, mobile operators, mobile applications, merchants, Wi-Fi networks, and any other viable source. The components shown in FIG. 2 provide some examples of means for performing the various operations described.

In some cases, the system collects mobile device data. The mobile device data may include, but is not limited to, the following event data: mobile network call data, mobile data network registration and usage, mobile device location data, mobile device browsing and network data, transaction data, mobile application data, social media data, purchase data, login data, device sensor data, credit card data, and the like. The mobile device event data may include one or more of the following fields: 1) a device identifier, such as a UDID, a MAC address, a cookie, or any other persistent or semi-persistent identifier; 2) location information, typically latitude and longitude or an address; and/or 3) a timestamp, including date and time down to minutes and seconds. Note that not all data must contain a timestamp to provide a substantial match. In some embodiments, timestamps may be used to cross-match data sources having different device identifiers.

The mobile device event data may be clustered by location, device identifier, and time of day. The clusters are then evaluated against home address data. The address data is then used to link the mobile device ID to other databases. As part of this process, the system processes the data anonymously to provide enhanced security for the collected and linked data and to ensure that Personally Identifiable Information (PII) is not disclosed to anyone. An anonymous ID may also be created so that the PII is never revealed when a client application uses the data.
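The clustering step can be illustrated with a minimal sketch. It is not the method of the disclosure: the grid-cell rounding, the `candidate_home_cells` name, and the use of a 6:00 PM to 6:00 AM window as the time-of-day criterion are all assumptions made for illustration, and the later evaluation against address data is omitted.

```python
from collections import Counter, defaultdict
from datetime import datetime


def is_night(ts: datetime) -> bool:
    """True for events between 6:00 PM and 6:00 AM (assumed window)."""
    return ts.hour >= 18 or ts.hour < 6


def candidate_home_cells(events, decimals=3):
    """Map each device ID to its most frequent nighttime grid cell.

    events: iterable of (device_id, latitude, longitude, timestamp).
    A "cell" is the (lat, lon) pair rounded to `decimals` places, a
    hypothetical stand-in for whatever clustering a real system uses.
    """
    counts = defaultdict(Counter)
    for device_id, lat, lon, ts in events:
        if is_night(ts):
            cell = (round(lat, decimals), round(lon, decimals))
            counts[device_id][cell] += 1
    return {dev: cells.most_common(1)[0][0] for dev, cells in counts.items()}


# Made-up sample events for two obfuscated device IDs.
events = [
    ("dev-a", 39.73921, -104.99031, datetime(2019, 9, 1, 22, 15)),
    ("dev-a", 39.73918, -104.99029, datetime(2019, 9, 2, 23, 40)),
    ("dev-a", 39.75000, -105.00000, datetime(2019, 9, 2, 12, 0)),
    ("dev-b", 40.01499, -105.27052, datetime(2019, 9, 1, 2, 5)),
]
homes = candidate_home_cells(events)
```

Each resulting cell is only a candidate; in the flow described above it would still have to be evaluated against home address data before any link is made.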

FIG. 3 is a block diagram illustrating the use of an independent data processor to match output data from the system with data provided by advertising network partners. Since PII is used in the matching process, a separate data processor is used to prevent the system or the advertising network partner from accessing each other's PII. The output of the data processor is the linked data that matches the two data sources.

As shown in FIG. 3, the system may collect raw mobile device data from a variety of partners, as well as business and government data about individuals. This data may be processed by the system of FIG. 2 and used to create a system data warehouse that uses PII as a key. The system may export the data warehouse as a system file that may be transferred to another party, including an independent data processor.

Similarly, partners such as the advertising network also collect customer information from customers of partner services (applications, websites, etc.) and registered users of these partner services, which information can be similarly accumulated into the advertising network data store. The advertising network data store may also use PII as a key. The advertising network data store may also be exported as a partner file for transmission to a separate data processor.

The independent data processor obtains the system file and the partner file and compares their PII keys. It creates an output file containing combined records from the system file and the partner file, but only for records whose PII key appears in both files. In some embodiments, a record whose PII key appears in only one of the files is not included in the output. The merged file is then transmitted to the advertising network partner for use. In this way, various embodiments ensure that PII for individuals not already known to a party is not shared with the system or the advertising network partner.
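The matching performed by the independent data processor behaves like an inner join on the PII key. The following is a minimal sketch under that reading; the `pii_key` field name, the other field names, and the `match_on_pii_key` function are hypothetical and do not come from the disclosure.

```python
def match_on_pii_key(system_records, partner_records):
    """Combine records whose PII key appears in BOTH inputs (inner join).

    Records found in only one file are left out of the output, so
    neither party learns about individuals it did not already know.
    """
    system_by_key = {rec["pii_key"]: rec for rec in system_records}
    merged = []
    for partner_rec in partner_records:
        system_rec = system_by_key.get(partner_rec["pii_key"])
        if system_rec is not None:
            merged.append({**system_rec, **partner_rec})
    return merged


# Made-up example files; "k2" is the only key present in both.
system_file = [
    {"pii_key": "k1", "home_zip": "80202"},
    {"pii_key": "k2", "home_zip": "80301"},
]
partner_file = [
    {"pii_key": "k2", "ad_segment": "outdoor"},
    {"pii_key": "k3", "ad_segment": "auto"},
]
output_file = match_on_pii_key(system_file, partner_file)
```

Only the record for `k2` survives in `output_file`; `k1` and `k3` are each known to one party only and are dropped.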

FIG. 4 is a block diagram illustrating the use of anonymous requests by advertising network partners to obtain data from the system, in accordance with some embodiments of the present technology. One advantage of using anonymous requests is that this eliminates the need to expose the PII while providing real-time access to the system output.

As shown in FIG. 4, the system collects raw mobile device data from various partners, as well as business and government data about individuals. This data is processed by the system of FIG. 2 and used to create a system data warehouse that uses PII as a key. The system then passes the system data warehouse through an anonymization process that removes the PII or replaces it with data that cannot be directly linked back to the PII. One way to do this is to apply a one-way hashing algorithm so that no one can convert the data back to the original PII. Another is a matching table used internally by the system to map the PII to usable non-PII data, although this is much less secure because the matching table itself may be vulnerable to attack. The anonymized data may be stored, for example, in a system mobile data mart that is accessible in real time.

When a publisher website (or mobile application) makes a request to a partner ad server, the ad server in turn makes a request to a system targeting data engine that provides an external interface to the system mobile data mart. The system targeting data engine takes the anonymous key transmitted by the ad server and looks up the corresponding data in the system mobile data mart. The data returned by the system mobile data mart is transmitted to the ad server, which in turn uses the data to decide which ads to return to the publisher.
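The anonymous request flow can be sketched as a keyed lookup against an anonymized store. This is an illustration only: SHA-256 is chosen here as one example of a one-way hash, and the function names, the `pii_key` field, and the dictionary-based data mart are assumptions rather than details of the disclosure.

```python
import hashlib


def anonymous_key(pii_value: str) -> str:
    """Derive a one-way key from a PII value (SHA-256 is an assumed choice)."""
    normalized = pii_value.strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def build_data_mart(warehouse):
    """Re-key warehouse records by anonymous key and drop the PII field."""
    mart = {}
    for record in warehouse:
        attributes = {k: v for k, v in record.items() if k != "pii_key"}
        mart[anonymous_key(record["pii_key"])] = attributes
    return mart


def targeting_lookup(mart, anon_key):
    """Return the non-PII attributes for an anonymous key, or an empty dict."""
    return mart.get(anon_key, {})


# Made-up warehouse record keyed by a postal address.
warehouse = [{"pii_key": "123 Main St, Denver CO", "segment": "outdoor"}]
mart = build_data_mart(warehouse)

# The ad server sends only the anonymous key, never the address itself.
request_key = anonymous_key("123 main st, denver co ")
result = targeting_lookup(mart, request_key)
```

Both sides must normalize the PII value in the same way before hashing, otherwise equal addresses produce different keys and the lookup misses.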

FIG. 5 is a flowchart illustrating an exemplary set of operations for associating a mobile device with a home address in accordance with one or more embodiments of the present technology. The operations illustrated in FIG. 5 may be performed by a variety of means, including but not limited to data delivery platform 120, content management platform 125, database 130, one or more servers, one or more processors, network and networking hardware, various modules or engines (e.g., receiving module, profile module, linking module, association module, etc.), and/or one or more computing systems such as the one described below with reference to FIG. 10. As shown in FIG. 5, location data may be received from one or more sources during a receive operation 510. Using this information, a build operation 520 may build a location profile that may be linked to the home address during a link operation 530. Association operation 540 may then use this information to associate the mobile device with the home address.

Data operation flow

Various embodiments of a system for linking mobile device data with other databases using spatial and temporal analysis may include one or more of the following components and processing algorithms that may be executed on commercial servers using real or virtual servers organized into server clusters. According to various embodiments, the system may perform one or more of the following seven functions:

Function one: processing of mobile device/location event data

The mobile device/location event data may be transmitted to the system in a batch file format or in real time via an Application Program Interface (API) provided to the data provider. Batch files are transferred to the system using standard secure File Transfer Protocol (FTP) technology. Real-time transfer is done per event and uses an API built on the WSO2 open source platform. The API may be built using JavaScript Object Notation (JSON) and gives partners a way to transfer data to the platform on request. In some embodiments, the elements transmitted via the batch file or API for any mobile device/location record may include at least:

Device ID: possible device IDs include, but are not limited to:

o Mobile phone number

o Unique Device Identifier (UDID)

o International Mobile Equipment Identity (IMEI)

o Mobile Equipment Identifier (MEID)

o Electronic Serial Number (ESN)

o Media Access Control (MAC) address (MAC-48/EUI-48/EUI-64)

o Bluetooth address (BD_ADDR)

Date: MMDDYY

Time: HH:MM:SS

Latitude: integer

Longitude: integer

Partner ID: assigned by E2M for real-time feeds
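A record with the elements listed above could be parsed as follows. This is a sketch, not the API of the disclosure: the JSON field names, the `MobileEvent` type, the `parse_event` function, and the choice to hold coordinates as floating-point numbers are all assumptions. Only the MMDDYY date layout and the HH:MM:SS time layout come from the list.

```python
import json
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class MobileEvent:
    device_id: str       # already obfuscated by the provider in this sketch
    timestamp: datetime  # built from the MMDDYY date and HH:MM:SS time
    latitude: float
    longitude: float
    partner_id: str


def parse_event(payload: str) -> MobileEvent:
    """Parse one JSON event record (field names are hypothetical)."""
    raw = json.loads(payload)
    timestamp = datetime.strptime(
        raw["date"] + " " + raw["time"], "%m%d%y %H:%M:%S"
    )
    return MobileEvent(
        device_id=raw["device_id"],
        timestamp=timestamp,
        latitude=float(raw["latitude"]),
        longitude=float(raw["longitude"]),
        partner_id=raw["partner_id"],
    )


# Made-up real-time payload for a single event.
payload = json.dumps({
    "device_id": "a1b2c3",
    "date": "092519",
    "time": "22:15:00",
    "latitude": 39.7392,
    "longitude": -104.9903,
    "partner_id": "partner-001",
})
event = parse_event(payload)
```

`datetime.strptime` raises `ValueError` on a malformed date or time, which gives a natural place to reject bad records before they reach the filters.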

The mobile event data may be considered PII because it contains a unique identifier for each mobile device. Although it may be transmitted "in the clear" from the data provider to the system, typically the mechanism involves a secure connection and the mobile device ID data is encoded using an agreed upon obfuscation algorithm (e.g., hash) before sending the data to the system. Once the system receives the data, it ensures that all mobile device IDs are obfuscated before being stored to the system database and used for processing. For example, such obfuscation may be performed by the data provider prior to transmission, or by the receiving system using a SHA-1 hashing algorithm, which is a one-way hash that cannot be inverted back to the original data. Any other similar one-way hashing or encoding algorithm may be substituted for the SHA-1 algorithm.
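As an illustration of the obfuscation step, the sketch below hashes a device ID with SHA-1 using Python's standard `hashlib` module. The `obfuscate_device_id` name and the normalization (trimming whitespace and upper-casing) are assumptions; the text above only specifies a one-way hash such as SHA-1.

```python
import hashlib


def obfuscate_device_id(device_id: str) -> str:
    """Return the SHA-1 hex digest of a normalized device ID.

    Normalizing first (an assumed convention) lets the same device ID
    hash to the same value regardless of case or stray whitespace.
    """
    normalized = device_id.strip().upper()
    return hashlib.sha1(normalized.encode("utf-8")).hexdigest()


# Made-up MAC address supplied in two different spellings.
first = obfuscate_device_id("00:1a:2b:3c:4d:5e")
second = obfuscate_device_id(" 00:1A:2B:3C:4D:5E ")
```

Because the digest is deterministic, records for the same device from different providers still line up after obfuscation, provided every party applies the same normalization and algorithm.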

Incoming mobile event data is processed through a series of filters that organize the data in the system by mobile device ID. The data may be organized such that it is processed or evaluated differently with different priorities during subsequent processing. These filters may include, but are not limited to, the following:

Time/date filter: data can be segmented by event date/time, and timestamps can be normalized to a single time zone or to multiple different time zones, even if the data comes from systems that store time using different default time zones. For example, a filter flags the records that occur between 6:00 PM and 6:00 AM, giving them higher priority for location analysis.

Location data cleanup: these filters ensure that location data is accurate by:

Correcting or eliminating records whose latitude/longitude data is invalid because it has been revoked by the provider, lacks a leading minus sign, or is missing altogether.

Discard records with a default or "blacklist" location. The process performed in identifying an address associated with a mobile device creates a blacklist of locations, often from particular providers (advertising networks or publishers), that are not valid locations for the device.

Adjust the accuracy resolution of data across sources, or apply source-based processing. Depending on the data source, the system can round the latitude/longitude data to a particular decimal place, to normalize the resolution across different data sources, or weight data points based on the accuracy associated with the source. The weighting may be applied based on the source or on other information contained in the provided data, or may be defined separately for each source or each data point in the system. Note that this process can also be applied to previously processed or stored location data to continually improve the quality of results in the system.

Discard location data associated with devices that have been marked as inactive or deleted by the system. The system may use a variety of methods, such as analyzing the time since the last data point reported for a device, mobile operator registration data, or other ways of identifying a particular device as no longer being used. Once the devices are tagged, filters can be used in processing the historical location data to eliminate data points from these devices from the process.

Mobile device ID filters: these filters evaluate the mobile device IDs passed to the system to check for existing IDs and identify which other mobile device IDs may be associated with the same device.
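A few of the filters listed above can be sketched as small functions. This is an illustrative sketch only: the function names, the fixed UTC offset per provider, and the default rounding precision are assumptions, and the blacklist is assumed to hold coordinates already rounded to the same precision.

```python
from datetime import datetime, timedelta, timezone

def normalize_timestamp(local_dt: datetime, utc_offset_hours: float) -> datetime:
    """Convert a naive provider-local timestamp to UTC (one common time zone)."""
    tz = timezone(timedelta(hours=utc_offset_hours))
    return local_dt.replace(tzinfo=tz).astimezone(timezone.utc)

def is_night_event(local_dt: datetime) -> bool:
    """Flag events between 6:00 PM and 6:00 AM local time for higher priority."""
    return local_dt.hour >= 18 or local_dt.hour < 6

def clean_location(lat, lon, blacklist=frozenset(), decimals=5):
    """Return a rounded (lat, lon) pair, or None if the record should be discarded."""
    if lat is None or lon is None:
        return None  # coordinates missing altogether
    if not (-90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0):
        return None  # outside the valid coordinate range
    point = (round(lat, decimals), round(lon, decimals))
    # Blacklisted (default) locations are dropped.
    return None if point in blacklist else point
```

Rounding to a fixed number of decimals is one simple way to normalize resolution across sources; per-source weighting, as described above, would need additional metadata.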

Once the mobile event data has been processed through the filters and stored in the database, it is ready for location analysis. Location analysis is the process by which the system analyzes all of the filtered mobile event location data associated with a single device to identify the locations most frequently associated with the mobile device. This process uses a density-based scanning algorithm to group the data points and find the center location of each group of data points. Note that any other type of grouping algorithm may be employed.

The density-based scan may treat the latitude/longitude pair of each mobile/location record as a single point for cluster analysis. Clustering is performed for each device ID and may use a variety of algorithms. The algorithm may use the following two parameters:

Eps (ε): the maximum radius of a point's neighborhood. The current embodiment uses 30 feet, but the setting can be adjusted to balance accuracy with processing time.

MinPts: the minimum number of points in the Eps-neighborhood (the specified radius). The current embodiment uses 10 as this value, but other settings may be used to balance accuracy with processing time.

The algorithm can identify clusters of points that meet the density requirement of MinPts within Eps. Each data point may then be classified. Some embodiments use the following categories:

Core points are points with more than a specified number of points (MinPts) within Eps. These are points located inside the cluster.

Boundary points have fewer than MinPts points within Eps but lie in the neighborhood of a core point (within Eps).

Noise points are any points that are not core points or boundary points. These points are ignored.

The clustering algorithm of one or more embodiments may work by:

Arbitrarily select a point p.

Retrieve all points that are density-reachable from p with respect to Eps and MinPts.

If p is a core point, a cluster is formed.

If p is a boundary point, no points are density-reachable from p, and DBSCAN (density-based spatial clustering of applications with noise) visits the next point in the database.

Continue the process until all points have been processed.

The result of the clustering process may be a list containing the core point locations and the number of data points associated with the locations. The locations may then be sorted from highest frequency to lowest frequency based on the number of data points associated with the locations. The generated location is geographic location coordinates using latitude and longitude, although any location reference system may be used.
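The clustering steps above can be sketched as a small, unoptimized DBSCAN. This is a minimal sketch under stated assumptions: the haversine distance, the convention that a point counts toward its own neighborhood, and the use of each cluster's centroid as its reported location are choices made here for illustration. The defaults follow the 30 feet and 10 points given in the text.

```python
import math
from collections import Counter

FEET_PER_METER = 3.28084

def distance_feet(p, q):
    """Approximate great-circle (haversine) distance in feet between (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(math.radians, (p[0], p[1], q[0], q[1]))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371000.0 * math.asin(math.sqrt(a)) * FEET_PER_METER

def dbscan(points, eps_feet=30.0, min_pts=10):
    """Label each point with a cluster index (0, 1, ...) or -1 for noise."""
    UNVISITED, NOISE = None, -1
    labels = [UNVISITED] * len(points)

    def neighbors(i):
        # Brute-force O(n) scan; the point itself is included in its neighborhood.
        return [j for j, q in enumerate(points)
                if distance_feet(points[i], q) <= eps_feet]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE          # may later be relabeled as a boundary point
            continue
        cluster += 1                   # i is a core point: start a new cluster
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster    # boundary point: joins, but is not expanded
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            j_neighbors = neighbors(j)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is also a core point: keep expanding
    return labels

def ranked_locations(points, labels):
    """Return [(centroid, point_count)] sorted from most to fewest points."""
    counts = Counter(label for label in labels if label >= 0)
    ranked = []
    for label, count in counts.most_common():
        members = [p for p, l in zip(points, labels) if l == label]
        centroid = (sum(p[0] for p in members) / count,
                    sum(p[1] for p in members) / count)
        ranked.append((centroid, count))
    return ranked
```

The neighbor search here is quadratic in the number of points per device; a spatial index would be the usual way to trade memory for processing time.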

Function 2. Identify the street address associated with each mobile device

Once the mobile event data has been processed and a resulting list of locations is generated for each device, these locations may be associated with the data source in one of two ways. 1) The location identifier (e.g., a latitude/longitude pair) associated with each location may be compared to the location identifiers stored in the data source. If a data source uses street addresses and does not include location identifiers, the system generates location identifiers that can be used for comparison as part of processing the input data from that source. 2) The second approach is to convert the location generated for each device to a street address (e.g., 123 Main Street, Anytown, CO, 80301) using a commercially available reverse geocoding service or database. This process aims to identify two primary addresses for each device:

"Residential" address: the home address is critical for linking mobile devices to business and government databases that use the home address as a key field. Home address matching may match many devices to the same home address, even if the address is a single-family home, because a family has multiple devices and multiple individuals. This is considered a "family"-level match of the data returned from the database. One anomaly is a multi-family dwelling, such as an apartment building. Since the geolocation data used cannot distinguish apartment numbers or floors, multiple households in a multi-family dwelling will share the same address.

"Commercial" (daytime) address: may be a merchant, school, retail, or other commercial location. The daytime address is critical to identifying a family or individual within a single-family or multi-family dwelling unit, because the residential address matches only at the family level for a single-family home. The daytime address is compared to other databases, including point-of-interest data, merchant directories, and other data sources that may be used to identify commercial and public entities at a location.

Commercial reverse geocoding services return addresses of widely varying quality, attempting to return the street address that is closest to the incoming geocode. These addresses are then compared with the addresses used as keys in a business database containing profile information. In some embodiments, the system analyzes the returned address against the business database and classifies it into one of the following categories:

Full match: an address found in the business database.

Full match with city alias: an address found in the business database when a city alias is used. Some cities have postal address names that differ from their geocoded addresses.

Incomplete match, very close to the address: the street number does not match exactly but matches a street number within +/-N house numbers of the address (where N can be defined in the system).

Incomplete match, very close to the address using a city alias: the street number does not match exactly but matches a street number within +/-N house numbers of the address when a city alias is used (where N can be defined in the system).

Incomplete match, a little further from the address: the street number does not match exactly but matches a street number between N and M house numbers from the address (where N and M may be defined in the system).

Incomplete match, a little further from the address using a city alias: the street number does not match exactly but matches a street number between N and M house numbers from the address when a city alias is used (where N and M may be defined in the system).

Incomplete match, far from the address: the street number does not match exactly but matches a street number outside +/-M house numbers of the address (where M may be defined in the system).

Incomplete match, far from the address using a city alias: the street number does not match exactly but matches a street number outside +/-M house numbers of the address when a city alias is used (where M may be defined in the system).

Failure to match: no match condition is satisfied.

Failure to match even with aliases: no match condition can be met even with city aliases.

Unmatched addresses

Failure to match, no latitude/longitude or virtual latitude/longitude: the reverse geocoder cannot even return an address.

These categories may be used to rank the quality of the returned matches and improve the quality of the provided data. When creating location data points from a street address for a business data source in a system, these categories may also be used to rate the quality of location data points created from the street address.
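The house-number tiers above can be sketched as a small classifier. This is an illustrative sketch that assumes the street and city have already been matched and only the house number remains to be compared; the function name, the label strings, and the default values of N and M are hypothetical.

```python
def classify_match(returned_number, db_numbers, n=2, m=10, used_city_alias=False):
    """Classify a reverse-geocoded house number against database numbers on the same street.

    n and m are the system-defined thresholds N and M (defaults here are placeholders).
    """
    suffix = " (city alias)" if used_city_alias else ""
    if not db_numbers:
        return "no match" + suffix
    # Distance, in house numbers, to the nearest address in the database.
    gap = min(abs(returned_number - number) for number in db_numbers)
    if gap == 0:
        label = "full match"
    elif gap <= n:
        label = "incomplete match, very close"
    elif gap <= m:
        label = "incomplete match, a little further"
    else:
        label = "incomplete match, far"
    return label + suffix
```

Because the labels are ordered by distance, they can be mapped directly to a quality rank when scoring returned matches.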

Function 3. Linking mobile device IDs to data sources at the home and personal level

Once the home address associated with the device has been identified, it can be linked to data provided in any database that uses it as a key element. These databases may be commercial, merchant, marketing, government, law enforcement, healthcare, or any other database containing family or personal information.

Matching devices to family and personal data using the residential address results in a one-to-one match for a family of only one person, or a many-to-many match for a family of multiple individuals, where many devices associated with the address need to be matched to individuals in the family. For a multi-family unit, such as an apartment building, multiple devices match the address; each must first be matched to a particular household in the building and then to an individual in that household. While device matching at the family level is useful, it is more desirable to be able to identify the device associated with a single individual in a family, or to identify families in a multi-family dwelling unit.

To match devices to individuals within a home or multi-family residence, analysis of the non-residential location clusters associated with each device ID may be used. The simplest case is identifying an individual within a household. The system uses external data sources that provide data about each person in the family that can be correlated with the characteristics of the location clusters. These data sources may be marketing data providers or online databases such as LinkedIn and Hoover's. A point-of-interest database or other database containing location-related information that can be associated with a location cluster may be useful for comparison with known data about individuals (e.g., interests, hobbies, entertainment activities, purchases, etc.).

The individual associated with a device may be uniquely identified by comparing data associated with the non-residential locations generated for each device with known data about the individual. Similarly, age information may be compared to a location record corresponding to a school to uniquely identify other family members. Throughout the process, devices may be associated with individual members of the family and, by process of elimination, may potentially be associated with individuals for whom no direct data match can be achieved. Personal identification is particularly important for services that are prohibited from measuring, tracking, analyzing, or serving children.

Personal identification within a multi-family home address is performed in a similar manner, with one enhancement. Additional processing is first performed to identify the devices associated with each household in the multi-family dwelling. This process uses an overlap analysis of the data for each mobile device to determine which devices have a large number of locations in common, indicating that the owners of these devices are often together, as family members are. Once devices have been grouped into a household, the same process used for identifying individuals within a home can be performed to identify individual devices.

Function 4. Linking mobile device IDs without home location data

Some mobile event data sources provide data only from commercial or public locations and do not include any residential locations after processing. To match the mobile device ID associated with this "non-residential" data (NR data) back to residential-based data sources, the data may be linked to other IDs that have been linked to those data sources.

The process may use an overlap analysis of event location and timestamp data from the NR data with event location and timestamp data from the link source. In some embodiments, this analysis may build possible matches based on the number of overlapping occurrences, and also allow for variations in the time of occurrence of events from different sources, as it is rare to find a complete match.

Some embodiments of the overlap analysis may include the steps of:

1) Each record in the NR data is compared by location to the location records from a data source that includes residential data.

2) For location-matched records, the timestamp is compared to the timestamp of the NR record (t) to find records within a particular tolerance N. Records within the window from t-N to t+N are considered possible matches for the device.

3) The count may be created by device IDs from residential data sources that may match the device IDs from the NR data. These counts are then ordered from largest to smallest, with the largest representing the most likely match between the two data sets.

4) The potentially matching residential data source device IDs are then compared with the likely matches of all other NR data devices to determine if the multiple NR data device IDs are likely matches of the same residential data device ID. If more than one device is likely to match, they are ranked by the highest number of matches.

This process is repeated for existing NR data using different parameters, or whenever new NR data or new residential data is acquired, to improve the results and obtain the highest-quality match possible. Once a match is found, all home and personal data linked to the residential data device ID can also be linked to the NR data device ID.
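The four overlap-analysis steps can be sketched as follows. This is a minimal sketch, assuming each record is a (device_id, lat, lon, unix_timestamp) tuple and that two locations "match" when they round to the same coordinates; the function names, the rounding precision, and the 15-minute default for N are assumptions made for illustration.

```python
from collections import Counter

def overlap_matches(nr_records, residential_records, window_seconds=900, decimals=4):
    """Steps 1-3: count location/time overlaps per (NR device, residential device).

    Returns {nr_device: [(residential_device, count), ...]} sorted largest first.
    """
    by_location = {}
    for device, lat, lon, ts in residential_records:
        key = (round(lat, decimals), round(lon, decimals))
        by_location.setdefault(key, []).append((device, ts))

    counts = {}
    for nr_device, lat, lon, t in nr_records:
        key = (round(lat, decimals), round(lon, decimals))
        for res_device, ts in by_location.get(key, []):
            if abs(ts - t) <= window_seconds:        # within t-N .. t+N
                counts.setdefault(nr_device, Counter())[res_device] += 1
    return {device: counter.most_common() for device, counter in counts.items()}

def candidates_by_residential_device(matches):
    """Step 4: for each residential device, rank the NR devices competing for it."""
    ranked = {}
    for nr_device, pairs in matches.items():
        for res_device, count in pairs:
            ranked.setdefault(res_device, []).append((nr_device, count))
    for res_device in ranked:
        ranked[res_device].sort(key=lambda pair: pair[1], reverse=True)
    return ranked
```

Re-running the same functions with a different `window_seconds` or `decimals` corresponds to repeating the process with different parameters.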

This process may be performed for any device data that does not include a residential location, such as public Wi-Fi data, Bluetooth data, digital outdoor sensor data, in-store sensor data, and so forth.

Function 5. Identify "social network" groups

Unique groupings of devices can be created through additional location and data analysis. These identified "social networks" may be sold as unique audiences that are used to contact socially relevant groups without relying on traditional social networking sites (e.g., Facebook) to provide the data. The added value of groups created by system analysis is that these are real-life face-to-face social groups, not just groups that may be merely virtual online groups.

Grouping, or device linking, arranges the mobile devices into social graphs, where connections are inferred from visits to the same locations at approximately the same time. The more often two devices are seen in the same place at the same time, the stronger or more likely the real-world social connection. Social graphs are used to expand mobile device audiences, relying on the assumption that the members of a mobile device user's social network are often interested in the same products.

Various embodiments may use a variety of methods to identify social network groups: 1) location-specific groups, based on a particular input location or location/date/time, or 2) autonomous multi-location groups. Each type of group offers different benefits to the advertiser. Location-specific groups tend to identify larger, macro-level audiences, such as audiences that show interest in a particular type of sporting event, entertainment, or retail category. Groups may be identified based on characteristics of the locations or points of interest where users are found together. For example, devices found together on a series of bike paths may be identified as riders. The user's persona may be stored on a node or edge of the graph, that is, associated with the device or the person itself, or with a link that connects many devices or people together. Multi-location groups are smaller groups that exhibit more common characteristics of interest, thereby providing a more focused audience.

According to some embodiments, the process of identifying a social network group from a mobile device location or event dataset may include the steps of:

1) Data is processed as described in "Function 1" of the "data operation flow" above, with the following modification. Rather than grouping the data by device before running the clustering algorithm, the data may be grouped by discrete dates and time periods, e.g., October 5 from 12:00 PM to 12:15 PM, and then run through the clustering algorithm. This generates clusters based on location, each cluster containing multiple devices. This is done for multiple dates/time periods.

2) An algorithm is used to compare devices present in one cluster for one date/time period with devices present in clusters for other date/time periods and identify which devices appear together in many different clusters for different date/time periods.

3) An algorithm scores the quality of the possible associations between the devices identified in 2) above.

4) A database is created that identifies "social groups" of devices, each group having a unique identifier.

5) The system creates a device list containing location records and assigns a unique group ID for future reference.

The process by which the system autonomously identifies social network groups is more involved because of the amount of data that must be processed. In accordance with one or more embodiments, the steps of autonomously creating social network groups may comprise:

1) The system sorts and segments all location records in the system by date and by time block within each date. The time blocks may be specified in N-minute increments. For example, a time block of 15 minutes (N = 15) would divide all records for a particular date into groups with times 00:00 to 00:14, 00:15 to 00:29, 00:30 to 00:44, etc., over the entire 24-hour period.

2) The position coordinates in each time block are grouped using the same type of clustering algorithm described in function 1 above. The generated groups are divided by location and include all devices and will result in multiple groups being created for each time block. These groups are assigned a temporary group identifier, e.g., T1G1, for time block 1, group 1.

3) The system may then create a table where the rows represent individual device IDs and the columns represent group identifiers. If a device is present in a group, the corresponding cell may be marked with a presence indicator (1, true, etc.). If the device is not present in the group, the cell may be empty.

4) The system may then compare one device at a time with all subsequent devices, row by row. If another device overlaps in at least Z location groups (where both devices have a "1" in the location group), where Z is an operator-entered variable, then the two devices are placed into a new table that is keyed by social group ID (SG0, SG1, SG2, etc.), with a list of devices associated with each social group. Each time a device is added to a new social group, a counter is updated in the device list.

5) The system repeats this process for the next device, but only compares with subsequent devices, not those previously analyzed.

6) Once all devices have been analyzed, there is a large list of social groups that are recognized by the system. A single device may be in zero, one, or multiple identified social groups.

7) Counters in the device list may be used to identify and rank social influencers by reach (most groups to fewest groups).

8) The social group table may also be processed into a relationship graph format to identify relationships between groups.
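Steps 1 and 3 through 7 can be sketched as follows, taking the temporary group assignments from step 2 as given. This is one possible reading of the text: here each device anchors at most one social group made of itself and the subsequent devices that overlap it in at least Z groups. The function names and the set-based representation of the device/group table are assumptions.

```python
from collections import Counter

def time_block(hour, minute, n=15):
    """Step 1: index of the N-minute block within a day (00:00-00:14 is block 0 for N = 15)."""
    return (hour * 60 + minute) // n

def build_social_groups(membership, z):
    """Steps 3-7 under the interpretation described above.

    membership: {device_id: set of temporary group IDs, e.g. {"T1G1", "T2G3"}}
                (equivalent to the rows of the device/group table).
    Returns (social_groups, counters), where counters[device] is the number of
    social groups the device belongs to.
    """
    devices = sorted(membership)
    social_groups = {}
    counters = Counter()
    for i, device in enumerate(devices):
        # Compare only with subsequent devices, never with those already analyzed.
        members = [other for other in devices[i + 1:]
                   if len(membership[device] & membership[other]) >= z]
        if members:
            group_id = f"SG{len(social_groups)}"
            social_groups[group_id] = [device] + members
            counters.update(social_groups[group_id])
    return social_groups, counters
```

`counters.most_common()` then gives the ranking by reach described in step 7.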

FIG. 6 illustrates a graph structure 600 corresponding to social links 610 in a social network. The constructed graph has nodes 620 corresponding to device IDs (MAIDs), with edges (connections) representing the relationships 610. Attached to each node 620 is metadata consisting of a mobile ad ID (MAID) value, and attached to each edge 610 is metadata such as a count, a list of fine S2 hashes, and a list of timestamps.

The graph includes three nodes 620. Node 620A, with metadata "MAID: a12…", connects to node 620B, with metadata "MAID: b34…", and to node 620C, with metadata "MAID: c45…". The edges themselves have metadata. The edges are symmetric. Queries against the graph may filter on the metadata of the nodes 620 and the edges 610.

Social graphs are used to enlarge audiences, and they can be used in both the forward and reverse directions. The following are some example scenarios. First, build an audience of devices seen at auto dealers. The initial audience may then be expanded using the social graph in a variety of ways: adding all first-level connections seen with the original device five or more times; adding first-level connections seen with the original device only during off hours; or adding first- and second-level (friends of friends) connections. Second, build an audience of people who visit car-related places that suggest advanced car knowledge, such as tracks, car shops, and cars-and-coffee gatherings, and then expand that audience. One modification is to extend only to social connections that have recently visited auto dealers, regardless of whether they visited the dealers together with the original device. In this way, the system can use the social network to connect those who want to purchase a car with those in their social network who know about cars.

The system may also enhance the social network by overlaying data from the linked database to provide a representation of the social groups and further segment them by these conditions to create subgroups. This process may be repeated using different time periods and/or new or modified data to improve results, identify changes, and increase confidence levels of the quality of the identified social groups.

FIG. 7 is a flow diagram illustrating an embodiment of a method of generating a location social network. In step 710, the system collects observation data on mobile devices. The data includes longitude and latitude locations as well as a device ID and a timestamp. The associated tuple can be represented as: mobile ad ID (MAID), latitude/longitude, timestamp. In some embodiments, the collected data also includes metadata about the place of interest at the identified longitude/latitude, i.e., the location is identified as a residential location (further divided into apartment buildings or villas/detached houses), a commercial location with a particular focus (e.g., a comic book store), a public place (e.g., a park), or a public way (e.g., a street/sidewalk). In some embodiments, the collected data is restricted by geographic and/or temporal limits. The associated output may be divided into a number of output files (e.g., CSVs) in a columnar format.

The system processes the data using the MapReduce method, where the final output is in a format suitable for ingestion into a graph database. In some embodiments, the method uses a single mapper step and two reducer steps, plus an optional third reducer step. In step 720 (the mapper), the system reads the observation data and operates on each record individually. In some embodiments, the mapper step comprises rounding the timestamp to the nearest hour (up or down). It is unlikely that two devices at the same location will report locations at exactly the same time, so some rounding helps generate social links between devices.

The mapper further computes a location hash (e.g., an S2 geometry hash) at a coarse level and a fine level. Example coarse and fine levels are 12 and 20, respectively. Level 12 buckets are about 2 kilometers per side and level 20 buckets are about 10 meters per side. The specific levels are parameters that can be changed. The coarse level is tuned so that subsequent computation steps remain tractable: too fine a coarse bucket may result in an unmanageable number of reduction tasks, while too coarse a bucket may result in unbalanced work or out-of-memory errors. The fine buckets are sized small enough that devices in the same bucket are meaningfully close together, but not so small that all meetings between devices are eliminated (e.g., a bucket one centimeter on a side would contain few or no meetings between devices).

In some embodiments, rather than location buckets, the Euclidean distance between each pair of devices is used. Pairs of devices falling within a given threshold proceed to the following step. To reduce computational cost, location buckets may be used to pre-filter the devices before the Euclidean distance processing.

The mapper emits the observation (MAID, latitude/longitude) under the combined key (coarse S2 bucket, timestamp). The key is a string with a connector between its parts (e.g., "SSSSSSSSSSSS_YYYY-MM-DD-HH"), where S is the S2 hash value, Y is the year, M is the month, D is the day, and H is the hour.
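The mapper step can be sketched as follows. This sketch does not use a real S2 library: the fixed latitude/longitude grid below is a hypothetical stand-in for S2 cells, chosen so that each coarse cell is an exact union of fine cells. The cell sizes, the function names, and the extra fields carried in the emitted value are assumptions made for illustration.

```python
import math
from datetime import datetime, timedelta

FINE_PER_DEGREE = 10_000  # assumption: fine cell of 0.0001 degree (about 11 m of latitude)
FINE_PER_COARSE = 200     # assumption: coarse cell = 200 x 200 fine cells (about 2.2 km)

def round_to_hour(ts: datetime) -> datetime:
    """Round a timestamp to the nearest hour (30 minutes or more rounds up)."""
    floored = ts.replace(minute=0, second=0, microsecond=0)
    if ts - floored >= timedelta(minutes=30):
        return floored + timedelta(hours=1)
    return floored

def fine_bucket(lat: float, lon: float):
    """Stand-in for a fine S2 cell: integer grid indices of the containing cell."""
    return (math.floor(lat * FINE_PER_DEGREE), math.floor(lon * FINE_PER_DEGREE))

def coarse_bucket(fine):
    """Stand-in for the coarse parent cell; integer division keeps fine cells nested."""
    return (fine[0] // FINE_PER_COARSE, fine[1] // FINE_PER_COARSE)

def map_observation(maid: str, lat: float, lon: float, ts: datetime):
    """Emit (key, value) with key '<coarse bucket>_<YYYY-MM-DD-HH>'."""
    hour = round_to_hour(ts)
    fine = fine_bucket(lat, lon)
    coarse = coarse_bucket(fine)
    key = f"{coarse[0]}:{coarse[1]}_{hour:%Y-%m-%d-%H}"
    # The fine bucket and rounded hour are carried along for the first reducer.
    return key, (maid, lat, lon, fine, f"{hour:%Y-%m-%d-%H}")
```

With a real S2 library, the two bucket functions would be replaced by cell IDs at the chosen coarse and fine levels; the key layout stays the same.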

In step 730 (the first reducer), the method processes the observations under the combined keys of step 720. The method operates on all observations with matching keys from the mapper. In particular, observations having matching fine geographic buckets and timestamp values are grouped together. To get matching observations in the fine buckets, the system first performs a reduction based on the coarse bucket. The fine S2 buckets tessellate the coarser bucket above them. Because finer buckets tessellate coarser buckets and the method performs the reduction on coarse buckets, there is no overlap between reducers on the fine buckets.

In some embodiments, fine bucket groups having more than a certain number of devices are filtered out. There are two reasons for removing crowded fine bucket groups: reducing computational complexity and improving inferences about social connections. First, the next step has O(N²) computational complexity, which becomes cumbersome for very large groups. Second, places/times with a large number of devices are less informative for building a social network. Two devices being at the same major sporting event (e.g., a wild horse race) is less meaningful than their being the only two devices at a certain location/time (e.g., a quiet community park).

For each remaining fine bucket group, the method constructs pairwise relationships between all devices in the group (this is the O(N²) step noted above). The pairwise relationship uses tuples (device_A, device_B), with additional data (fine S2 hash, rounded timestamp), as keys. In some embodiments, the pairwise relationship is symmetric, so no additional mirror record, e.g., (device_B, device_A), is needed.

In step 740 (the second reducer), the method processes the device relationships. For each device pair, the second reducer receives a list of (device_A, device_B, fine S2 hash, rounded timestamp) values and sorts and maps the values. That is, sorting the pairings yields a count of how often device_A and device_B appear together. One example output is a list of fine S2 hashes and a list of YYYY-MM-DD-HH timestamps at which the two devices were observed together.

Given the list for each device_A and device_B pair (including counts), the second reducer applies a count filter, removing device pairs with too few meetings (e.g., removing all device pairs with fewer than 5 events). The count filter may also tier the pairings. For example, five or more events may form a first tier, 10 or more events a second tier, 15 or more events a third tier, and so on. Higher tiers represent stronger social links. The above lists and counts are used to generate the metadata attached to the graph relationships.
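The two reducer steps can be sketched as plain functions over in-memory lists. This is an illustrative sketch rather than a distributed MapReduce job: the function names, the crowd threshold `max_devices`, and the dictionary layout of the edge metadata are assumptions, while the minimum count of 5 and the tiers (levels) at 5, 10, and 15 events follow the examples in the text.

```python
from collections import defaultdict
from itertools import combinations

def first_reducer(observations, max_devices=50):
    """Pair up devices seen in the same fine bucket during the same rounded hour.

    observations: iterable of (maid, fine_bucket, hour_key) sharing one coarse key.
    Yields (device_a, device_b, fine_bucket, hour_key) with device_a < device_b,
    so no mirrored (device_b, device_a) record is produced.
    """
    groups = defaultdict(set)
    for maid, fine, hour in observations:
        groups[(fine, hour)].add(maid)
    for (fine, hour), devices in groups.items():
        if len(devices) > max_devices:
            continue  # crowded place/time: costly and uninformative
        for a, b in combinations(sorted(devices), 2):  # the O(N^2) step
            yield (a, b, fine, hour)

def second_reducer(pairs, min_count=5, tier_size=5):
    """Aggregate pair records into edges with a count, a tier, and metadata lists."""
    edges = defaultdict(lambda: {"fine_hashes": [], "timestamps": []})
    for a, b, fine, hour in pairs:
        edges[(a, b)]["fine_hashes"].append(fine)
        edges[(a, b)]["timestamps"].append(hour)
    result = {}
    for pair, meta in edges.items():
        count = len(meta["timestamps"])
        if count >= min_count:  # count filter
            # Tier 1 for 5-9 events, tier 2 for 10-14, and so on.
            result[pair] = {"count": count, "tier": count // tier_size, **meta}
    return result
```

The returned dictionary maps each device pair to the metadata that would be attached to the corresponding graph edge.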

In step 750, the method may further classify social links between the devices based on the metadata included with the respective location observations and time periods. For example, observations that occur only on weekdays may indicate that the respective owners of the two devices are co-workers. If many meetings are observed in a wide variety of types of places over a long period of time, this may indicate that the respective owners are in a romantic relationship. Where meetings are only observed at entertainment venues, the users may be friends. Each classification is given a confidence score. Classified relationships are searchable attributes in an associated database search engine. In some embodiments, the classified relationships are filtered using a predefined confidence score.

In particular, a graph database can efficiently return all devices associated with a given originating device: "given device_A, return all devices associated with it". Because the (fine S2 hash, rounded timestamp) values are saved as a list, results may be further filtered by associated location and time. For example, the following questions may be posed: "given device_A, return all devices that appear more than 5 times with device_A" or "given device_A, return all connected devices seen at a given location between given times of day."

The graph database allows one to traverse the network to an arbitrary depth. This is how "friends of friends" are found: "given device_A and all devices connected to device_A, return all devices connected to device_A and all of their connected devices." A query may select any number of degrees of interest to the searcher. For example, "given device_A, return all devices within 4 degrees of device_A." The result will include all links to device_A, all links to those devices, and all links up to two degrees beyond those subsequent devices. Any of the above searches may be combined to filter in a variety of ways.
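The degree-limited queries above can be sketched with a breadth-first traversal over an in-memory adjacency map. This is a minimal sketch, not a graph database query: the function name, the dictionary representation of the graph, and the `min_count` edge filter are assumptions made for illustration.

```python
from collections import deque

def devices_within(graph, start, max_degrees, min_count=1):
    """Return {device: degree} for devices within max_degrees hops of start.

    graph: {device: {neighbor: meeting_count}}, with each edge stored in both
           directions because the relationships are symmetric.
    Only edges with at least min_count meetings are followed.
    """
    degrees = {start: 0}
    queue = deque([start])
    while queue:
        device = queue.popleft()
        if degrees[device] == max_degrees:
            continue  # do not expand beyond the requested depth
        for neighbor, count in graph.get(device, {}).items():
            if count >= min_count and neighbor not in degrees:
                degrees[neighbor] = degrees[device] + 1
                queue.append(neighbor)
    del degrees[start]
    return degrees
```

For example, `devices_within(graph, "device_A", 2)` corresponds to a "friends of friends" query, and raising `min_count` to 5 restricts the traversal to pairs seen together at least five times.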

The social network graph may also be queried for users found at each of multiple places. If both sites and users/devices exist as nodes in the graph, site nodes cannot be directly connected to each other. However, a user node may act as a path between two or more site nodes. An administrator may query for paths between any two site nodes. An example query is "given site_A and site_B, return all devices that have gone to both in case_X". This example query returns a list of devices, from which further queries may identify the social networks of these devices. The premise is that if a person of a certain desired type visits two particular locations, their friends may be the same type of person.

There are a number of ways to construct social network graphs. In some embodiments, the relationship data is represented in chunks divided by time, e.g., August 2019 and September 2019 are separate tables. Doing so allows old data to be removed. However, a major drawback of time-chunked data is that the total number of meetings between two devices across all time is more difficult to calculate and filter.

In some embodiments, the social network graph is built incrementally (micro-chunking). Micro-chunks are short time spans (e.g., hourly or daily). The results are added incrementally to the database: if two devices are already linked in the database, the operation appends a new meeting to their existing link; otherwise, a new relationship is stored. The advantage is that the system can more easily query and filter the entire history between two devices. The disadvantage is that the computational cost of iterating over the entire data set (for creation and deletion) is higher.
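A minimal sketch of micro-chunked graph building follows, assuming observations arrive as (device, fine S2 hash, rounded timestamp) tuples and that two devices "meet" when they share the same cell and rounded timestamp. The dictionary standing in for the database, and the names `colocated_pairs` and `apply_micro_chunk`, are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def colocated_pairs(observations):
    """Yield ((device_a, device_b), (cell, timestamp)) for every pair of
    devices observed in the same cell at the same rounded timestamp."""
    by_key = defaultdict(set)
    for device, cell, timestamp in observations:
        by_key[(cell, timestamp)].add(device)
    for key, devices in by_key.items():
        for pair in combinations(sorted(devices), 2):
            yield pair, key


def apply_micro_chunk(graph, observations):
    """Upsert one micro-chunk: append a meeting to an existing link,
    or store a new relationship if the pair is not linked yet."""
    for pair, meeting in colocated_pairs(observations):
        graph.setdefault(pair, []).append(meeting)
    return graph


graph = {}
apply_micro_chunk(graph, [("a", "cell1", 100), ("b", "cell1", 100), ("c", "cell2", 100)])
apply_micro_chunk(graph, [("b", "cell3", 200), ("a", "cell3", 200)])
print(graph)
```

Because every meeting is kept on the link, the total number of meetings between two devices is simply the length of the stored list.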

Function 6. Delivery of data to applications

Once mobile devices have been linked to households and individuals within a variety of databases, this data can be delivered to consumer applications for a variety of business, public safety, and other uses. In this context, a consumer is any mobile, Web-based, or other type of application that uses the data to provide services. The first embodiment supports mobile advertising networks, which target advertisements using interest data delivered by the system, but the data can be used in any application based on the use of mobile devices or location information.

The service may provide real-time or historical information from the database to the consumer application. These applications may receive data from the system as file transfers or Web-based synchronous or asynchronous services based on JSON or other similar protocols. Two main modes of providing data to consumer applications are: 1) device specific requests, and 2) location requests.

The device-specific request is designed to return household or personal information associated with a particular device. For this service, the consumer application passes the system a mobile device ID that is appropriately encrypted or obfuscated, and the service returns a set of anonymous data associated with the device ID.

The location request may take two forms, but in each form the consumer application passes in the location, typically in latitude/longitude format, for which it is requesting information from the system. The first type of location request generates a combined response for all mobile devices within a particular radius of the location. The second generates individual-level responses for all devices within a certain radius of the location. For both types of requests, the system uses real-time movement event data to identify mobile device IDs near the requested location.

The combined response request builds an aggregated view of all devices. This is typically used for marketing-type services, which look for the characteristics of a group. In this response, the system combines the data for each data field to be returned and provides a weighted percentage of the value in each data field. For example, if one of the data fields is "male" and 10 devices are identified near the location, 3 of which are linked to data marking that field as "yes," the system will return a response telling the consumer application that there are 10 devices in total and that "male" is 30%.
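The "male is 30%" example can be reproduced with a short aggregation sketch. It assumes each nearby device is represented by a dictionary of yes/no fields and uses a plain, unweighted percentage; the function name `combined_response` and the output keys are hypothetical, and any weighting an embodiment applies is not modeled.

```python
def combined_response(device_records, fields):
    """Aggregate per-device yes/no fields into group-level percentages."""
    total = len(device_records)
    response = {"total_devices": total}
    for field in fields:
        yes_count = sum(1 for record in device_records if record.get(field) == "yes")
        response[field] = round(100.0 * yes_count / total, 1) if total else 0.0
    return response


# 10 devices near the location, 3 of which are marked "male": "yes".
nearby = [{"male": "yes"}] * 3 + [{"male": "no"}] * 7
print(combined_response(nearby, ["male"]))
```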

The individual response request builds an array of all data, organized by individual device, and passes it back to the consumer application. This allows the consumer application to view each person's data separately. Note that the system may or may not return the encoded device ID as part of this service.

Target data for use by partners

The system does not provide advertising services. To supply data for online advertising use and deliver targeted offers to consumers, the system may share aggregated buyer audience data (e.g., furniture buyers) with selected advertising services, distribution, or advertising network partners, as described below. There are 4 ways to do this.

Option 1: serving audience level data in an advertising network via PII matching

The system provides data to an independent data processor, acting as a third-party partner, that performs PII-based database matching with other Network Advertising Initiative (NAI) members. Typically, some or all of the following fields are used for matching: mobile number, device ID, MAC address, name, address.

The process for matching data according to some embodiments is summarized as the following steps and is shown in fig. 4:

The system creates a data file containing a name, address, phone number, UDID, MAC address, or other identifier for matching the audience, and transmits it to the data processor over a secure channel.

The system also provides buyer audience attributes (e.g., furniture buyers) in the data file that will ultimately be used by partners of E2M for targeting.

The system file is sent to the data processor and compared with the partner file to identify matching records for output.

An output file called the system matching set is constructed to include the partner's record identifier (ANONYMOUS-ID) and the target attributes of the system.

The output file from the data processor to the partner does not contain any PII.

During matching, the independent data processor appends the system's buyer-audience-level information to the partner file records for which a PII match exists. After matching, all unmatched information may be discarded.

Match and output: the system match set is transmitted from the data processor to the partner's data store, where it will be served for digital advertising.
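The matching steps above can be sketched as follows. This is a simplified illustration, not the data processor's actual procedure: the field names (`phone`, `device_id`, `anonymous_id`, `audiences`), the normalization rule, and the function names are all assumptions made for the example.

```python
def normalize(value):
    """Lowercase and strip whitespace so that trivially different
    spellings of the same identifier compare equal."""
    return "".join(value.lower().split())


def build_match_set(system_rows, partner_rows, key_fields=("phone", "device_id")):
    """Return the match set: partner ANONYMOUS-ID plus system audience
    attributes for every system record that matches on a PII key.
    Unmatched records are dropped, and no PII is copied to the output."""
    index = {}
    for row in partner_rows:
        for field in key_fields:
            if row.get(field):
                index[(field, normalize(row[field]))] = row["anonymous_id"]

    match_set = []
    for row in system_rows:
        for field in key_fields:
            value = row.get(field)
            if not value:
                continue
            anonymous_id = index.get((field, normalize(value)))
            if anonymous_id is not None:
                match_set.append({"anonymous_id": anonymous_id,
                                  "audiences": row["audiences"]})
                break
    return match_set


system_file = [
    {"phone": "555 0100", "audiences": ["furniture_buyer"]},
    {"phone": "5550199", "audiences": ["auto_buyer"]},
]
partner_file = [{"phone": "5550100", "anonymous_id": "anon-1"}]
print(build_match_set(system_file, partner_file))
```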

Partner targeting process:

The data processor transmits the output of the matching process, the system match set, to the partner over the secure channel. The partner then performs the following steps to prepare the data for targeting:

standardizing the system buyer audience data and storing the data in its database;

anonymizing user profiles;

provisioning the system's targeting data to the online advertisement delivery system; and

starting active delivery.

Partner anonymization of user profiles:

Partner ad delivery is based on anonymous IDs rather than on PII (e.g., UDIDs) or any ID associated with PII. To enable the use of the above-described match set on partner networks for targeting, a forward (one-way) hashing technique is used to convert the PII-ID to an ANONYMOUS-ID.

Note that:

Data keyed to anonymous identifiers (ANONYMOUS-IDs) and data keyed to PII-IDs are stored in different operating environments and are deliberately not co-located.

There is no look-up table associating ANONYMOUS-IDs and PII-IDs.

The conversion from PII-ID to ANONYMOUS-ID is unidirectional.

These anonymous profiles are then moved to the user profile store to be provisioned for ad delivery.
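One way to picture the one-way PII-ID to ANONYMOUS-ID conversion is a keyed hash. The sketch below uses HMAC-SHA-256 from the Python standard library; the source does not name the hash function, so the algorithm, the secret key, the normalization step, and the function name `to_anonymous_id` are all assumptions for illustration.

```python
import hashlib
import hmac


def to_anonymous_id(pii_id: str, secret_key: bytes) -> str:
    """Derive an ANONYMOUS-ID from a PII-ID with a keyed one-way hash.
    The same input always maps to the same output, but the output
    cannot be reversed and no lookup table is stored."""
    normalized = pii_id.strip().lower().encode("utf-8")
    return hmac.new(secret_key, normalized, hashlib.sha256).hexdigest()


key = b"hypothetical-partner-secret"  # assumed; kept only in the partner environment
print(to_anonymous_id("user@example.com", key))
```

Keying the hash is a design choice made for this sketch: it makes it harder for a third party to recompute ANONYMOUS-IDs from guessed PII values.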

Partner-supplied advertisement delivery service

As previously described, each user active on the partner network may have at least one partner ANONYMOUS-ID associated with them. The partner advertisement delivery system may deliver advertisements based on this ANONYMOUS-ID. Whenever the user is on the partner network, an ad delivery request originating from the browser is associated with the user's ANONYMOUS-ID. The browser request may be fulfilled by the partner.

Advertisements using system data are delivered when the advertisement delivery system sees a user with the set of system attributes specified for that particular campaign.

Option 2: real-time provision of audience-level data for advertisement servers and advertisement networks

The system may be integrated with partner ad servers and ad networks such that when a Get_Offer request (i.e., a request to display an ad) is made within their ad serving platform, the platform makes a request to the system's target data engine. When invoking the system's recommendation engine, the ad server provides the system with a mobile device identifier, such as a hashed UDID, or a location. The system may then return audience data associated with the device or location. The advertisement server then uses the audience data provided by the system to select advertisements to display. This is shown in fig. 4.

According to some embodiments, the format of the Get_Audience request may include:

PID: a system assigned partner ID to identify the source of the request.

RT: location (1) or device (0). The default is the device.

DID: the identifier, which is the SHA-1 hash of the mobile number, device ID (UDID, IMEI, MEID, ESN), MAC address, or cookie ID, and which E2M will use to obtain audience data for a device request. For a location request, this field should instead carry a partner-generated tracking ID for the request/response.

LLAT: for a device request, the latitude of the location collected from the device (if available); for a location request, the latitude of the location for which the audience is to be aggregated.

LLON: for a device request, the longitude of the location collected from the device (if available); for a location request, the longitude of the location for which the audience is to be aggregated.

The request tag is a fully qualified URL with the query parameter set:

URL syntax:

http://on.spot.extendtomobile.com/onspot.js?PID=<PartnerID>&RT=<Value>&DID=<Hashed_ID_VALUE>&LLAT=<Latitude>&LLON=<Longitude>
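A request tag of this shape could be assembled as in the sketch below. It assumes the parameters are attached as a standard "?"-separated query string and that a device request hashes the raw identifier with SHA-1 as described for the DID field; the function name `build_audience_request` and its arguments are hypothetical.

```python
import hashlib
from urllib.parse import urlencode

BASE_URL = "http://on.spot.extendtomobile.com/onspot.js"


def build_audience_request(partner_id, raw_id, lat=None, lon=None, location_request=False):
    """Build the request URL with the PID, RT, DID, LLAT and LLON parameters."""
    params = {"PID": partner_id, "RT": 1 if location_request else 0}
    if location_request:
        # For location requests, DID carries a partner-generated tracking ID.
        params["DID"] = raw_id
    else:
        # For device requests, DID is the SHA-1 hash of the device identifier.
        params["DID"] = hashlib.sha1(raw_id.encode("utf-8")).hexdigest()
    if lat is not None and lon is not None:
        params["LLAT"] = lat
        params["LLON"] = lon
    return BASE_URL + "?" + urlencode(params)


print(build_audience_request("P42", "track-1", 39.5, -104.9, location_request=True))
print(build_audience_request("P42", "5550100"))
```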

The audience data response tag has a different format for device requests and location requests. Either type of response may take one of three forms: script, image, or i-frame. Each partner provides the system with the format/syntax required for its response tag.

Device request response. The device request response returns only the targeting information for the hashed device ID sent in the request, that is, the audience categories to which the device belongs. One example of a response is:

http://www.ThePartner.com?DID=<Hashed_ID_Value>&id=D045&id=C001&id=C004

The response returns three audience segments for the requested device ID, and the advertising network can now select an advertisement based on them.

Location request response. The location request response may return target data for the aggregated audience at a particular location. For example, if the system found 100 people near the location coordinates passed in the request, it might identify 10 people in audience D045, 1 person in C001, and 25 people in C004. The response looks like:

http://www.ThePartner.com?DID=<Partner_Tracking_ID>&tot=100&id=C004&cnt=25&id=D045&cnt=10&id=C001&cnt=1

This provides partners with a total audience size and an audience size by category, so they can weigh both quantity and quality in their advertising decisions.
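On the partner side, a location response of this form could be decoded roughly as follows. The sketch assumes the response is a URL whose parameters follow a "?" and that each `id` element is immediately followed by its `cnt`; the function name and the output dictionary keys are hypothetical.

```python
from urllib.parse import parse_qsl, urlsplit


def parse_location_response(tag_url):
    """Parse the tracking ID, total audience size, and per-segment counts
    from a location request response tag."""
    result = {"tracking_id": None, "total": 0, "segments": {}}
    current_segment = None
    for name, value in parse_qsl(urlsplit(tag_url).query):
        if name == "DID":
            result["tracking_id"] = value
        elif name == "tot":
            result["total"] = int(value)
        elif name == "id":
            current_segment = value
        elif name == "cnt" and current_segment is not None:
            result["segments"][current_segment] = int(value)
            current_segment = None
    return result


tag = "http://www.ThePartner.com?DID=T1&tot=100&id=C004&cnt=25&id=D045&cnt=10&id=C001&cnt=1"
print(parse_location_response(tag))
```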

According to various embodiments, particular items of the response tag may be customized as follows:

Delimiter: the partner may specify a single character that separates the elements. The most common is "&".

Suffix: the partner may include additional static information, which may be additional name/value elements, appended to the response tag.

Type: the partner may specify a script, an image, or an i-frame tag.
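The delimiter and suffix options can be illustrated with a small builder for the device response tag. This is a sketch only: the function `build_device_response` and its parameters are hypothetical, the "?" before the first element is assumed, and the script/image/i-frame wrapping is not modeled.

```python
def build_device_response(base_url, hashed_id, segments, delimiter="&", suffix=""):
    """Assemble a device response tag using a partner-chosen delimiter
    and an optional static suffix."""
    elements = ["DID=" + hashed_id] + ["id=" + segment for segment in segments]
    tag = base_url + "?" + delimiter.join(elements)
    if suffix:
        tag += delimiter + suffix
    return tag


print(build_device_response("http://www.ThePartner.com", "abc123", ["D045", "C001", "C004"]))
print(build_device_response("http://www.ThePartner.com", "abc123", ["D045"],
                            delimiter=";", suffix="src=sys"))
```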

Option 3: automatically supplying audience level data for advertisement servers and advertisement networks

The system can be integrated with partner ad servers and ad networks so that it sends them targeting data for each known audience member in the system database. This data may be transmitted periodically, either using the same type of response format as in option 2 or as a file transfer over secure FTP.

Custom audiences: reducing advertisement server processing

The previous discussion focused on delivering the system's standard audiences to partners. The system may also build and serve custom audiences for partners, for example, if an advertiser wants to show its advertisement to Hispanic furniture buyers with income between $100,000 and $150,000. These characteristics correspond to standard audiences of the system, D047 (Hispanic), D104 (income 100-. The system can run an offline process to build a custom audience, P1001, that already accounts for all of these characteristics, so that when the system responds to the ad server, the ad server need only look for P1001 to decide whether to display the advertiser's ad, rather than cross-checking all conditions in real time, which matters most when there are a large number of selection conditions.

To do this, the advertising network must inform the system in advance of the advertiser's campaign and conditions so that the system can provision the audience. The target provisioning time is less than one business day after receiving a partner request.
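The offline construction of a custom audience such as P1001 amounts to checking, once per device, that it belongs to every required standard audience. In the sketch below, `memberships` is a hypothetical in-memory map of device to audience codes, and `C_FURN` is a made-up placeholder for the furniture-buyer audience, whose real code is not given here.

```python
def build_custom_audience(memberships, custom_id, required):
    """Tag every device that belongs to all `required` standard audiences
    with the precomputed custom audience code."""
    required = set(required)
    for segments in memberships.values():
        if required <= segments:
            segments.add(custom_id)
    return memberships


memberships = {
    "dev1": {"D047", "D104", "C_FURN"},  # C_FURN is a hypothetical code
    "dev2": {"D047"},
}
build_custom_audience(memberships, "P1001", ["D047", "D104", "C_FURN"])

# At ad-serving time, a single membership test replaces the three-way check.
print("P1001" in memberships["dev1"], "P1001" in memberships["dev2"])
```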

Function 7. Interactive tool for querying data marts

To provide added value to customers, many interactive reporting tools may be made available via a website, mobile device, computer, or other platform. These reporting tools include, but are not limited to:

1. Audience counting tool: the system may provide a real-time interface where the user may select conditions from the available data in the database and obtain a real-time count of the number of devices in the database matching those conditions. This basic information enables a sales team or advertiser to interactively estimate audience size for an advertising campaign, in addition to other potential uses for such count data. By augmenting the base count data with other data (e.g., the number of ad requests a device makes on a particular ad network per day), a more comprehensive model of the number of ad impressions likely to occur on a particular ad network can be constructed and used to compare the effectiveness and reach of different ad networks.

2. Location device count and profile: the user may enter a location (e.g., a street address or latitude/longitude) and obtain a report of the number of devices seen at that location, or how many devices are currently at that location in real time. The user may select conditions to obtain reports that partition information by different date/day/time periods and create statistical profile reports based on the information stored in the system data marts.

3. Location-based audience: the system may provide an interface for a user to enter a location or set of locations and tag them with a set of data features that can be used to query the data mart to identify devices with particular conditions. For example, a user may enter a set of locations and identify them using the following data characteristics: "jewelry stores", "high-end", or "shopping malls". The data may be from a point-of-interest database, a government database (e.g., a business database, a retailer database, or any other commercial or private database), or may be manually entered.

Once the new data characteristics are stored in the system, the user may enter a query to "find all devices that visited the high-end jewelry store within the past 14 days". The system will be able to identify all devices already at these locations and build a profile report on the personal/family characteristics associated with these devices. In addition, the system may take the results and build a list of devices that may be targeted to an "in market" audience.

Note that these types of audiences can be built in real time by setting a trap query, so that each time the system receives and processes mobile device location data, any device that matches one of the locations in the trap query is automatically added to the audience. These audiences can then be provided to the user via the methods described in function 6 above.
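A trap query can be pictured as a standing filter applied to each incoming location event. The sketch below is an assumption-laden illustration: the `TrapQuery` class, the 50-meter default match radius, and the use of the haversine great-circle distance are choices made for the example, not details taken from the source.

```python
import math


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two latitude/longitude points."""
    radius = 6371000.0  # mean Earth radius in meters
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * radius * math.asin(math.sqrt(a))


class TrapQuery:
    """Standing query that adds a device to an audience as soon as one of
    its location events falls within `radius_m` of a trapped location."""

    def __init__(self, locations, radius_m=50.0):
        self.locations = list(locations)
        self.radius_m = radius_m
        self.audience = set()

    def process(self, device_id, lat, lon):
        for trap_lat, trap_lon in self.locations:
            if haversine_m(lat, lon, trap_lat, trap_lon) <= self.radius_m:
                self.audience.add(device_id)
                return True
        return False


trap = TrapQuery([(39.7392, -104.9903)])  # e.g., a tagged "high-end jewelry store"
trap.process("dev1", 39.7392, -104.9903)
trap.process("dev2", 40.0, -105.0)
print(trap.audience)
```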

4. Location-based profile: reports may be generated for retailers or others seeking information about individuals visiting or passing near a physical location. For example, a retailer may want to know who passes by its store every day, regardless of whether they come in. The system may provide an interface for entering a street location or latitude/longitude, together with a time period (day/date/time), and provide a profile report based on the devices that meet these conditions. This information may be used to build a device list for a mobile advertising campaign, or to analyze the origin and movement behavior of the devices.

INDUSTRIAL APPLICABILITY

Data from multiple mobile event data sources results in the rapid construction of a very large set of mobile device IDs that can be linked back to a very large number of data sources. While primarily applicable to smart phones and tablet devices of the large number of consumers of today's mobile data services, as consumers adopt these devices in the coming years, it will eventually cover the entire mobile device population.

By providing information about individuals associated with a mobile device, the system is able to build many different solutions, including but not limited to:

intelligent solutions-these solutions provide access to data, enabling applications and services to customize service delivery or user experience based on the data provided. This may be a financial application that offers different financial solutions to potential customers based on age, income, and investment information, or a mobile advertising network that delivers different advertisements based on personal shopping interests or brand preferences.

Analytical solutions-these solutions provide a comprehensive view of the population of a given area at different times. Examples include retailers planning new locations who want to know about the people coming to a particular shopping center or street, and city planners who want to understand the commute patterns of a particular area by looking at different times of day.

Situation-aware solutions-these solutions provide real-time views and information for public safety, homeland security, emergency responders, etc. Examples include information that can identify the likely number of people at an event location, along with their age, health, criminal background, etc.

These are only a small sample of the applications, as every merchant and government agency has a large amount of data that it uses today and wishes to associate with mobile devices and users in order to extend its services.

1. The present system may provide an identification of the mobile device to the individual or family, which may be used to match back any database that uses addresses as key elements to identify data.

2. The present system may provide identification of mobile devices, where the system may be used with any type of mobile device identifier, such as UDID, Wi-Fi MAC address, Bluetooth ID, browser cache files (cookies), or any other persistent or semi-persistent identifier. A semi-persistent identifier is an identifier that exists for a period of time before being changed, which may be a day, a week, a month, or more.

3. The system may provide identification of mobile devices, where the system may be used with any type of mobile device on a cellular or Wi-Fi network using any type of service plan, including subscription, corporate, prepaid, etc.

4. The present system may provide for identification of mobile devices by cross-matching multiple mobile device identifiers with a single anonymous identifier.

5. The present system can provide identification and anonymization data of mobile devices such that privacy is protected when the data is used for business purposes.

6. The present system may provide for identification of a mobile device, and may be used with any mobile device data that includes the following elements: 1) a mobile device identifier, and 2) a geographic location tag. Time/date stamps associated with the mobile device data are desirable; they may or may not be required to link the device to a database, but they are required for some applications and analyses in order to deliver different services.

7. The present system can provide an identification of the mobile device that works with any mobile device location data and takes into account the variation in accuracy of the mobile location data from the data source.

8. The system may provide an identification of the mobile device that may obtain real-time data as well as batch data.

9. The present system can provide identification of mobile devices, where the system provides delivery of link data to commercial services, merchants, governments, and other customers in two ways: 1) in response to a query about a single device, or 2) in response to a query about a location.

10. The present system can provide identification of the mobile device that does not require any subscriber data from the mobile operator to link the device back to any database.

11. The present system may provide for identification of mobile devices that do not require any location data from the mobile operator.

12. The present system may provide for identification of mobile devices, which may identify "social networks" of devices having a common interest based on location data, which may be linked back to a business database for analysis purposes.

13. The present system may provide for identification of mobile devices, which may identify a "social network" based on a single selected location entered into the system or based on multiple locations autonomously generated by system analysis.

Function 8. Detection of household relocation

Function 8 detects when the user has changed home based on the behavior of the user's respective mobile device. People moving homes often have many commonalities that are useful to data science; thus, the present system includes functionality to build an audience based on the migration date and/or the migration location.

Relocation detection builds on the household matching method (HHMM). In some embodiments, relocation detection matches devices to addresses monthly in order to detect month-to-month changes. Every day, a list of the devices seen that day is generated during data intake, and the HHMM reads in this list. The HHMM determines the address for the month and compares it to the address for the previous month. A user is identified as having moved if the monthly address of a given device is a new address, different from the household associated with the device, in two months of any three-month span.

FIG. 8 is a flow chart illustrating a method of detecting a relocation. For each device being analyzed, relocation detection reads all observations of that device from the database at step 810. In some embodiments, the observations span slightly more than 12 months of past observations, but less than 13 months. Relocation detection is performed on the relevant devices on a regular basis. In step 820, a filter-and-clean step is applied to the observations. The filter removes observations that are duplicates by latitude/longitude and observations that were likely obtained during transit (on the road). At step 830, relocation detection identifies any previous change-of-address records. At step 840, relocation detection identifies the last month resolved in the change-of-address records and updates the analysis using that month as the starting month.

In step 850, the HHMM is applied to the observations within each given month. First, the observation data is divided by month, because people who relocate, especially tenants, usually move at the end or beginning of a month. For each month including and following the starting month, relocation detection applies the HHMM and finds the best match (if any). It is possible that a month returns no likely household match, in which case the results for that month are discarded. In some embodiments, the current month is considered before it has ended. If there is a best match for the month, relocation detection records it as a provisional address.

A given address is identified via clustering. A filter retains clusters that satisfy the following: 1. The radius is not too large. In some embodiments, clusters larger than 100 meters are not good candidates for a home. 2. There are sufficient observations. In some embodiments, more than 200 observations in the cluster are sufficient. 3. The cluster is close to a spatio-temporal (time-space) cluster. 4. The ratio of weekday observations to weekend observations is good. If a cluster has predominantly weekday observations, it is unlikely that someone lives there; more likely, they work there.

Clusters are further scored based on: 1. What fraction of all observations falls within the cluster? A larger fraction is better. 2. The weekday/weekend ratio. Lower values (more weekend than weekday observations in the cluster) give higher scores.
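The cluster filter and score can be sketched as below. The 100-meter and 200-observation limits come from the text; the `max_ratio` cutoff, the way the two score components are combined, and all names are assumptions, and the spatio-temporal proximity criterion is not modeled.

```python
from dataclasses import dataclass


@dataclass
class Cluster:
    radius_m: float
    weekday_obs: int
    weekend_obs: int

    @property
    def size(self):
        return self.weekday_obs + self.weekend_obs

    @property
    def weekday_weekend_ratio(self):
        # Guard against division by zero when there are no weekend observations.
        return self.weekday_obs / max(self.weekend_obs, 1)


def keep_cluster(cluster, max_radius_m=100.0, min_obs=200, max_ratio=5.0):
    """Filter: small enough, enough observations, not weekday-dominated.
    `max_ratio` is a hypothetical cutoff for a 'good' weekday/weekend ratio."""
    return (cluster.radius_m <= max_radius_m
            and cluster.size > min_obs
            and cluster.weekday_weekend_ratio <= max_ratio)


def score_cluster(cluster, total_obs):
    """Score: larger share of all observations is better, and a lower
    weekday/weekend ratio is better (assumed combination of the two)."""
    fraction = cluster.size / total_obs
    return fraction / (1.0 + cluster.weekday_weekend_ratio)


home = Cluster(radius_m=40, weekday_obs=300, weekend_obs=200)
work = Cluster(radius_m=60, weekday_obs=450, weekend_obs=10)
print(keep_cluster(home), keep_cluster(work), score_cluster(home, 1000))
```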

For each cluster, the HHMM determines a list of nearby residential addresses. Home matches are scored based on how close the home is to the cluster and on a comparison with the property lines. The address score is higher if the observations in the cluster are closer to the home and within the property lines. A given home is selected as the monthly address based on the cluster and home match that give the best combined score. In some embodiments, a threshold score is needed to match an address to the month.

In step 860, relocation detection identifies the new address. The history of previous and newly created changes of address is analyzed to find new addresses and remove duplicate entries. In some embodiments, a change of address occurs when the device matches a new address in two of three months. At the same time, the date on which a given device next needs to be analyzed is calculated. Devices with fewer observations may be allowed to be analyzed more frequently, while devices with more observations need to be analyzed less frequently. In some embodiments, all devices are analyzed for relocation detection at least once every 30 days.
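The two-of-three rule can be sketched as a sliding window over the per-month address matches. This is one possible reading of the rule, not the definitive algorithm: the function name `detect_moves`, the input format, and the sample data (loosely modeled on a device that settles at address A, briefly matches B and C, then moves to D) are hypothetical.

```python
def detect_moves(monthly_addresses, current_address=None):
    """Return [(month, address)] for every address change.

    `monthly_addresses` is a chronological list of (month, address) pairs,
    where address is None for months without a household match. A change
    is recorded when a new address appears in at least two months of the
    three-month span that starts with its first appearance."""
    moves = []
    for i, (month, candidate) in enumerate(monthly_addresses):
        if candidate is None or candidate == current_address:
            continue
        window = monthly_addresses[i:i + 3]
        matches = sum(1 for _, address in window if address == candidate)
        if matches >= 2:
            moves.append((month, candidate))
            current_address = candidate
    return moves


history = [
    ("2019-01", "A"), ("2019-02", "A"), ("2019-03", "A"),
    ("2019-04", "B"), ("2019-05", "C"),
    ("2019-06", "D"), ("2019-07", None), ("2019-08", "D"),
]
print(detect_moves(history))
```

In this sample the move to D is dated to June but only becomes detectable once the August match arrives, which is the lag the rule implies.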

In some embodiments, the change-of-address record is stored in a database as a set of data fields, each of which can be queried by a database search engine. These records allow queries to be filtered by the date of the move and by where the device moved from and to. The data categories include: i. device identifier, ii. move date, iii. previous zip code, iv. new zip code, v. previous address, and vi. new address. The date on which a given device needs to be re-analyzed is also stored in the database.

Example queries include:

Find all devices that moved into zip codes (1, 2, 3) in June 2019.

Find all devices that left zip codes (1, 2, 3) in June 2019.

Find all devices that moved from zip codes (1, 2, 3) to zip codes (8, 9, 10) during the period from January 2019 to June 2019.

Find the zip codes with the most move-ins/move-outs in June 2019.
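Two of the example queries are sketched below against an in-memory SQLite table. The table name `address_change`, its column names, the ISO date strings, and the sample rows are all hypothetical; only the six data categories themselves come from the description.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE address_change ("
    "device_id TEXT, move_date TEXT, prev_zip TEXT, new_zip TEXT, "
    "prev_address TEXT, new_address TEXT)"
)
conn.executemany(
    "INSERT INTO address_change VALUES (?, ?, ?, ?, ?, ?)",
    [
        ("dev1", "2019-06-01", "80201", "80301", "1 Old St", "2 New St"),
        ("dev2", "2019-06-01", "80301", "80401", "3 Elm St", "4 Oak St"),
        ("dev3", "2019-03-01", "80201", "80301", "5 Ash St", "6 Fir St"),
    ],
)


def moved_into(conn, zips, start, end):
    """Devices that moved into any of `zips` with start <= move_date < end."""
    marks = ",".join("?" * len(zips))
    rows = conn.execute(
        f"SELECT device_id FROM address_change "
        f"WHERE new_zip IN ({marks}) AND move_date >= ? AND move_date < ? "
        f"ORDER BY device_id",
        (*zips, start, end),
    )
    return [row[0] for row in rows]


def busiest_move_in_zip(conn, start, end):
    """Zip code with the most move-ins in the period, with its count."""
    return conn.execute(
        "SELECT new_zip, COUNT(*) AS n FROM address_change "
        "WHERE move_date >= ? AND move_date < ? "
        "GROUP BY new_zip ORDER BY n DESC, new_zip LIMIT 1",
        (start, end),
    ).fetchone()


print(moved_into(conn, ["80301"], "2019-06-01", "2019-07-01"))
print(busiest_move_in_zip(conn, "2019-01-01", "2019-07-01"))
```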

Fig. 9 is a table showing an example of relocation detection. The table shows two address changes. The first change was to address "A" in January 2019 (based on January through March), and the second was to address "D" in June 2019 (based on June through August). Matches with addresses B and C are discarded because the device did not match them enough times, or the matching months were not close enough together, to satisfy the two-out-of-three rule. The move in June 2019 could not be determined until sometime in August 2019.

In some embodiments, there is no fixed time at which relocation detection can identify a move. When there is "enough" data to make the decision, relocation detection assigns the device to the home address. "Enough" means that the device has sufficient observation data for the HHMM to assign the device to a residential location, and that this assignment is better than any other choice it could make. In some embodiments, "enough" is hundreds of observations, and the amount of time a device takes to produce that many observations depends largely on the device and the applications on the device. Some devices require a full month to create this much data, while others may do so in one or two days.

In theory, a change of address can be detected in two months and one day: the first two months plus one day of the third month, where one of the first two months matches the "third month." Once enough observations are collected at a given home, the monthly address of the third month can be identified. While it is unlikely that enough observations will be collected on the first day of the month, it is theoretically possible. In some embodiments, there is a lower limit on the number of days of a given month that must elapse before the monthly address is calculated.

Overview of an exemplary computer System

Embodiments of the present technology include various steps and operations that have been described above. Various of these steps and operations may be performed by hardware components or may be embodied in machine-executable instructions, which may be used to cause a general-purpose or special-purpose processor programmed with the instructions to perform the steps. Alternatively, the steps may be performed by a combination of hardware, software, and/or firmware. Accordingly, FIG. 10 is an example of a computer system 1000 that can utilize embodiments of the present technology. Computer system 1000 is an example of a means for performing functions and performing several of the operations described above. According to this example, the computer system includes a bus 1010, at least one processor 1020, at least one communication port 1030, a main memory 1040, a removable storage medium 1050, a read-only memory 1060, and a mass storage 1070.

Processor 1020 may be any known processor, such as, but not limited to, a processor from any of various processor families. The communication port 1030 may be an RS-232 port for use with a modem-based dial-up connection, a 10/100 Ethernet port, or a Gigabit port using copper or fiber. The communication port 1030 may be selected according to a network, such as a Local Area Network (LAN), a Wide Area Network (WAN), or any network to which the computer system 1000 is connected.

The main memory 1040 may be a Random Access Memory (RAM) or any other dynamic storage device known in the art. The read only memory 1060 may be any static storage device such as a Programmable Read Only Memory (PROM) chip that stores static information such as instructions for the processor 1020.

Mass storage 1070 may be used to store information and instructions. For example, hard disks (such as a family of SCSI drives), optical disks, an array of disks (such as the Adaptec family of RAID drives), or any other mass storage device may be used.

The bus 1010 communicatively couples the processor 1020 to other memory, storage, and communication blocks. The bus 1010 may be a PCI/PCI-X or SCSI based system bus depending on the storage device used.

Removable storage media 1050 may be any type of external hard disk drive, floppy disk drive, solid state storage drive, cloud storage system, Zip drive, compact disk read-only memory (CD-ROM), compact disk rewritable (CD-RW), and/or digital video disk read-only memory (DVD-ROM).

The above components are meant to illustrate some types of possibilities. The above examples should in no way limit the scope of the present technology as they are merely exemplary embodiments.

Embodiments of the present technology may be implemented using a combination of one or more modules or engines. For example, embodiments provide a "graphical user interface generation module" that generates one or more graphical user interface screens to convey results/information and receive instructions, a general or specialized "communications module" for interfacing with various components and databases, a "data collection module" for collecting information from various sources, an "anonymization module" for anonymizing data, a "rating module" for assessing household matching quality, a "linking module" for linking addresses to mobile devices, a "social graph module" for grouping devices based on one or more spatial and temporal analyses, a "reporting module" for generating device and location reports, and other modules and engines for providing the various functions required by embodiments of the present technology. Further, various embodiments may combine two or more of these modules into a single module and/or associate some of the functionality of one or more of these modules with different modules. Each of these modules and engines provides an example of a means for performing the functions and operations described herein.

Various modifications and additions may be made to the discussed embodiments without departing from the scope of the present technology. For example, while the embodiments described above refer to particular features, the scope of the present technology also includes embodiments having different combinations of features, as well as embodiments that do not include all of the described features. Accordingly, the scope of the present technology is intended to embrace all such alternatives, modifications, and variations and all equivalents thereof.

Claims

1. A method of building a location social network, comprising:

collecting location data from a set of mobile devices a plurality of times over a period of time, each time including a timestamp;

detecting, based on the location data, a first mobile device and a second mobile device in the set of mobile devices that are co-located with each other at the same time;

incrementing a counter associated with the first mobile device and the second mobile device based on the detecting; and

in response to the counter exceeding a threshold amount, generating a link between the first mobile device and the second mobile device in a social network.
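By way of non-limiting illustration, the counting and linking steps of claim 1 may be sketched as follows. The function name `build_links`, the tuple layout, and the pre-computed `place_id` and `time_bucket` keys are assumptions introduced only for this sketch; they stand in for whatever co-location and same-time tests a given embodiment applies.

```python
from collections import defaultdict
from itertools import combinations


def build_links(observations, threshold):
    """Return device pairs whose co-location count exceeds the threshold.

    observations: iterable of (device_id, place_id, time_bucket) tuples.
    place_id and time_bucket are hypothetical keys chosen for this sketch.
    """
    # Group the devices observed at the same place in the same time bucket.
    present = defaultdict(set)
    for device_id, place_id, time_bucket in observations:
        present[(place_id, time_bucket)].add(device_id)

    # Increment one counter per unordered device pair per co-location event.
    counters = defaultdict(int)
    for devices in present.values():
        for a, b in combinations(sorted(devices), 2):
            counters[(a, b)] += 1

    # Generate a link only when the counter exceeds the threshold.
    return {pair for pair, count in counters.items() if count > threshold}


observations = [
    ("dev1", "cafe", 9), ("dev2", "cafe", 9),
    ("dev1", "gym", 18), ("dev2", "gym", 18),
    ("dev1", "cafe", 33), ("dev2", "cafe", 33), ("dev3", "cafe", 33),
]
links = build_links(observations, threshold=2)
```

In this example only the pair (dev1, dev2) is linked, because it is the only pair observed together more than twice.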

2. The method of claim 1, further comprising:

identifying media content delivered to a first respective user of the first mobile device; and

delivering corresponding media content to respective users of mobile devices in the set of mobile devices having a link with the first mobile device, including a second respective user of the second mobile device.

3. The method of claim 1, further comprising:

classifying the links in the social network into tiers based on an amount of the counter, wherein progressively larger threshold amounts associated with the counter correspond to progressively higher tiers.
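One possible way to realize the tiering of claim 3 is sketched below. The function name `link_tier` and the example threshold values are assumptions made for illustration; the claim itself does not specify them. The sketch treats a tier as the number of ascending thresholds that the counter strictly exceeds.

```python
from bisect import bisect_left


def link_tier(count, tier_thresholds):
    """Return the tier of a link as the number of thresholds the count exceeds.

    tier_thresholds must be sorted in ascending order. Tier 0 means the
    counter has not exceeded even the lowest threshold, so no link exists.
    """
    # bisect_left counts the thresholds that are strictly less than count.
    return bisect_left(tier_thresholds, count)


thresholds = [3, 10, 30]  # hypothetical values
tiers = [link_tier(c, thresholds) for c in (3, 4, 10, 11, 100)]
```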

4. The method of claim 3, further comprising:

identifying first media content delivered to a first respective user of the first mobile device; and

delivering varying media content, based on the first media content, to respective users of mobile devices in the set of mobile devices having a link with the first mobile device, including a second respective user of the second mobile device.

5. The method of claim 1, wherein being co-located is based on any of:

the first mobile device and the second mobile device being within a threshold distance of each other;

the first mobile device and the second mobile device both being located within a predetermined bounded area; or

the first mobile device and the second mobile device both being connected to the same local wireless network.
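The three alternative tests recited in claim 5 may be illustrated by the following sketch. The dictionary keys (`lat`, `lon`, `wifi`), the 50 meter default, the rectangular representation of the bounded area, and the use of the haversine great-circle distance are all assumptions chosen for this example rather than requirements of the claim.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in meters


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two latitude/longitude points."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def in_box(fix, box):
    """True if the fix lies inside a (south, west, north, east) rectangle."""
    south, west, north, east = box
    return south <= fix["lat"] <= north and west <= fix["lon"] <= east


def co_located(a, b, max_distance_m=50.0, area=None):
    """Apply the three alternative tests; any one of them is sufficient."""
    # Test 1: the devices are within a threshold distance of each other.
    if haversine_m(a["lat"], a["lon"], b["lat"], b["lon"]) <= max_distance_m:
        return True
    # Test 2: both devices lie within the same predetermined bounded area.
    if area is not None and in_box(a, area) and in_box(b, area):
        return True
    # Test 3: both devices report the same local wireless network.
    ssid_a, ssid_b = a.get("wifi"), b.get("wifi")
    return ssid_a is not None and ssid_a == ssid_b


near_a = {"lat": 40.0000, "lon": -105.0}
near_b = {"lat": 40.0001, "lon": -105.0}   # roughly 11 m north of near_a
far_b = {"lat": 41.0000, "lon": -105.0}    # roughly 111 km north of near_a
```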

6. The method of claim 1, wherein being at the same time is based on the timestamps being:

rounded to the nearest hour; or

within a threshold time of each other.
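The two same-time tests of claim 6 may be sketched as follows. The function names, the `mode` parameter, and the 10 minute default tolerance are assumptions introduced for this illustration only.

```python
from datetime import datetime, timedelta


def round_to_nearest_hour(ts):
    """Round a timestamp to the nearest whole hour (half hours round up)."""
    floor = ts.replace(minute=0, second=0, microsecond=0)
    if ts - floor >= timedelta(minutes=30):
        return floor + timedelta(hours=1)
    return floor


def same_time(t1, t2, mode="hour", tolerance=timedelta(minutes=10)):
    """Compare two timestamps under one of the two alternative tests."""
    if mode == "hour":
        # Timestamps match when they round to the same hour.
        return round_to_nearest_hour(t1) == round_to_nearest_hour(t2)
    # Otherwise the timestamps match when they differ by at most the tolerance.
    return abs(t1 - t2) <= tolerance


t_a = datetime(2019, 9, 25, 9, 29)
t_b = datetime(2019, 9, 25, 9, 31)
```

The two tests can disagree: the timestamps 9:29 and 9:31 are two minutes apart, so they satisfy the threshold test, but they round to different hours and therefore fail the rounding test.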

7. The method of claim 1, wherein being co-located is based on both the first mobile device and the second mobile device being within a predetermined bounded area, and the incrementing the counter is further based on no more than a maximum threshold number of mobile devices in the set of mobile devices being within the predetermined bounded area at the same time.

8. The method of claim 1, further comprising:

receiving a search query for the first mobile device; and

in response to the search query, returning a list of mobile devices in the set of mobile devices linked to the first mobile device, the list of mobile devices including the second mobile device.

9. The method of claim 8, wherein the returning additionally includes, in the list of mobile devices, all devices that are within a number of links of the first mobile device, the number being specified in the search query.
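The search of claims 8 and 9 may be illustrated, under stated assumptions, by a breadth-first traversal of the link graph. The adjacency-dictionary representation, the function name `linked_devices`, and the reading of "a specified number of links away" as "within at most that many links" are assumptions made for this sketch.

```python
from collections import deque


def linked_devices(graph, start, max_links=1):
    """List the devices reachable from start within max_links links.

    graph: dict mapping a device id to the set of directly linked device ids.
    """
    seen = {start}
    result = []
    queue = deque([(start, 0)])
    while queue:
        device, depth = queue.popleft()
        if depth == max_links:
            continue  # do not expand beyond the requested number of links
        for neighbor in sorted(graph.get(device, ())):
            if neighbor not in seen:
                seen.add(neighbor)
                result.append(neighbor)
                queue.append((neighbor, depth + 1))
    return result


graph = {
    "dev1": {"dev2"},
    "dev2": {"dev1", "dev3"},
    "dev3": {"dev2", "dev4"},
    "dev4": {"dev3"},
}
direct = linked_devices(graph, "dev1")
two_hops = linked_devices(graph, "dev1", max_links=2)
```

With `max_links=1` the result contains only directly linked devices, which corresponds to claim 8; larger values extend the list to devices linked through intermediaries.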

10. A method of constructing a location social network, comprising:

collecting a set of location coordinates from a set of mobile devices through cooperation with a mobile device application;

associating the location coordinates with points of interest based on mapping metadata; and

generating, in a social graph, a link between respective users of a first mobile device and a second mobile device of the set of mobile devices based on simultaneous physical presence of the first and second mobile devices at a same point of interest during a same period of time.

11. The method of claim 10, further comprising:

classifying the link between the respective users of the first and second mobile devices, or the respective users of the first and second mobile devices themselves, based on characteristics of the same point of interest.

12. The method of claim 11, wherein the classification is any one of:

a friend;

a family member;

a co-worker;

a friend of a friend; or

any combination thereof.

13. The method of claim 10, further comprising:

classifying the link between the respective users of the first and second mobile devices, or the respective users of the first and second mobile devices themselves, based on characteristics of the same period of time.

14. The method of claim 12, wherein the link is classified as a co-worker when the same time is a weekday afternoon and as a friend when the same time is on a weekend.
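The time-based classification of claim 14 may be sketched as follows. The function name `classify_by_time` and the definition of "afternoon" as 12:00 to 18:00 local time are assumptions made for this example; the claim does not define the boundaries.

```python
from datetime import datetime


def classify_by_time(ts):
    """Classify a co-presence event by when it occurred.

    Returns "friend" for weekends, "co-worker" for weekday afternoons, and
    None when the time alone does not suggest a classification.
    """
    if ts.weekday() >= 5:  # Monday is 0, so 5 and 6 are Saturday and Sunday
        return "friend"
    if 12 <= ts.hour < 18:  # assumed afternoon window
        return "co-worker"
    return None


weekday_afternoon = classify_by_time(datetime(2019, 9, 25, 14, 0))  # a Wednesday
weekend = classify_by_time(datetime(2019, 9, 28, 14, 0))            # a Saturday
weekday_morning = classify_by_time(datetime(2019, 9, 25, 8, 0))
```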

15. The method of claim 10, further comprising:

receiving a search query comprising the first mobile device, a first point of interest, and a time period; and

in response to the search query, returning a list of mobile devices in the set of mobile devices that are linked to the first mobile device and that were at the first point of interest during the time period.

16. A system for building a location social network, comprising:

a processor; and

a memory comprising instructions that, when executed, cause the processor to:

increment a counter associated with respective users of the first mobile device and the second mobile device based on the detecting; and

in response to the counter exceeding a threshold amount, generate a link in a social network between the respective users of the first mobile device and the second mobile device.

17. The system of claim 16, wherein being at the same location is based on any of:

the first mobile device and the second mobile device are both located within a predetermined bounded area; or

18. The system of claim 16, wherein being at the same time is based on the timestamps being:

rounded to the nearest hour; or

within a threshold time of each other.

19. The system of claim 16, wherein co-location is based on the first mobile device and the second mobile device both being within a predetermined bounded area, and incrementing the counter is further based on no more than a maximum threshold number of mobile devices in the set of mobile devices being within the predetermined bounded area at the same time.

20. The system of claim 16, wherein the system further comprises:

a user interface console configured to receive a search query for the first mobile device and return a list of mobile devices in the set of mobile devices linked to the first mobile device, the list of mobile devices including the second mobile device.

21. The system of claim 16, wherein the memory further includes instructions that cause the processor to identify media content delivered to a first respective user of the first mobile device, the system further comprising:

a network server configured to deliver corresponding media content to respective users of mobile devices of the set of mobile devices having a link with the first mobile device, including a second respective user of the second mobile device.