WO2013126281A1 - Systems and methods for putative cluster analysis - Google Patents

Systems and methods for putative cluster analysis

Info

Publication number
WO2013126281A1
Authority
WO
WIPO (PCT)
Prior art keywords
cluster
distinct data
connections
data points
clusters
Prior art date
Application number
PCT/US2013/026343
Other languages
English (en)
Inventor
Johannes Philippus de Villiers PRICHARD
David Alan Bayliss
Original Assignee
Lexisnexis Risk Solutions Fl Inc.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Lexisnexis Risk Solutions Fl Inc. filed Critical Lexisnexis Risk Solutions Fl Inc.
Priority to US13/848,850 priority Critical patent/US9412141B2/en
Publication of WO2013126281A1 publication Critical patent/WO2013126281A1/fr
Priority to US15/202,099 priority patent/US10438308B2/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201 Market modelling; Market analysis; Collecting market data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201 Market modelling; Market analysis; Collecting market data
    • G06Q30/0204 Market segmentation

Definitions

  • Bayliss I U.S. Patent No. 7,403,942 to Bayliss, David et al.
  • Bayliss II US Patent Application Serial No. 10/357,489 to Bayliss, David et al.
  • Various embodiments of the systems and methods described herein relate to data mining and, more particularly, to systems and methods for efficiently mining data to identify collusion, fraud, and organized groups of entities.
  • Applications for exploiting collected data include, but are not limited to: national security; law enforcement; immigration and border control; locating missing persons and property; firearms tracking; civil and criminal investigations; person and property location and verification; governmental and agency record handling; entity searching and location; package delivery; telecommunications; consumer related applications; credit reporting, scoring, and/or evaluating; debt collection; entity identification verification; account establishment, scoring and monitoring; fraud detection; health industry (patient record maintenance); biometric and other forms of authentication; insurance and risk management; marketing, including direct to consumer marketing; human resources/employment; and financial/banking industries.
  • the applications may span an enterprise or agency or extend across multiple agencies, businesses, industries, etc.
  • various embodiments of the disclosed technology may include putative cluster analysis systems and methods for identifying various connected entities and organizations.
  • an analytical system may be provided that includes a database, a clustering unit, a scoring unit, and a filtering unit.
  • Certain implementations of the disclosed technology may include systems, methods, and computer-readable media for identifying connected organizations in a collection of distinct data points.
  • a method is provided for determining, from a collection of records comprising a plurality of distinct data points, connections between one or more of the plurality of distinct data points.
  • the method includes identifying, from the plurality of distinct data points, a plurality of clusters, each of the clusters comprising a cluster centroid, each cluster centroid comprising a distinct data point, wherein each cluster comprises the determined connections between the one or more of the plurality of distinct data points and the cluster centroid.
  • the method further includes identifying cluster connections among the plurality of clusters, scoring the cluster connections based on predetermined criteria, and identifying one or more of the distinct data points associated with the scored cluster connections.
  • a system may include one or more processors and at least one memory in communication with the one or more processors.
  • the at least one memory may include an operating system, a database, a clustering unit, and a scoring unit.
  • the memory in communication with the one or more processors may be configured for storing data and instructions which, when executed by the at least one processor under control of the operating system, enable the system to determine, from a collection of records in the database, wherein the collection of records comprises a plurality of distinct data points, connections between one or more of the plurality of distinct data points.
  • the instructions, when executed by the at least one processor under control of the operating system, may further identify, by the clustering unit, and from the plurality of distinct data points, a plurality of clusters, each of the clusters comprising a cluster centroid, each cluster centroid comprising a distinct data point, wherein each cluster comprises the determined connections between the one or more of the plurality of distinct data points and the cluster centroid.
  • the instructions, when executed by the at least one processor under control of the operating system, may further identify, by the clustering unit, cluster connections among the plurality of clusters; score, by the scoring unit, the cluster connections based on predetermined criteria; and identify one or more of the distinct data points associated with the scored cluster connections.
  • computer-readable media may be provided for performing a method.
  • the method includes determining, from a collection of records comprising a plurality of distinct data points, connections between one or more of the plurality of distinct data points.
  • the method includes identifying, from the plurality of distinct data points, a plurality of clusters, each of the clusters comprising a cluster centroid, each cluster centroid comprising a distinct data point, wherein each cluster comprises the determined connections between the one or more of the plurality of distinct data points and the cluster centroid.
  • the method further includes identifying cluster connections among the plurality of clusters, scoring the cluster connections based on predetermined criteria, and identifying one or more of the distinct data points associated with the scored cluster connections.
  • the database may store a plurality of records to be analyzed.
  • Each record may include data related to an entity or transaction.
  • a record may include data related to a real estate purchase, an insurance claim, or an income tax return.
  • the putative cluster analysis system may be directed to identify organizations related to a single industry. In that case, each record in the database, for the purpose of the putative cluster analysis system, may be related to that single industry. For example, if an embodiment of the putative cluster analysis system is directed to identifying insurance fraud, then various records may be related to insurance claims.
  • Some embodiments of the disclosed technology may include a database.
  • Other example embodiments of the disclosed technology may include systems and/or methods for accessing a database or other collection of data to be analyzed.
  • the clustering unit may group the various records into distinct, putative clusters.
  • the term "putative clusters" as discussed herein may mean groups of records that are supposed, presumed, and/or reputed as having some type of a connection to one another, no matter how tenuous that connection may prove to be in actuality.
  • each record, or data point, may be deemed the central point of a cluster. For that data point, relatives of that data point may be identified up to a predetermined distance from the central data point, where "distance" between points is predefined and, in some embodiments, relates to a degree of connectivity between data points.
  • the scoring unit may have access to a predetermined feature set, and may be configured to analyze each putative cluster based on the feature set.
  • a direct link exists between each pair of data points with a direct relationship. For example, if a pair of data points represents two real estate transactions with the same seller, then these data points may be connected by a direct link within a cluster. Data points within a cluster may be indirectly connected when the data points are connected by a series of links.
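  • As a rough illustration of the direct links described above, the following sketch links any two real estate transactions that share a seller. The record layout (`id`, `seller`) and the function name `direct_links` are assumptions made for this example and do not come from the disclosure.

```python
from collections import defaultdict
from itertools import combinations


def direct_links(records):
    """Return a set of (id_a, id_b) tuples for records that share a seller.

    Hypothetical helper: the "same seller" rule is only one example of a
    direct relationship between two data points.
    """
    by_seller = defaultdict(list)
    for rec in records:
        by_seller[rec["seller"]].append(rec["id"])

    links = set()
    for ids in by_seller.values():
        # Every pair of transactions with the same seller gets a direct link.
        for a, b in combinations(sorted(ids), 2):
            links.add((a, b))
    return links


# Illustrative records only; not data from the disclosure.
records = [
    {"id": "T1", "seller": "Alice"},
    {"id": "T2", "seller": "Alice"},
    {"id": "T3", "seller": "Bob"},
]
links = direct_links(records)
```

  • Indirect connections would then correspond to paths made of several such links.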
  • the scoring unit may analyze the attributes of the various links or data points in the cluster to provide a score with respect to the feature in question.
  • each data point or each link may be assigned a score for each feature.
  • the cluster as a whole may be assigned a total score comprising a combination of the scores of the various features applicable to the cluster.
  • the total score may be one of various combinations calculated from the feature scores, such as, for example, a sum, a weighted sum, or another formula based on the various features.
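  • A minimal sketch of the sum and weighted-sum combinations mentioned above. The feature names and weights are hypothetical and chosen only for illustration.

```python
def total_score(feature_scores, weights=None):
    """Combine per-feature scores into a single cluster score.

    With no weights this is a plain sum; otherwise it is a weighted sum.
    Features missing from `weights` contribute nothing (an assumption).
    """
    if weights is None:
        return sum(feature_scores.values())
    return sum(weights.get(name, 0.0) * value
               for name, value in feature_scores.items())


# Hypothetical feature scores for one putative cluster.
scores = {"flip_count": 3.0, "default_count": 1.0}
plain = total_score(scores)
weighted = total_score(scores, {"flip_count": 2.0, "default_count": 0.5})
```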
  • the filtering unit may filter the putative clusters into real clusters and false clusters, where the real ones will be deemed to be those of interest for potential collusion.
  • the filtering unit may utilize a predetermined algorithm for separating the clusters into two groups based on the results of the scoring.
  • the algorithm may include a filter that significantly reduces the data set by selecting a subset of the putative clusters to deem real clusters.
  • the algorithm may be embodied in various forms according to certain embodiments.
  • the algorithm may examine the result of the scoring for each feature, and may select a subset of the clusters based on the various feature scores.
  • the filtering unit may have a target score, and real clusters may be those that meet a criterion, e.g., greater than, less than, with respect to that score for the combination of feature scores.
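  • The target-score filter described above might be sketched as follows. The comparison is passed in as a parameter because the text allows either "greater than" or "less than" criteria; the cluster identifiers and scores are made up for the example.

```python
import operator


def filter_clusters(cluster_scores, target, criterion=operator.ge):
    """Split putative clusters into (real, rejected) sets of cluster ids.

    A cluster is deemed "real" when criterion(score, target) holds.
    """
    real = {cid for cid, score in cluster_scores.items()
            if criterion(score, target)}
    rejected = set(cluster_scores) - real
    return real, rejected


# Hypothetical total scores per cluster.
cluster_scores = {"c1": 6.5, "c2": 1.0, "c3": 4.0}
real, rejected = filter_clusters(cluster_scores, target=4.0)
low, _ = filter_clusters(cluster_scores, target=4.0, criterion=operator.lt)
```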
  • the putative cluster analysis system may calculate a set of putative clusters and filter those putative clusters into a set of high-interest real clusters.
  • FIG. 1 illustrates an analytics method according to an example implementation of the disclosed technology.
  • FIG. 2 illustrates a putative cluster evidencing connectedness between entities represented in the cluster and sub-clusters.
  • FIG. 3 illustrates a representative computer architecture, according to an example embodiment of the disclosed technology.
  • FIG. 4 illustrates a diagram of potentially fraudulent transactions identified by an example implementation of the disclosed technology during a test analysis.
  • FIG. 5 is a flow-diagram of a method, according to an example embodiment of the disclosed technology.
  • Example systems and methods described herein may utilize various forms of data to identify connected entities and/or organizations. Certain embodiments of the disclosed technology may provide improved accuracy over conventional data mining and putative cluster analysis systems and techniques. For example, insurance companies and other industries attempting to identify fraud may utilize conventional focused analysis techniques that examine each event in isolation. The conventional techniques typically utilize high thresholds to filter the large number of events to be analyzed. In other words, because the data that entities must analyze with conventional techniques is so large, a high degree of suspicious activity may be required in order to identify fraud. Without a high threshold, conventional techniques may have too many potentially fraudulent events to investigate. As a result, entities using conventional techniques often overlook collusion from groups that are able to stay below these high thresholds with respect to certain suspicious activities.
  • the putative cluster analysis systems and methods disclosed herein may perform one or more of the following tasks: drive improved investigative and due diligence workflows; evaluate and segment loan files to identify notable risks; identify non-obvious relationships between entities, within and external to loan transactions; expose key perpetrators to improve remediation and recourse opportunities; augment existing fraud detection and scoring models during origination and loan pool acquisition; and enhance internal fraud and risk controls with a flexible pattern selection process.
  • the putative cluster analysis system may start with a large quantity of data and group that data into smaller, distinct clusters.
  • the proximity of seemingly low risk activity within each cluster may be measured using lower thresholds than is reasonably possible in the methods used by conventional systems.
  • the putative cluster analysis system may identify potentially organized groups without having to apply low thresholds to the large amounts of data as a whole.
  • high interest clusters may be identified from a plurality of data.
  • High interest clusters may represent connected organizations, entities, and/or people.
  • the putative cluster analysis system disclosed herein may rely upon relatively large amounts of data to measure proximity of seemingly low risk events commonly associated with high risk activities to detect potentially fraudulent activities.
  • a domain of entities may be identified for analysis. For example, data associated with a large number (perhaps millions) of property deeds may be gathered for analysis.
  • the associated data may include identities of individuals, organizations, companies, etc., that are associated with the deeds.
  • the associated data may include information such as addresses, mortgage lenders, names of law firms, dates of transactions, etc.
  • one or more types of relationships between the entities may then be collected.
  • a non-partitioning clustering algorithm may be utilized to form clusters for each of the domain entities, wherein copies of the domain entity may be created, as required, for populating clusters associated with neighboring clusters.
  • a filtering mechanism may operate against the clusters and may retain those clusters that have outlying behavior.
  • Such filtering may conventionally utilize graph or network analysis, and queries/filtering of this form may utilize sub-graph matching routines or fuzzy sub-graph matching.
  • sub-graph matching routines or fuzzy sub-graph matching techniques may be NP-complete, and thus impractical for analyzing large sets of data.
  • the most notable characteristic of NP-complete problems is that no fast solution to them is known. That is, the time required to solve the problem using any currently known algorithm increases very quickly as the size of the problem grows.
  • Embodiments of the disclosed technology may be utilized to provide clusters and connections between entities even though the set of data analyzed may be extremely large.
  • entities may be identified and may include people, companies, places, objects, virtual identities, etc.
  • relationships may be formed in many ways, and with many qualities. For example, co-occurrence of values in common database fields may be utilized, such as the same last name. Relationships may also be formed using multiple co-occurrences of an entity with one or more other properties, such as people who have lived at two or more addresses. Relationships may also be formed based on a high reoccurrence and/or frequency of a common relationship, according to an example embodiment. For example, records of person X sending an email to person Y greater than N times may indicate a relationship between person X and person Y.
  • if person X sends an email to or receives an email from person Y, and person Z sends an email to or receives an email from person Y, a relationship may be implied between person X and person Z.
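  • The frequency-based and shared-contact relationships in the email examples above can be sketched as below. The function names, the undirected treatment of email pairs, and the sample log are assumptions for illustration only.

```python
from collections import Counter
from itertools import combinations


def frequent_pairs(emails, n):
    """Undirected pairs of people who exchanged more than n emails."""
    counts = Counter(frozenset(e) for e in emails if e[0] != e[1])
    return {pair for pair, count in counts.items() if count > n}


def implied_pairs(pairs):
    """Pairs of people who share at least one common contact."""
    contacts = {}
    for pair in pairs:
        a, b = tuple(pair)
        contacts.setdefault(a, set()).add(b)
        contacts.setdefault(b, set()).add(a)

    implied = set()
    for others in contacts.values():
        # Any two people who both relate to the same contact are implied to relate.
        for a, b in combinations(sorted(others), 2):
            implied.add(frozenset((a, b)))
    return implied


# Hypothetical (sender, recipient) email log.
emails = [("X", "Y"), ("Y", "X"), ("X", "Y"),
          ("Z", "Y"), ("Y", "Z"), ("Z", "Y"),
          ("X", "W")]
direct = frequent_pairs(emails, n=2)
implied = implied_pairs(direct)
```

  • In this sketch X and Z never email each other, yet they end up related through their common contact Y.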
  • relationships between entities may comprise Boolean, weighted, directed, undirected, and/or combinations of multiple relationships.
  • clustering of the entities may rely on relationship steps.
  • entities may be related by at least two different relationship types.
  • relationships for the clustering may be established by examining weights or strengths of connections between entities in certain directions and conditional upon other relationships, including temporal relationships. For example, in one embodiment, the directional relationships between entities X, Y, and Z may be examined and the connection between X, Y, and Z may be followed if the link between Y and Z was established (in time) after the link between X and Y.
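  • One way to picture the temporal condition above: a directed chain X to Y to Z is followed only when the Y-to-Z link was established after the X-to-Y link. The timestamp representation (plain integers keyed by directed pair) is an assumption of this sketch.

```python
def follow_chain(links, x, y, z):
    """Return True if X->Y and Y->Z both exist and Y->Z came later in time.

    `links` maps a directed (source, target) pair to the time at which the
    link was established (hypothetical integer timestamps).
    """
    t_xy = links.get((x, y))
    t_yz = links.get((y, z))
    return t_xy is not None and t_yz is not None and t_yz > t_xy


# Hypothetical directed links with establishment times.
links = {("X", "Y"): 10, ("Y", "Z"): 20, ("Y", "Q"): 5}
followed = follow_chain(links, "X", "Y", "Z")
not_followed = follow_chain(links, "X", "Y", "Q")
```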
  • clusters may be scored.
  • a threshold may be utilized to identify clusters of interest.
  • a model may be utilized to compute a number of statistics on each cluster.
  • the model may be as simple as determining counts.
  • the model may detect relationships within a cluster, for example, entities that are related to the centroid of the cluster that are also related to each other. This analysis may provide a measure of cohesiveness of relationships that exist inside the cluster.
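  • A simple cohesiveness statistic of the kind described above is the fraction of pairs of the centroid's relatives that are also related to each other (the local clustering coefficient from graph theory). The disclosure does not name a specific measure, so this choice, like the adjacency data, is an assumption.

```python
from itertools import combinations


def cohesiveness(adjacency, centroid):
    """Fraction of pairs of the centroid's neighbours that are themselves linked."""
    neighbours = sorted(adjacency.get(centroid, set()))
    pairs = list(combinations(neighbours, 2))
    if not pairs:
        return 0.0
    linked = sum(1 for a, b in pairs if b in adjacency.get(a, set()))
    return linked / len(pairs)


# Hypothetical undirected cluster around centroid "C".
adjacency = {
    "C": {"A", "B", "D"},
    "A": {"C", "B"},
    "B": {"C", "A"},
    "D": {"C"},
}
score = cohesiveness(adjacency, "C")
```

  • Here only one of the three possible neighbour pairs (A and B) is linked, so the score is one third.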
  • scoring and weighting of each cluster may be utilized to determine which clusters rise above a particular threshold, and may be classified as "interesting."
  • scoring and weighting of the determined statistics may be accomplished using a heuristic scoring model, such as linear regression, neural network analysis, etc.
  • An example analytics method may be implemented by a putative cluster analysis system 100, as illustrated in FIG. 1. It will be understood that the method illustrated herein is provided for illustrative purposes only and does not limit the scope of the disclosed technology.
  • the putative cluster analysis system 100 may receive a plurality of data 102 to be analyzed.
  • the data may be processed 104, and output 106 may be generated.
  • the data may include identities and property deeds 108.
  • the data may also include information 110, for example, that may include data related to a bank portfolio.
  • the system 100 may receive the data 102 in its various forms (which may include identities, property deeds, portfolios, etc.), and may process 104 the data 102 to derive relationships 112 and perform analytics 114.
  • the relationships 112 and analytics 114 may be used to determine particular attributes 116.
  • the attributes 116 may include one or more of the following: property status; property deed transfer history; buyer history; and/or the previous seller's cluster activity.
  • the determined attributes 116 may go through a scoring and filtering process 118, which may result in an output 106 that may include one or more primary attributes 120, features 122, and risk segmentation 124.
  • the primary attributes 120 may include entity and property characteristics, such as suspicious deeds, associations with businesses and other entities, seller address history, etc.
  • the features 122 may be derived from aggregating characteristics such as store code deeds, defaults, transfer activity, etc. In one example embodiment, such features 122 may be derived by combining primary attributes 120.
  • the risk segmentation 124 may be utilized to augment current scoring models.
  • the clustering unit of the putative cluster analysis system 100 may treat each data point in the data as a centroid of its own cluster.
  • the total number of clusters may be equal to the total number of data points, and each cluster may be uniquely represented by its centroid data point.
  • the distance between the centroid and any data point within each cluster may be limited, such that the clusters are limited in size and, for some analyses, may be treated as being disconnected from one another.
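  • The centroid-per-data-point clustering described above can be pictured with a breadth-first search bounded by a maximum distance. This is only a sketch that treats "distance" as the number of links; it is not the clustering method of Bayliss I and II, and the graph below is invented for the example.

```python
from collections import deque


def build_cluster(adjacency, centroid, max_distance):
    """Return {node: hops} for every node within max_distance links of the centroid."""
    distance = {centroid: 0}
    queue = deque([centroid])
    while queue:
        node = queue.popleft()
        if distance[node] == max_distance:
            continue  # do not expand beyond the distance limit
        for neighbour in adjacency.get(node, ()):
            if neighbour not in distance:
                distance[neighbour] = distance[node] + 1
                queue.append(neighbour)
    return distance


# Hypothetical chain of linked data points: A - B - C - D.
adjacency = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}

# Every data point is the centroid of its own cluster.
clusters = {node: build_cluster(adjacency, node, max_distance=2)
            for node in adjacency}
```

  • The number of clusters equals the number of data points, and a data point such as B can appear in several clusters, which is what makes the clustering non-partitioning.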
  • An example method of clustering data for the purposes of the example implementation of the disclosed technology of the putative cluster analysis systems and methods is disclosed in Bayliss I and II, which are incorporated herein.
  • scoring and filtering 118 may be applied, for example, to analyze each cluster and assign one or more scores to each cluster.
  • a scoring unit may utilize a predetermined scoring algorithm for scoring some or all of the clusters.
  • the scoring unit may utilize a dynamic scoring algorithm for scoring some or all of the clusters.
  • the scoring algorithm may be based on seemingly low-risk events that tend to be associated with organizations, such as fraud organizations. The algorithm may thus also be based on research into what events tend to be indicative of fraud in the industry or application to which the putative cluster analysis system is directed.
  • each putative cluster may be scored individually. For example, a plurality of predetermined attributes, or variables, may be calculated for each cluster based on the data points in the cluster. For each attribute, the putative cluster as a whole may be considered, or each data point or link between pairs of data points may be considered. An attribute may be evaluated and scored depending on the nature of the attribute.
  • the property status attribute may include one or more of the following: the date of subject property last deed; the sale amount of subject property, the last recorded deed transfer; the number of months subject property was owned by previous owner; and/or the number of potential flip deed transfers (for example, property being owned less than 6 months or having a greater than 10% appreciation, etc.).
  • the property deed transfer history attribute may include one or more of the following information: the previous owner is a member of a network having high volume or suspicious deed transfer activity; the number of properties ever sold by previous owner that then resulted in default; and/or the previous owner's count of historical deed transfers within a network of associates.
  • the buyer history attribute may include one or more of the following information: the number of properties ever owned by the buyer(s); the number of properties ever owned by the buyer(s)' business; and/or the number of properties ever sold by the buyer(s) that resulted in default.
  • the previous seller's cluster activity attribute may include one or more of the following information: buyer(s)' count of historical deed transfers within a network of associates; and/or number of potential flip deed transfers (for example, property being owned less than 6 months or having a greater than 10% appreciation, etc.).
  • These or other features may be integrated into the scoring unit, so as to score the various putative clusters provided by the clustering unit.
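  • The "potential flip" rule quoted in the attributes above (owned less than 6 months, or more than 10% appreciation) could be counted roughly as follows. The tuple layout, the whole-month calculation, and the sample transfers are assumptions of this sketch; purchase prices are assumed to be positive.

```python
from datetime import date


def is_potential_flip(bought_on, sold_on, buy_price, sell_price,
                      max_months=6, min_appreciation=0.10):
    """True if the property was held under max_months or appreciated over the limit."""
    months_owned = ((sold_on.year - bought_on.year) * 12
                    + (sold_on.month - bought_on.month))
    if sold_on.day < bought_on.day:
        months_owned -= 1  # the last month was not completed
    appreciation = (sell_price - buy_price) / buy_price
    return months_owned < max_months or appreciation > min_appreciation


# Hypothetical deed transfers: (bought_on, sold_on, buy_price, sell_price).
transfers = [
    (date(2010, 1, 15), date(2010, 4, 1), 100_000, 105_000),   # held under 6 months
    (date(2008, 1, 15), date(2010, 1, 15), 100_000, 125_000),  # over 10% appreciation
    (date(2005, 1, 15), date(2010, 1, 15), 100_000, 104_000),  # neither
]
flip_count = sum(is_potential_flip(*t) for t in transfers)
```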
  • Core transaction measurements, which may be incorporated into the above list of features, may include velocity, profit, and buyer or seller relationship.
  • the filtering unit may filter the putative clusters and retain those clusters that are deemed to represent real organizations based on the scoring.
  • the putative cluster analysis system may leverage publicly available data, such as property deeds and assessments, which may include several hundred million records.
  • the putative cluster analysis system may also clean and standardize data to reduce the possibility that matching entities are considered as distinct.
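  • As a hedged illustration of the cleaning and standardization step above, the sketch below normalizes entity names so that trivially different spellings collapse to one key. The normalization rules and the sample names are assumptions; real record linkage would need far more than this (for example, handling of non-ASCII characters and abbreviations).

```python
import re


def standardize_name(raw):
    """Lower-case, replace punctuation with spaces, and collapse whitespace."""
    text = raw.lower()
    text = re.sub(r"[^a-z0-9\s]", " ", text)
    return " ".join(text.split())


# Hypothetical variants of the same entity name.
names = ["Acme Holdings, LLC", "ACME  Holdings LLC", "acme holdings llc."]
unique = {standardize_name(n) for n in names}
```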
  • the putative cluster analysis system may use this data to build a large-scale network map of the population in question and its associated flow of property.
  • the putative cluster analysis system may leverage relatively large-scale supercomputing power and analytics to target organized collusion.
  • Example implementations of the disclosed technology of the putative cluster analysis systems and methods may rely upon open-source, large-scale parallel-processing computing platforms to increase the agility and scale of solutions.
  • centroids may be derived from a public database of around fifty terabytes for the U.S. population.
  • a cluster network map may be created with around four hundred million clusters with seventeen billion relationships.
  • Example implementations of the disclosed technology of the putative cluster analysis systems and methods may measure behavior and relationships that traditionally may be used to obscure activities, so as to more actively and effectively expose syndicates and rings of collusion. Unlike many conventional systems, the putative cluster analysis system need not be limited to rings operating in a single geographic location, and it need not be limited to short time periods. Further, the putative cluster analysis system need not be limited to measuring only individually high value transactions, as banks do when identifying potential fraud that they consider to be worth their resources. The putative cluster analysis systems and methods disclosed herein thus may enable investigations to prioritize efforts on organized groups more effectively, rather than investigating individual transactions to determine whether they fall within an organized ring.
  • A list of example attributes is shown in Table 1, below. It will be understood that these attributes are provided for illustrative purposes only and do not limit the scope of the putative cluster analysis systems and methods. Not all of these attributes need be used, and other attributes may be used as well, such as those described above with respect to the attributes 116 in reference to FIG. 1.
  • Table 1. Example attributes (attribute name: description)
  • byr cl in net hi prof: Buyer's Cluster in network high profit transfers
  • byr cl in net hi prof flip cnt: Buyer's Cluster in network high profit flips
  • byr cl flop cnt: Buyer's Cluster flop count
  • the scoring of the clusters may include accessing features, which may be based on research or knowledge about behaviors that suggest collusive activity.
  • each feature may represent a risky activity or characteristic.
  • the features may include the number of automobiles involved in an accident, the number of people injured, the value of vehicles involved in the accident, and/or number and extent of injuries.
  • each feature may be computed for each cluster.
  • a feature, for example, may be calculated as a composite of one or more attributes of the cluster in question. For example, an attribute for detecting mortgage fraud may be "date of last deed transfer." A feature that is based on this attribute may be "whether previous owner is a member of a network that shows high volume or suspicious deed transfer activity." Thus, this feature may be a composite of the "date of last deed transfer" attribute, along with other attributes.
  • the scoring of the clusters may include calculating a score for each cluster, based on the features computed for the cluster. With wisely-chosen features, the resulting score for a cluster may be indicative of the connectedness of the various data points within a cluster.
  • filtering may be utilized to examine the cluster scores and filter, or identify, which putative clusters are real clusters, i.e., represent organized groups of entities. Organized groups may be flagged as being potentially involved in collusion-based fraud.
  • a filter may be utilized to reduce the data set to identify groups that evidence the greatest connectedness based on the scoring algorithm.
  • putative clusters with scores that match a predetermined set of criteria may be flagged for evaluation.
  • filtering may utilize one or more target scores, which may be selected based on the industry, goals of the putative cluster analysis system, or the scoring algorithm.
  • putative clusters having scores greater than or equal to a target score may be flagged as being potentially collusive.
  • in conventional systems, the threshold for identifying fraud may be set very high so as to avoid identifying too many entities for examination.
  • the features and scoring algorithm may be chosen to identify connectedness without the concern that too many individuals will be identified.
  • groups, instead of individuals, may be identified.
  • FIG. 2 illustrates an example putative cluster 200 where certain connectedness between entities may be determined according to the systems and methods disclosed herein.
  • This particular example may be directed toward identifying potential mortgage fraud, and at the centroid of this example putative cluster 200 is a specific first property 202, which may be a house, for example.
  • This particular example is over-simplified for clarity, and it should be realized that such putative clusters in practice may actually contain hundreds of thousands of properties and associated entities having a dense web of connections among the properties, entities, etc.
  • the first property 202 may have certain characteristics (historical or otherwise) associated with it, for example, flipping (i.e., fast turnover), high sales profit, and/or transactions in which parties appeared to be associated with each other even outside of the transaction.
  • a first bank 206 that is considering providing a mortgage on this first property 202 to a potential buyer 204 may have certain visibility to the aforementioned characteristics but, using a conventional fraud-identification system, the bank 206 may not be able to detect the various connections that actually exist.
  • Other properties within the same putative cluster 200 may show similar characteristics: flipping, high sales profit, and relationships between parties.
  • connections between entities may be established based on public record documents, property deeds, etc., and such connections may be represented by lines connecting the entities, property, banks, etc.
  • a potential buyer 204 of a first property 202 may be in communication with a first bank 206.
  • the potential buyer 204, the first property 202, and the first bank 206 may represent a first sub cluster 207.
  • the entire putative cluster 200 may include multiple sub clusters, each established with a property, person, etc., at its particular centroid.
  • the putative cluster 200 of FIG. 2 illustrates a number of sub clusters 207, 208, 209, 212, 214, 226.
  • a particular entity may be at the centroid of its own cluster, and that same particular entity may be duplicated in the putative cluster to show connections with other entities that are set at the centroid of their own cluster.
  • the potential buyer 204 is shown connected to the first sub cluster 207 in which the first property 202 is at the centroid.
  • the potential buyer 204 is also shown in the figure as being the centroid of a second sub cluster 209.
  • the fourth sub cluster 214 includes a second bank 215 at its centroid, and the potential buyer 204 is duplicated and shown as having a connection to the second bank 215.
  • the connection between first and second instances of the potential buyer 204 is represented in this figure by a thick line 205. Focusing now on the fourth sub cluster 214, in which the second bank 215 is at its centroid, we see that a second entity 216 is connected with the potential buyer 204, and with second bank 215.
  • a third entity 218 is connected to the second bank 215 and to the second entity 216.
  • the third entity 218 is a member of the fourth sub cluster 214 and the fifth sub cluster 226. Again, the connection between the duplicated instances of the third entity 218 is signified by the thick line 219.
  • the third entity 218 within the fifth sub cluster 226 is shown as being connected to a fourth entity 220, who is connected to a fifth entity 222. Therefore, according to this example putative cluster 200, a connection may be determined to exist between the potential buyer 204 and the fifth entity 222, and this connection is shown by the dotted line 224.
  • a single property may have changed ownership between multiple entities in the sub cluster, as shown in the first sub-cluster 207, the fifth sub-cluster 226 and the sixth sub-cluster 208.
  • the centroid property 202 being analyzed has been subject to a number of transfers between related entities, which is often an indicator of fraudulent activities. Again, the movement of this property among these various entities would likely be overlooked in a conventional fraud-detection system.
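  • The indirect connection described above between the potential buyer 204 and the fifth entity 222 can be illustrated as a path search over a graph of entities. The following is a minimal sketch, not the disclosed implementation: the node labels, the edge list (transcribed from the FIG. 2 description), and the helper names `build_graph` and `shortest_path` are assumptions made for illustration only.

```python
from collections import deque


def build_graph(edges):
    """Build an undirected adjacency map from (a, b) connection pairs."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def shortest_path(graph, start, goal):
    """Breadth-first search; returns one shortest path as a list, or None."""
    if start not in graph or goal not in graph:
        return None
    previous = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        # Sorted iteration keeps the result deterministic.
        for neighbor in sorted(graph[node]):
            if neighbor not in previous:
                previous[neighbor] = node
                queue.append(neighbor)
    return None


# Hypothetical edge list based on the connections described for FIG. 2.
EDGES = [
    ("buyer_204", "property_202"),
    ("buyer_204", "bank_206"),
    ("buyer_204", "entity_216"),
    ("buyer_204", "bank_215"),
    ("entity_216", "bank_215"),
    ("entity_218", "bank_215"),
    ("entity_218", "entity_216"),
    ("entity_218", "entity_220"),
    ("entity_220", "entity_222"),
]

graph = build_graph(EDGES)
path = shortest_path(graph, "buyer_204", "entity_222")
print(" -> ".join(path))
```

  • In this sketch the buyer and the fifth entity share no direct edge, yet the search still links them through intermediate entities, which is the kind of relationship the dotted line 224 represents.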
  • FIG. 3 depicts a block diagram of an illustrative computer system architecture 300 according to an example implementation of the disclosed technology.
  • Various implementations and methods herein may be embodied in non-transitory computer readable media for execution by a processor. It will be understood that the architecture 300 is provided for example purposes only and does not limit the scope of the various implementations of the communication systems and methods.
  • the architecture 300 of FIG. 3 includes a central processing unit (CPU) 302, where computer instructions are processed, and a display interface 304 that acts as a communication interface and provides functions for rendering video, graphics, images, and text on the display.
  • the display interface 304 may be directly connected to a local display.
  • the display interface 304 may be configured for providing data, images, and other information for an external/remote display or computer that is not necessarily connected to the particular CPU 302.
  • the display interface 304 may wirelessly communicate, for example, via a Wi-Fi channel or other available network connection interface 312 to an external/remote display.
  • the architecture 300 may include a keyboard interface 306 that provides a communication interface to a keyboard; and a pointing device interface 308 that provides a communication interface to a pointing device, mouse, and/or touch screen.
  • Example implementations of the architecture 300 may include an antenna interface 310 that provides a communication interface to an antenna, and a network connection interface 312 that provides a communication interface to a network.
  • the display interface 304 may be in communication with the network connection interface 312, for example, to provide information for display on a remote display that is not directly connected or attached to the system.
  • a camera interface 314 may be provided that may act as a communication interface and/or provide functions for capturing digital images from a camera.
  • a sound interface 316 is provided as a communication interface for converting sound into electrical signals using a microphone and for converting electrical signals into sound using a speaker.
  • a random access memory (RAM) 318 is provided, where computer instructions and data may be stored in a volatile memory device for processing by the CPU 302.
  • the architecture 300 includes a read-only memory (ROM) 320 where invariant low-level system code or data for basic system functions such as basic input and output (I/O), startup, or reception of keystrokes from a keyboard are stored in a non-volatile memory device.
  • the architecture 300 includes a storage medium 322 or other suitable type of memory (e.g.
  • the application programs 326 may include putative clustering instructions for organizing, storing, retrieving, comparing, and/or analyzing the various connections associated with the properties and entities associated with embodiments of the disclosed technology.
  • the putative cluster analysis system, the clustering unit, and/or the scoring unit may be embodied, at least in part, via the application programs 326 interacting with data from the ROM 320 or other memory storage medium 322, and may be enabled by interaction with the operating system 324 via the CPU 302 and bus 334.
  • the architecture 300 includes a power source 330 that provides an appropriate alternating current (AC) or direct current (DC) to power components.
  • the architecture 300 may include a telephony subsystem 332 that allows the device 300 to transmit and receive sound over a telephone network.
  • the constituent devices and the CPU 302 communicate with each other over a bus 334.
  • the CPU 302 has appropriate structure to be a computer processor.
  • the computer CPU 302 may include more than one processing unit.
  • the RAM 318 interfaces with the computer bus 334 to provide quick RAM storage to the CPU 302 during the execution of software programs such as the operating system, application programs, and device drivers. More specifically, the CPU 302 loads computer-executable process steps from the storage medium 322 or other media into a field of the RAM 318 in order to execute software programs. Data may be stored in the RAM 318, where the data may be accessed by the computer CPU 302 during execution.
  • the device 300 includes at least 128 MB of RAM, and 256 MB of flash memory.
  • the storage medium 322 itself may include a number of physical drive units, such as a redundant array of independent disks (RAID), a floppy disk drive, a flash memory, a USB flash drive, an external hard disk drive, thumb drive, pen drive, key drive, a High-Density Digital Versatile Disc (HD-DVD) optical disc drive, an internal hard disk drive, a Blu-Ray optical disc drive, or a Holographic Digital Data Storage (HDDS) optical disc drive, an external mini-dual inline memory module (DIMM) synchronous dynamic random access memory (SDRAM), or an external micro-DIMM SDRAM.
  • Such computer readable storage media allow the device 300 to access computer-executable process steps, application programs and the like, stored on removable and non-removable memory media, to off-load data from the device 300 or to upload data onto the device 300.
  • a computer program product, such as one utilizing a communication system, may be tangibly embodied in storage medium 322, which may comprise a machine-readable storage medium.
  • FIG. 4 illustrates a Venn diagram 400 of potentially fraudulent real estate transactions that may be identified, categorized, and/or flagged by putative cluster analysis, according to certain example embodiments of the disclosed technology.
  • the putative cluster analysis system may identify high-risk transactions that are performed within a network of associates that involve flipped properties and that result in high profits.
  • the term "flipping" is used herein to describe purchasing a revenue-generating asset and quickly reselling it for profit. This term is frequently used as a descriptive term for legal real estate investing strategies that are perceived by some to be unethical or socially destructive. Certain embodiments of the disclosed technology may be applied for sensing schemes involving market manipulation and other illegal conduct, including potentially collusive behavior.
  • the Venn diagram 400 of FIG. 4 illustrates the overlap of certain attributes that may be determined from a number of transactions involving certain properties.
  • related entities that are identified as being in the same network 402 may comprise a subset of transactions.
  • Certain transactions may be flagged as extracting high profit 404, and other transactions may be characterized as flipping or flopping 406.
  • flipping or flopping 406 may have the characteristic of a purchase, followed by a sale within a short period of time after the purchase.
  • Certain flipping or flopping 406 transactions may have low profit, and certain transactions may have a high profit.
  • the overlap (designated by the letter Y) of flipping or flopping 406 transactions with those that are high profit 404 may provide loan profiles with the characteristic of loan files that were flipped and resulted in high profit gains 410.
  • the overlap (designated by the letter W) of high profit 404 transactions with in-network 402 transactions may provide loan profiles with a high profit gain having no flip 408.
  • the overlap (designated by the letter X) of in-network 402 transactions and flipping or flopping 406 transactions may provide a loan profile with flip flops that are not high profit 412.
  • the overlap of the in-network 402, the flipping or flopping 406, and the high profit 404 transactions may be designated (by the letter Z) as having the characteristic of in-cluster loans that were flipped and had high profit gains 414.
  • such overlap of characteristics, attributes, data, etc., may be utilized to identify potential collusion within a network that may otherwise be very difficult to detect.
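  • The overlap regions W, X, Y, and Z of the Venn diagram 400 can be expressed as set operations over transaction identifiers. The sketch below is illustrative only: the function name `categorize`, the transaction IDs, and the choice to treat W, X, and Y as pairwise-only regions that exclude the triple overlap Z are assumptions, not details given in the description.

```python
def categorize(in_network, high_profit, flipped):
    """Split transaction IDs into the four overlap regions of FIG. 4.

    Assumption: W, X, and Y are pairwise-only overlaps, so each excludes
    transactions that also fall in the three-way overlap Z.
    """
    return {
        "W": (in_network & high_profit) - flipped,   # high profit, no flip (408)
        "X": (in_network & flipped) - high_profit,   # flipped, not high profit (412)
        "Y": (high_profit & flipped) - in_network,   # flipped with high profit (410)
        "Z": in_network & high_profit & flipped,     # in cluster, flipped, high profit (414)
    }


# Hypothetical transaction IDs for illustration.
in_network = {"T1", "T2", "T3", "T4"}
high_profit = {"T2", "T4", "T5", "T6"}
flipped = {"T3", "T4", "T6", "T7"}

regions = categorize(in_network, high_profit, flipped)
for label in sorted(regions):
    print(label, sorted(regions[label]))
```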
  • the method 500 starts in block 502, and according to an example implementation includes determining, from a collection of records comprising a plurality of distinct data points, connections between one or more of the plurality of distinct data points.
  • the method 500 includes identifying, from the plurality of distinct data points, a plurality of clusters, each of the clusters comprising a cluster centroid, each cluster centroid comprising a distinct data point, wherein each cluster comprises the determined connections between the one or more of the plurality of distinct data points and the cluster centroid.
  • the method 500 includes identifying cluster connections among the plurality of clusters.
  • the method 500 includes scoring the cluster connections based on predetermined criteria.
  • the method 500 includes identifying one or more of the distinct data points associated with the scored cluster connections.
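  • The five steps of the method 500 can be sketched end to end as follows. This is a simplified illustration under stated assumptions, not the claimed implementation: records are assumed to be pairs of connected data points, a cluster is assumed to be a centroid plus its direct connections, two clusters are assumed to be connected when they share a member, and the score (the number of shared members) and the threshold stand in for the unspecified "predetermined criteria". All function names are hypothetical.

```python
from itertools import combinations


def determine_connections(records):
    """Step 1: derive connections between data points from the records."""
    connections = {}
    for a, b in records:
        connections.setdefault(a, set()).add(b)
        connections.setdefault(b, set()).add(a)
    return connections


def identify_clusters(connections):
    """Step 2: one cluster per data point, with that point as the centroid."""
    return {c: {c} | neighbors for c, neighbors in connections.items()}


def identify_cluster_connections(clusters):
    """Step 3: link clusters that share at least one data point."""
    links = {}
    for c1, c2 in combinations(sorted(clusters), 2):
        shared = clusters[c1] & clusters[c2]
        if shared:
            links[(c1, c2)] = shared
    return links


def score_cluster_connections(links):
    """Step 4: score each link (assumed criterion: count of shared points)."""
    return {pair: len(shared) for pair, shared in links.items()}


def identify_data_points(links, scores, threshold):
    """Step 5: collect data points behind links scoring at or above threshold."""
    flagged = set()
    for pair, score in scores.items():
        if score >= threshold:
            flagged |= links[pair]
    return flagged


# Hypothetical records: each pair is a connection found in the collection.
records = [("A", "B"), ("B", "C"), ("C", "A"), ("C", "D")]

connections = determine_connections(records)
clusters = identify_clusters(connections)
links = identify_cluster_connections(clusters)
scores = score_cluster_connections(links)
flagged = identify_data_points(links, scores, threshold=3)
print(sorted(flagged))
```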
  • the identified ringleader was not listed on any of the deeds of flipped properties, but could be identified by the test putative cluster analysis system by indirect connections with the flipped properties, and by other metrics disclosed herein.
  • Example implementations of the disclosed technology may be able to detect criminal activities that would not likely be identified if the involved individual or individuals intentionally avoid the type of behavior and connections that would be identifiable by conventional means.
  • Certain implementations of the disclosed technology of the putative cluster analysis systems and methods may be used to identify potential organizations of health insurance fraud, such as Medicaid fraud.
  • the input data to the clustering unit may be derived from historical address history of a population to be examined and such address history may be used to link individuals based on, for example, familial, residential, and business relationships.
  • the clustering unit may then take this input data and output clusters for use by the scoring unit.
  • some features considered for the scoring algorithm with respect to health insurance fraud may include: (1) the number of people within a cluster who lived in expensive residences, owned expensive property, or drove expensive cars; (2) the number of insurance recipients within the cluster who are contacts of medical providers; (3) the number of medical businesses associated with people in the cluster; (4) the number of people in the cluster currently receiving benefits; and/or (5) the number of recipients associated with excluded providers.
  • These features may enable the putative cluster system to identify, among others, clusters that contain dense groups of recipients who appear to be colluding and transferring knowledge of how to claim Medicaid benefits and bypass eligibility requirements, as well as clusters that have close ties to medical providers who have the knowledge and means to defraud Medicaid.
  • the putative cluster analysis system may consider the following features to identify potential drug-seeking behavior: (1) prescription filling distance deviation; and (2) watchlist drug prescriptions. Such features may enable the putative cluster system to identify, among others, clusters that include patients who deviate when filling prescriptions for certain watchlist drugs, as well as clusters that include providers and prescribers with patterns of prescribing to the drug-seeking clusters.
  • certain technical effects can be provided, such as creating certain systems and methods that are able to identify an entity that is connected to various other entities evidencing suspicious behavior.
  • Embodiments of the disclosed technology may be utilized to examine related data in addition to data that is indicative of whether or not an individual entity is an active recipient of health insurance.
  • the putative cluster analysis system disclosed herein may also consider other recipients in an individual's cluster, which may be indicative of collusion.
  • the above or other features may be integrated into the scoring unit, so as to score the various clusters provided by the clustering unit.
  • the filtering unit may then filter out those clusters that are deemed to represent real organizations based on the scoring.
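  • The interaction between the scoring unit and the filtering unit for the health insurance example might be sketched as a weighted sum over the five listed features, followed by a threshold test. The description does not specify a scoring formula, so the weights, the threshold, the field names, and the functions `score_cluster` and `filter_clusters` below are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ClusterFeatures:
    """Counts for the five health insurance features listed above."""
    expensive_asset_holders: int
    recipients_linked_to_providers: int
    medical_businesses: int
    current_recipients: int
    recipients_with_excluded_providers: int


# Hypothetical integer weights; the description gives no actual values.
WEIGHTS = {
    "expensive_asset_holders": 2,
    "recipients_linked_to_providers": 3,
    "medical_businesses": 2,
    "current_recipients": 1,
    "recipients_with_excluded_providers": 4,
}


def score_cluster(features):
    """Weighted sum of the feature counts (assumed scoring rule)."""
    return sum(w * getattr(features, name) for name, w in WEIGHTS.items())


def filter_clusters(clusters, threshold):
    """Return the names of clusters scoring at or above the threshold."""
    return sorted(
        name for name, feats in clusters.items()
        if score_cluster(feats) >= threshold
    )


clusters = {
    "cluster_a": ClusterFeatures(4, 3, 2, 10, 2),
    "cluster_b": ClusterFeatures(0, 0, 0, 3, 0),
}
selected = filter_clusters(clusters, threshold=20)
print(selected)
```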
  • Example implementations of the disclosed putative cluster analysis systems and methods may also be used to identify potential organizations of automobile insurance fraud.
  • automobile insurance fraud may include multiple victims, expensive vehicles, or multiple injuries.
  • some features considered for the scoring algorithm with respect to automobile insurance fraud may include: (1) the number of involved parties; (2) the number of claimants requiring medical treatment; (3) individual claim amounts; (4) vehicle damage; and (5) makes or models of involved automobiles. Analysis of these features, according to an example embodiment, may enable the putative cluster system to identify, among others, clusters that have a high number of collective claims with low standard deviation of claim counts, as well as clusters that have a statistically higher number of claims with soft tissue injuries, multiple passengers, low vehicle damage, or common passengers across multiple claims in the cluster.
  • the above or other features may be integrated into the scoring unit, so as to score the various clusters provided by the clustering unit.
  • the filtering unit may then filter out those clusters that are deemed to represent real organizations based on the scoring.
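  • One of the automobile insurance signals mentioned above, a high number of collective claims with a low standard deviation of claim counts, can be checked with basic statistics. The sketch below assumes that "claim counts" means the number of claims per cluster member; the function name and both thresholds are hypothetical.

```python
from statistics import pstdev


def has_uniform_high_volume(claim_counts, min_total, max_stdev):
    """Flag a cluster whose members file many claims at similar rates.

    claim_counts: claims per cluster member (assumed interpretation).
    min_total, max_stdev: hypothetical thresholds.
    """
    if not claim_counts:
        return False
    total = sum(claim_counts)
    spread = pstdev(claim_counts)  # population standard deviation
    return total >= min_total and spread <= max_stdev


# Twelve claims spread evenly across four members.
even_cluster = [3, 3, 3, 3]
# Twelve claims concentrated in a single member.
skewed_cluster = [12, 0, 0, 0]

print(has_uniform_high_volume(even_cluster, min_total=10, max_stdev=0.5))
print(has_uniform_high_volume(skewed_cluster, min_total=10, max_stdev=0.5))
```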
  • Example implementations of the disclosed putative cluster analysis systems and methods may also be used to identify potential organizations involved in tax fraud.
  • some features considered for the scoring algorithm with respect to tax fraud may include: (1) a significant change in income between tax years; (2) a significant increase in deductions; (3) a change in filing status; (4) a change in number or nature of dependents; and (5) the number of self-employed individuals in the cluster.
  • These or other features may be integrated into the scoring unit, so as to score the various clusters provided by the clustering unit.
  • the filtering unit may then filter out those clusters that are deemed to represent real organizations based on the scoring.
  • Various embodiments of the putative cluster analysis systems and methods may be embodied, in whole or in part, in a computer program product stored on non-transitory computer-readable media for execution by one or more processors.
  • various aspects of the disclosed technology such as the clustering unit, the scoring unit, and the filtering unit, may comprise hardware or software of a computer system, as discussed above with respect to FIG. 3.
  • although these units may be discussed herein as being distinct from one another, they may be implemented in various ways. The distinctions between them throughout this disclosure are made for illustrative purposes only, based on operational distinctiveness.
  • putative cluster analysis systems and methods need not be limited to those above.
  • an example implementation of the putative cluster analysis system may be used to identify potential fraud related to credit card applications, identity theft, investments, and various other fraud types that might involve an organization of connected entities.

Abstract

Certain embodiments of the disclosed technology may include systems, methods, and computer-readable media for identifying connected organizations from a collection of records. A method is provided for determining, from a collection of records comprising a plurality of distinct data points, connections between one or more of the plurality of distinct data points. A method is provided for identifying, from the plurality of distinct data points, a plurality of clusters, each of the clusters comprising a cluster centroid, each cluster centroid comprising a distinct data point, wherein each cluster comprises the determined connections between the one or more of the plurality of distinct data points and the cluster centroid. The method further includes identifying cluster connections among the plurality of clusters, scoring the cluster connections based on predetermined criteria, and identifying one or more of the distinct data points associated with the scored cluster connections.
PCT/US2013/026343 2003-02-04 2013-02-15 Systems and methods for putative cluster analysis WO2013126281A1 (fr)

Priority Applications (2)

Application Number Priority Date Filing Date Title
US13/848,850 US9412141B2 (en) 2003-02-04 2013-03-22 Systems and methods for identifying entities using geographical and social mapping
US15/202,099 US10438308B2 (en) 2003-02-04 2016-07-05 Systems and methods for identifying entities using geographical and social mapping

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US201261603068P 2012-02-24 2012-02-24
US61/603,068 2012-02-24

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
US13/541,092 Continuation-In-Part US8549590B1 (en) 2003-02-04 2012-07-03 Systems and methods for identity authentication using a social network

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US13/848,850 Continuation-In-Part US9412141B2 (en) 2003-02-04 2013-03-22 Systems and methods for identifying entities using geographical and social mapping

Publications (1)

Publication Number Publication Date
WO2013126281A1 (fr) 2013-08-29

Family

ID=49006122

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2013/026343 WO2013126281A1 (fr) 2003-02-04 2013-02-15 Systems and methods for putative cluster analysis

Country Status (1)

Country Link
WO (1) WO2013126281A1 (fr)

Cited By (58)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8788405B1 (en) 2013-03-15 2014-07-22 Palantir Technologies, Inc. Generating data clusters with customizable analysis strategies
US8855999B1 (en) 2013-03-15 2014-10-07 Palantir Technologies Inc. Method and system for generating a parser and parsing complex data
US8930897B2 (en) 2013-03-15 2015-01-06 Palantir Technologies Inc. Data integration tool
US9009827B1 (en) 2014-02-20 2015-04-14 Palantir Technologies Inc. Security sharing system
US9021260B1 (en) 2014-07-03 2015-04-28 Palantir Technologies Inc. Malware data item analysis
US9043894B1 (en) 2014-11-06 2015-05-26 Palantir Technologies Inc. Malicious software detection in a computing system
US9202249B1 (en) 2014-07-03 2015-12-01 Palantir Technologies Inc. Data item clustering and analysis
US9202178B2 (en) 2014-03-11 2015-12-01 Sas Institute Inc. Computerized cluster analysis framework for decorrelated cluster identification in datasets
US9230280B1 (en) 2013-03-15 2016-01-05 Palantir Technologies Inc. Clustering data based on indications of financial malfeasance
US9367872B1 (en) 2014-12-22 2016-06-14 Palantir Technologies Inc. Systems and user interfaces for dynamic and interactive investigation of bad actor behavior based on automatic clustering of related data in various data structures
US9424337B2 (en) 2013-07-09 2016-08-23 Sas Institute Inc. Number of clusters estimation
US9454785B1 (en) 2015-07-30 2016-09-27 Palantir Technologies Inc. Systems and user interfaces for holistic, data-driven investigation of bad actor behavior based on clustering and scoring of related data
US9456000B1 (en) 2015-08-06 2016-09-27 Palantir Technologies Inc. Systems, methods, user interfaces, and computer-readable media for investigating potential malicious communications
US9535974B1 (en) 2014-06-30 2017-01-03 Palantir Technologies Inc. Systems and methods for identifying key phrase clusters within documents
US9552615B2 (en) 2013-12-20 2017-01-24 Palantir Technologies Inc. Automated database analysis to detect malfeasance
US9785773B2 (en) 2014-07-03 2017-10-10 Palantir Technologies Inc. Malware data item analysis
US9817563B1 (en) 2014-12-29 2017-11-14 Palantir Technologies Inc. System and method of generating data points from one or more data stores of data items for chart creation and manipulation
US9875293B2 (en) 2014-07-03 2018-01-23 Palanter Technologies Inc. System and method for news events detection and visualization
US9898509B2 (en) 2015-08-28 2018-02-20 Palantir Technologies Inc. Malicious activity detection system capable of efficiently processing data accessed from databases and generating alerts for display in interactive user interfaces
US9898528B2 (en) 2014-12-22 2018-02-20 Palantir Technologies Inc. Concept indexing among database of documents using machine learning techniques
US9965937B2 (en) 2013-03-15 2018-05-08 Palantir Technologies Inc. External malware data item clustering and analysis
US10120857B2 (en) 2013-03-15 2018-11-06 Palantir Technologies Inc. Method and system for generating a parser and parsing complex data
CN108985950A (zh) * 2018-07-13 2018-12-11 平安科技(深圳)有限公司 电子装置、用户骗保风险预警方法及存储介质
US10230746B2 (en) 2014-01-03 2019-03-12 Palantir Technologies Inc. System and method for evaluating network threats and usage
US10235461B2 (en) 2017-05-02 2019-03-19 Palantir Technologies Inc. Automated assistance for generating relevant and valuable search results for an entity of interest
US10275778B1 (en) 2013-03-15 2019-04-30 Palantir Technologies Inc. Systems and user interfaces for dynamic and interactive investigation based on automatic malfeasance clustering of related data in various data structures
US10318630B1 (en) 2016-11-21 2019-06-11 Palantir Technologies Inc. Analysis of large bodies of textual data
US10325224B1 (en) 2017-03-23 2019-06-18 Palantir Technologies Inc. Systems and methods for selecting machine learning training data
US10356032B2 (en) 2013-12-26 2019-07-16 Palantir Technologies Inc. System and method for detecting confidential information emails
US10362133B1 (en) 2014-12-22 2019-07-23 Palantir Technologies Inc. Communication data processing architecture
US10482382B2 (en) 2017-05-09 2019-11-19 Palantir Technologies Inc. Systems and methods for reducing manufacturing failure rates
US10489391B1 (en) 2015-08-17 2019-11-26 Palantir Technologies Inc. Systems and methods for grouping and enriching data items accessed from one or more databases for presentation in a user interface
US10552994B2 (en) 2014-12-22 2020-02-04 Palantir Technologies Inc. Systems and interactive user interfaces for dynamic retrieval, analysis, and triage of data items
US10572496B1 (en) 2014-07-03 2020-02-25 Palantir Technologies Inc. Distributed workflow system and database with access controls for city resiliency
US10572487B1 (en) 2015-10-30 2020-02-25 Palantir Technologies Inc. Periodic database search manager for multiple data sources
US10579647B1 (en) 2013-12-16 2020-03-03 Palantir Technologies Inc. Methods and systems for analyzing entity performance
US10593004B2 (en) 2011-02-18 2020-03-17 Csidentity Corporation System and methods for identifying compromised personally identifiable information on the internet
US10592982B2 (en) 2013-03-14 2020-03-17 Csidentity Corporation System and method for identifying related credit inquiries
US10606866B1 (en) 2017-03-30 2020-03-31 Palantir Technologies Inc. Framework for exposing network activities
US10620618B2 (en) 2016-12-20 2020-04-14 Palantir Technologies Inc. Systems and methods for determining relationships between defects
US10699028B1 (en) 2017-09-28 2020-06-30 Csidentity Corporation Identity security architecture systems and methods
US10719527B2 (en) 2013-10-18 2020-07-21 Palantir Technologies Inc. Systems and user interfaces for dynamic and interactive simultaneous querying of multiple data stores
US10838987B1 (en) 2017-12-20 2020-11-17 Palantir Technologies Inc. Adaptive and transparent entity screening
US10896472B1 (en) 2017-11-14 2021-01-19 Csidentity Corporation Security and identity verification system and architecture
US10911234B2 (en) 2018-06-22 2021-02-02 Experian Information Solutions, Inc. System and method for a token gateway environment
US10909617B2 (en) 2010-03-24 2021-02-02 Consumerinfo.Com, Inc. Indirect monitoring and reporting of a user's credit data
US10990979B1 (en) 2014-10-31 2021-04-27 Experian Information Solutions, Inc. System and architecture for electronic fraud detection
US11030562B1 (en) 2011-10-31 2021-06-08 Consumerinfo.Com, Inc. Pre-data breach monitoring
US11074641B1 (en) 2014-04-25 2021-07-27 Csidentity Corporation Systems, methods and computer-program products for eligibility verification
US11119630B1 (en) 2018-06-19 2021-09-14 Palantir Technologies Inc. Artificial intelligence assisted evaluations and user interface for same
US11120519B2 (en) 2013-05-23 2021-09-14 Consumerinfo.Com, Inc. Digital identity
US11151468B1 (en) 2015-07-02 2021-10-19 Experian Information Solutions, Inc. Behavior analysis using distributed representations of event data
US11157872B2 (en) 2008-06-26 2021-10-26 Experian Marketing Solutions, Llc Systems and methods for providing an integrated identifier
US11164271B2 (en) 2013-03-15 2021-11-02 Csidentity Corporation Systems and methods of delayed authentication and billing for on-demand products
US11232413B1 (en) 2011-06-16 2022-01-25 Consumerinfo.Com, Inc. Authentication alerts
US11288677B1 (en) 2013-03-15 2022-03-29 Consumerlnfo.com, Inc. Adjustment of knowledge-based authentication
US11341178B2 (en) 2014-06-30 2022-05-24 Palantir Technologies Inc. Systems and methods for key phrase characterization of documents
US11941065B1 (en) 2019-09-13 2024-03-26 Experian Information Solutions, Inc. Single identifier platform for storing entity data

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5404561A (en) * 1989-04-19 1995-04-04 Hughes Aircraft Company Clustering and associate processor
US20020107858A1 (en) * 2000-07-05 2002-08-08 Lundahl David S. Method and system for the dynamic analysis of data
US20030212519A1 (en) * 2002-05-10 2003-11-13 Campos Marcos M. Probabilistic model generation
US20060093222A1 (en) * 1999-09-30 2006-05-04 Battelle Memorial Institute Data processing, analysis, and visualization system for use with disparate data types
US20090177589A1 (en) * 1999-12-30 2009-07-09 Marc Thomas Edgar Cross correlation tool for automated portfolio descriptive statistics

Cited By (114)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11769112B2 (en) 2008-06-26 2023-09-26 Experian Marketing Solutions, Llc Systems and methods for providing an integrated identifier
US11157872B2 (en) 2008-06-26 2021-10-26 Experian Marketing Solutions, Llc Systems and methods for providing an integrated identifier
US10909617B2 (en) 2010-03-24 2021-02-02 Consumerinfo.Com, Inc. Indirect monitoring and reporting of a user's credit data
US10593004B2 (en) 2011-02-18 2020-03-17 Csidentity Corporation System and methods for identifying compromised personally identifiable information on the internet
US11232413B1 (en) 2011-06-16 2022-01-25 Consumerinfo.Com, Inc. Authentication alerts
US11954655B1 (en) 2011-06-16 2024-04-09 Consumerinfo.Com, Inc. Authentication alerts
US11030562B1 (en) 2011-10-31 2021-06-08 Consumerinfo.Com, Inc. Pre-data breach monitoring
US11568348B1 (en) 2011-10-31 2023-01-31 Consumerinfo.Com, Inc. Pre-data breach monitoring
US10592982B2 (en) 2013-03-14 2020-03-17 Csidentity Corporation System and method for identifying related credit inquiries
US9135658B2 (en) 2013-03-15 2015-09-15 Palantir Technologies Inc. Generating data clusters
US8855999B1 (en) 2013-03-15 2014-10-07 Palantir Technologies Inc. Method and system for generating a parser and parsing complex data
US9177344B1 (en) 2013-03-15 2015-11-03 Palantir Technologies Inc. Trend data clustering
US9165299B1 (en) 2013-03-15 2015-10-20 Palantir Technologies Inc. User-agent data clustering
US8788407B1 (en) 2013-03-15 2014-07-22 Palantir Technologies Inc. Malware data clustering
US9230280B1 (en) 2013-03-15 2016-01-05 Palantir Technologies Inc. Clustering data based on indications of financial malfeasance
US10120857B2 (en) 2013-03-15 2018-11-06 Palantir Technologies Inc. Method and system for generating a parser and parsing complex data
US10721268B2 (en) 2013-03-15 2020-07-21 Palantir Technologies Inc. Systems and user interfaces for dynamic and interactive investigation based on automatic clustering of related data in various data structures
US10834123B2 (en) * 2013-03-15 2020-11-10 Palantir Technologies Inc. Generating data clusters
US10937034B2 (en) * 2013-03-15 2021-03-02 Palantir Technologies Inc. Systems and user interfaces for dynamic and interactive investigation based on automatic malfeasance clustering of related data in various data structures
US8930897B2 (en) 2013-03-15 2015-01-06 Palantir Technologies Inc. Data integration tool
US8788405B1 (en) 2013-03-15 2014-07-22 Palantir Technologies, Inc. Generating data clusters with customizable analysis strategies
US11164271B2 (en) 2013-03-15 2021-11-02 Csidentity Corporation Systems and methods of delayed authentication and billing for on-demand products
US9171334B1 (en) 2013-03-15 2015-10-27 Palantir Technologies Inc. Tax data clustering
US11288677B1 (en) 2013-03-15 2022-03-29 Consumerinfo.Com, Inc. Adjustment of knowledge-based authentication
US20190205897A1 (en) * 2013-03-15 2019-07-04 Palantir Technologies Inc. Systems and user interfaces for dynamic and interactive investigation based on automatic malfeasance clustering of related data in various data structures
US9965937B2 (en) 2013-03-15 2018-05-08 Palantir Technologies Inc. External malware data item clustering and analysis
US20190166135A1 (en) * 2013-03-15 2019-05-30 Palantir Technologies Inc. Generating data clusters
US11790473B2 (en) 2013-03-15 2023-10-17 Csidentity Corporation Systems and methods of delayed authentication and billing for on-demand products
US10275778B1 (en) 2013-03-15 2019-04-30 Palantir Technologies Inc. Systems and user interfaces for dynamic and interactive investigation based on automatic malfeasance clustering of related data in various data structures
US10264014B2 (en) 2013-03-15 2019-04-16 Palantir Technologies Inc. Systems and user interfaces for dynamic and interactive investigation based on automatic clustering of related data in various data structures
US11775979B1 (en) 2013-03-15 2023-10-03 Consumerinfo.Com, Inc. Adjustment of knowledge-based authentication
US8818892B1 (en) 2013-03-15 2014-08-26 Palantir Technologies, Inc. Prioritizing data clusters with customizable scoring strategies
US10216801B2 (en) 2013-03-15 2019-02-26 Palantir Technologies Inc. Generating data clusters
US11803929B1 (en) 2013-05-23 2023-10-31 Consumerinfo.Com, Inc. Digital identity
US11120519B2 (en) 2013-05-23 2021-09-14 Consumerinfo.Com, Inc. Digital identity
US9424337B2 (en) 2013-07-09 2016-08-23 Sas Institute Inc. Number of clusters estimation
US10719527B2 (en) 2013-10-18 2020-07-21 Palantir Technologies Inc. Systems and user interfaces for dynamic and interactive simultaneous querying of multiple data stores
US10579647B1 (en) 2013-12-16 2020-03-03 Palantir Technologies Inc. Methods and systems for analyzing entity performance
US9552615B2 (en) 2013-12-20 2017-01-24 Palantir Technologies Inc. Automated database analysis to detect malfeasance
US10356032B2 (en) 2013-12-26 2019-07-16 Palantir Technologies Inc. System and method for detecting confidential information emails
US10805321B2 (en) 2014-01-03 2020-10-13 Palantir Technologies Inc. System and method for evaluating network threats and usage
US10230746B2 (en) 2014-01-03 2019-03-12 Palantir Technologies Inc. System and method for evaluating network threats and usage
US10873603B2 (en) 2014-02-20 2020-12-22 Palantir Technologies Inc. Cyber security sharing and identification system
US9009827B1 (en) 2014-02-20 2015-04-14 Palantir Technologies Inc. Security sharing system
US9923925B2 (en) 2014-02-20 2018-03-20 Palantir Technologies Inc. Cyber security sharing and identification system
US9202178B2 (en) 2014-03-11 2015-12-01 Sas Institute Inc. Computerized cluster analysis framework for decorrelated cluster identification in datasets
US11587150B1 (en) 2014-04-25 2023-02-21 Csidentity Corporation Systems and methods for eligibility verification
US11074641B1 (en) 2014-04-25 2021-07-27 Csidentity Corporation Systems, methods and computer-program products for eligibility verification
US11341178B2 (en) 2014-06-30 2022-05-24 Palantir Technologies Inc. Systems and methods for key phrase characterization of documents
US10180929B1 (en) 2014-06-30 2019-01-15 Palantir Technologies, Inc. Systems and methods for identifying key phrase clusters within documents
US9535974B1 (en) 2014-06-30 2017-01-03 Palantir Technologies Inc. Systems and methods for identifying key phrase clusters within documents
US9021260B1 (en) 2014-07-03 2015-04-28 Palantir Technologies Inc. Malware data item analysis
US10929436B2 (en) 2014-07-03 2021-02-23 Palantir Technologies Inc. System and method for news events detection and visualization
US9344447B2 (en) 2014-07-03 2016-05-17 Palantir Technologies Inc. Internal malware data item clustering and analysis
US10798116B2 (en) 2014-07-03 2020-10-06 Palantir Technologies Inc. External malware data item clustering and analysis
US9881074B2 (en) 2014-07-03 2018-01-30 Palantir Technologies Inc. System and method for news events detection and visualization
US9202249B1 (en) 2014-07-03 2015-12-01 Palantir Technologies Inc. Data item clustering and analysis
US9785773B2 (en) 2014-07-03 2017-10-10 Palantir Technologies Inc. Malware data item analysis
US10572496B1 (en) 2014-07-03 2020-02-25 Palantir Technologies Inc. Distributed workflow system and database with access controls for city resiliency
US9875293B2 (en) 2014-07-03 2018-01-23 Palantir Technologies Inc. System and method for news events detection and visualization
US9998485B2 (en) 2014-07-03 2018-06-12 Palantir Technologies, Inc. Network intrusion data item clustering and analysis
US11941635B1 (en) 2014-10-31 2024-03-26 Experian Information Solutions, Inc. System and architecture for electronic fraud detection
US10990979B1 (en) 2014-10-31 2021-04-27 Experian Information Solutions, Inc. System and architecture for electronic fraud detection
US11436606B1 (en) 2014-10-31 2022-09-06 Experian Information Solutions, Inc. System and architecture for electronic fraud detection
US10728277B2 (en) 2014-11-06 2020-07-28 Palantir Technologies Inc. Malicious software detection in a computing system
US10135863B2 (en) 2014-11-06 2018-11-20 Palantir Technologies Inc. Malicious software detection in a computing system
US9043894B1 (en) 2014-11-06 2015-05-26 Palantir Technologies Inc. Malicious software detection in a computing system
US9558352B1 (en) 2014-11-06 2017-01-31 Palantir Technologies Inc. Malicious software detection in a computing system
US11252248B2 (en) 2014-12-22 2022-02-15 Palantir Technologies Inc. Communication data processing architecture
US9367872B1 (en) 2014-12-22 2016-06-14 Palantir Technologies Inc. Systems and user interfaces for dynamic and interactive investigation of bad actor behavior based on automatic clustering of related data in various data structures
US10447712B2 (en) 2014-12-22 2019-10-15 Palantir Technologies Inc. Systems and user interfaces for dynamic and interactive investigation of bad actor behavior based on automatic clustering of related data in various data structures
US10362133B1 (en) 2014-12-22 2019-07-23 Palantir Technologies Inc. Communication data processing architecture
US10552994B2 (en) 2014-12-22 2020-02-04 Palantir Technologies Inc. Systems and interactive user interfaces for dynamic retrieval, analysis, and triage of data items
US9589299B2 (en) 2014-12-22 2017-03-07 Palantir Technologies Inc. Systems and user interfaces for dynamic and interactive investigation of bad actor behavior based on automatic clustering of related data in various data structures
EP3037991A1 (en) * 2014-12-22 2016-06-29 Palantir Technologies, Inc. Systems and user interfaces for dynamic and interactive investigation of bad actor behavior based on automatic clustering of related data in various data structures
US9898528B2 (en) 2014-12-22 2018-02-20 Palantir Technologies Inc. Concept indexing among database of documents using machine learning techniques
US10552998B2 (en) 2014-12-29 2020-02-04 Palantir Technologies Inc. System and method of generating data points from one or more data stores of data items for chart creation and manipulation
US9817563B1 (en) 2014-12-29 2017-11-14 Palantir Technologies Inc. System and method of generating data points from one or more data stores of data items for chart creation and manipulation
US11151468B1 (en) 2015-07-02 2021-10-19 Experian Information Solutions, Inc. Behavior analysis using distributed representations of event data
US10223748B2 (en) 2015-07-30 2019-03-05 Palantir Technologies Inc. Systems and user interfaces for holistic, data-driven investigation of bad actor behavior based on clustering and scoring of related data
US11501369B2 (en) 2015-07-30 2022-11-15 Palantir Technologies Inc. Systems and user interfaces for holistic, data-driven investigation of bad actor behavior based on clustering and scoring of related data
US9454785B1 (en) 2015-07-30 2016-09-27 Palantir Technologies Inc. Systems and user interfaces for holistic, data-driven investigation of bad actor behavior based on clustering and scoring of related data
US9456000B1 (en) 2015-08-06 2016-09-27 Palantir Technologies Inc. Systems, methods, user interfaces, and computer-readable media for investigating potential malicious communications
US9635046B2 (en) 2015-08-06 2017-04-25 Palantir Technologies Inc. Systems, methods, user interfaces, and computer-readable media for investigating potential malicious communications
US10484407B2 (en) 2015-08-06 2019-11-19 Palantir Technologies Inc. Systems, methods, user interfaces, and computer-readable media for investigating potential malicious communications
US10489391B1 (en) 2015-08-17 2019-11-26 Palantir Technologies Inc. Systems and methods for grouping and enriching data items accessed from one or more databases for presentation in a user interface
US11048706B2 (en) 2015-08-28 2021-06-29 Palantir Technologies Inc. Malicious activity detection system capable of efficiently processing data accessed from databases and generating alerts for display in interactive user interfaces
US10346410B2 (en) 2015-08-28 2019-07-09 Palantir Technologies Inc. Malicious activity detection system capable of efficiently processing data accessed from databases and generating alerts for display in interactive user interfaces
US9898509B2 (en) 2015-08-28 2018-02-20 Palantir Technologies Inc. Malicious activity detection system capable of efficiently processing data accessed from databases and generating alerts for display in interactive user interfaces
US10572487B1 (en) 2015-10-30 2020-02-25 Palantir Technologies Inc. Periodic database search manager for multiple data sources
US10318630B1 (en) 2016-11-21 2019-06-11 Palantir Technologies Inc. Analysis of large bodies of textual data
US11681282B2 (en) 2016-12-20 2023-06-20 Palantir Technologies Inc. Systems and methods for determining relationships between defects
US10620618B2 (en) 2016-12-20 2020-04-14 Palantir Technologies Inc. Systems and methods for determining relationships between defects
US10325224B1 (en) 2017-03-23 2019-06-18 Palantir Technologies Inc. Systems and methods for selecting machine learning training data
US11481410B1 (en) 2017-03-30 2022-10-25 Palantir Technologies Inc. Framework for exposing network activities
US11947569B1 (en) 2017-03-30 2024-04-02 Palantir Technologies Inc. Framework for exposing network activities
US10606866B1 (en) 2017-03-30 2020-03-31 Palantir Technologies Inc. Framework for exposing network activities
US11714869B2 (en) 2017-05-02 2023-08-01 Palantir Technologies Inc. Automated assistance for generating relevant and valuable search results for an entity of interest
US10235461B2 (en) 2017-05-02 2019-03-19 Palantir Technologies Inc. Automated assistance for generating relevant and valuable search results for an entity of interest
US11210350B2 (en) 2017-05-02 2021-12-28 Palantir Technologies Inc. Automated assistance for generating relevant and valuable search results for an entity of interest
US10482382B2 (en) 2017-05-09 2019-11-19 Palantir Technologies Inc. Systems and methods for reducing manufacturing failure rates
US11537903B2 (en) 2017-05-09 2022-12-27 Palantir Technologies Inc. Systems and methods for reducing manufacturing failure rates
US11954607B2 (en) 2017-05-09 2024-04-09 Palantir Technologies Inc. Systems and methods for reducing manufacturing failure rates
US11157650B1 (en) 2017-09-28 2021-10-26 Csidentity Corporation Identity security architecture systems and methods
US10699028B1 (en) 2017-09-28 2020-06-30 Csidentity Corporation Identity security architecture systems and methods
US11580259B1 (en) 2017-09-28 2023-02-14 Csidentity Corporation Identity security architecture systems and methods
US10896472B1 (en) 2017-11-14 2021-01-19 Csidentity Corporation Security and identity verification system and architecture
US10838987B1 (en) 2017-12-20 2020-11-17 Palantir Technologies Inc. Adaptive and transparent entity screening
US11119630B1 (en) 2018-06-19 2021-09-14 Palantir Technologies Inc. Artificial intelligence assisted evaluations and user interface for same
US10911234B2 (en) 2018-06-22 2021-02-02 Experian Information Solutions, Inc. System and method for a token gateway environment
US11588639B2 (en) 2018-06-22 2023-02-21 Experian Information Solutions, Inc. System and method for a token gateway environment
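CN108985950A (en) * 2018-07-13 2018-12-11 Ping An Technology (Shenzhen) Co., Ltd. Electronic device, user insurance fraud risk early-warning method, and storage medium
CN108985950B (en) * 2018-07-13 2023-04-18 Ping An Technology (Shenzhen) Co., Ltd. Electronic device, user insurance fraud risk early-warning method, and storage medium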
US11941065B1 (en) 2019-09-13 2024-03-26 Experian Information Solutions, Inc. Single identifier platform for storing entity data

Similar Documents

Publication Publication Date Title
WO2013126281A1 (en) Systems and methods for putative cluster analysis
Gaitonde et al. Interventions to reduce corruption in the health sector
Sithic et al. Survey of insurance fraud detection using data mining techniques
Kirlidog et al. A fraud detection approach with data mining in health insurance
Harrell et al. Victims of identity theft, 2012
Ekin et al. Statistical medical fraud assessment: exposition to an emerging field
Kowshalya et al. Predicting fraudulent claims in automobile insurance
US20140081652A1 (en) Automated Healthcare Risk Management System Utilizing Real-time Predictive Models, Risk Adjusted Provider Cost Index, Edit Analytics, Strategy Management, Managed Learning Environment, Contact Management, Forensic GUI, Case Management And Reporting System For Preventing And Detecting Healthcare Fraud, Abuse, Waste And Errors
CN113994323A (en) Intelligent alert system
US8429050B2 (en) Method for detecting ineligibility of a beneficiary and system
Anbarasi et al. Fraud detection using outlier predictor in health insurance data
WO2022228688A1 (en) Automated fraud monitoring and trigger system for detecting unusual patterns associated with fraudulent activity, and corresponding method
Schmidtlein et al. Disaster declarations and major hazard occurrences in the United States
Reddy et al. Entropic analysis in financial forensics
Yange A Fraud Detection System for Health Insurance in Nigeria
Khurjekar et al. Detection of fraudulent claims using hierarchical cluster analysis
Kajwang Implications for big data analytics on claims fraud management in insurance sector
Skidmore et al. Vulnerability as a driver of the police response to fraud
Luell Employee fraud detection under real world conditions
Power et al. Sharing and analyzing data to reduce insurance fraud
Desi et al. Forensic Accounting, a Veritable Financial Tool for Qualitative Financial Reporting Systems in the 21st Century
Timofeyev et al. Current trends in insurance fraud in Russia: Evidence from a survey of industry experts
Aiken Analyzing proactive fraud detection software tools and the push for quicker Solutions
Shekhar et al. Unsupervised Machine Learning for Explainable Health Care Fraud Detection
Şen et al. Detecting falsified financial statements using data mining: empirical research on finance sector in Turkey

Legal Events

Date Code Title Description
121 Ep: the EPO has been informed by WIPO that EP was designated in this application
Ref document number: 13752433; Country of ref document: EP; Kind code of ref document: A1

NENP Non-entry into the national phase
Ref country code: DE

122 Ep: PCT application non-entry in European phase
Ref document number: 13752433; Country of ref document: EP; Kind code of ref document: A1