US20190243840A1 - Identification and compiling of information relating to an entity - Google Patents
Identification and compiling of information relating to an entity
- Publication number
- US20190243840A1 (application US16/389,300)
- Authority
- US
- United States
- Prior art keywords
- records
- entity
- record
- search
- interface
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce
- G06Q30/02—Marketing; Price estimation or determination; Fundraising
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/25—Integrating or interfacing systems involving database management systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2455—Query execution
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2457—Query processing with adaptation to user needs
- G06F16/24573—Query processing with adaptation to user needs using data annotations, e.g. user-defined metadata
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2457—Query processing with adaptation to user needs
- G06F16/24578—Query processing with adaptation to user needs using ranking
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/248—Presentation of query results
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/06—Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
Definitions
- This disclosure relates to approaches for identifying and compiling information relating to an entity for investigative analysis.
- Collection of all available digital records of an entity is useful for investigation, such as by the police department or a potential employer as part of a background check.
- each record may not be associated with the complete or correct identifying information for the entity, and different databases may have entirely different structures or ontologies, making collection of such records challenging.
- Various embodiments of the present disclosure can include systems, methods, and non-transitory computer readable media configured to identify and compile information relating to an entity for investigative analysis.
- the systems, methods, and non-transitory computer readable media are configured to implement a method that entails searching, in one or more data sources, with a plurality of known characteristics of an entity to obtain a first plurality of records, identifying, from the first plurality of records, a subset of records that match the known characteristics with a substantial confidence, compiling the subset of records to form a unified record representing the entity, and conducting a second search with information from the unified record to obtain a second plurality of search results.
- the method further comprises presenting, on an interface, at least part of the records from the first plurality and the second plurality, wherein the interface is configured to allow a user to annotate the records.
- the annotation comprises confirmation that a record is associated with the entity.
- the method further comprises storing the annotation in a library in a non-transitory medium.
- the method further comprises ranking the records before presenting the records on the interface.
- the records of the subset are those that have a perfect match to the known characteristics.
- the method further comprises generating variations of the known characteristics as additional queries for the search.
- FIG. 1 illustrates a procedure for obtaining, compiling and presenting information relating to an entity for user analysis.
- FIG. 2 illustrates a flowchart of an example method for obtaining, compiling and presenting information relating to an entity for user analysis.
- FIG. 3 is a block diagram that illustrates a computer system upon which any of the embodiments described herein may be implemented.
- Information relating to entities is scattered in different databases. Different records of an entity, such as financial transactions, are often stored individually rather than collectively, which makes the retrieval, visualization and analysis difficult for end users. Moreover, the entities in each record may be identified with different identifications or characteristics of the entity. This further presents a challenge for identifying all relevant information for effective investigation of the entity. Also, redundant or duplicative information about the entity can present challenges for data management or even skew the analysis. A solution is needed for identifying and compiling all available information relating to the entity and enabling an investigator to conduct investigation with the information.
- a method entails collecting basic information (e.g., name, phone number, date of birth, social security number, email address and address) of an entity and generating one or more search queries.
- the search queries are used to search against a plurality of data sources for exact as well as approximate matches.
- the different data sources may be de-centralized, or federated where no master data management systems or defined standards are employed to manage the data sources. All of the matches are collected, and those that most likely relate to the entity (e.g., having perfect matches on name and social security number) are combined into a single record. Optionally, identical records can be merged to remove redundancy.
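The optional merging of identical records can be sketched as follows. This is a minimal illustration, assuming each record is a flat dictionary of hashable attribute values; the function name `merge_identical_records` and the field names are hypothetical and do not come from the disclosure.

```python
def merge_identical_records(records):
    """Drop records that are attribute-for-attribute identical to an earlier one."""
    seen = set()
    unique = []
    for record in records:
        # Sort the items so that key order does not affect equality.
        key = tuple(sorted(record.items()))
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique


records = [
    {"name": "Jane Roe", "ssn": "123-45-6789"},
    {"ssn": "123-45-6789", "name": "Jane Roe"},   # same content, different key order
    {"name": "Jane Roe", "ssn": "123-45-6780"},   # differs in one digit, so it is kept
]
unique_records = merge_identical_records(records)
```

Only exact duplicates are removed here; records that differ in any attribute are kept so that later steps can decide how to treat the variation.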
- additional searches can be formulated based on the initial search results.
- the search query can include the basic information of the entity as used in the previous step and can also include relevant information returned from the last search. All of the search results can be combined and presented, on a user interface, to an investigator. The search results can be ranked based on significance or relevance, facilitating analysis by the investigator. The interface can also enable the investigator to annotate the search results, and delete certain results as needed. Any annotation or change that the investigator makes can be optionally saved to a library, which can be shared with other investigators or archived for future use.
- all or part of the search results can be added back to a data repository serving to enrich the knowledge of the entity.
- the addition and accumulation of such added information can improve further searches of the entity.
- an alert can be set up by a user such that a search can be carried out on a predetermined schedule and the search results can be presented to the user.
- only new results are returned to the user.
- the searches are automatically updated to incorporate additional information relating to the entity after such information becomes available from the searches.
- the information identified and compiled as described represents a comprehensive collection of information relating to the entity and all the records of the final results represent potential connections between the entity and activities which may be worth further investigation.
- the present technology, therefore, provides a fast, automated, convenient, and comprehensive method to compile information from different data sources relating to an entity, and to present to an investigator potential connections between data records for investigation.
- entity refers to any real world object that has attributes useful for identifying the object.
- An entity can be a person or an organization, and can also be an account, a place, or an event. Attributes for the entity include, for example, names, identification number, characteristics and address, without limitation.
- database may refer to any data structure for storing and/or organizing data, including, but not limited to, relational databases (Oracle database, MySQL database, Cassandra database, etc.), spreadsheets, XML files, and text files, among others.
- a database schema of a database system is its structure described in a formal language supported by the database management system.
- FIG. 1 illustrates a process for identifying and compiling information relating to an entity that is implemented by a computing system.
- the system receives one or more search queries relating to an entity 101. If the entity is a person, the search queries include certain basic attributes of the person, such as name, social security number, date of birth, email address, address, or passport number, or their combinations.
- the searches can be carried out on one or multiple platforms or multiple databases 110.
- Each database may have different schema, structure, or content of information. Nevertheless, each entry in a database that can be retrieved as relating to the entity can be commonly referred to as a “record”.
- a record for the entity may be a historic record of an action, such as a financial transaction, associated with the entity, or simply some basic information about the entity, e.g., being listed as a registered voter.
- the searches can be carried out asynchronously or synchronously, and in any manner suitable for the queries and the databases. In some embodiments, the search queries can be broadened to maximize the chance of returning potentially relevant records, such as by using variations of the attributes or wildcards.
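One way to broaden the queries with attribute variations and wildcards is sketched below. The function `generate_query_variations`, the `aliases` mapping, and the prefix-plus-`*` wildcard form are all assumptions made for illustration; the disclosure does not specify how variations are produced.

```python
from itertools import product


def generate_query_variations(attributes, aliases=None, wildcard_fields=("name",)):
    """Expand known attribute values into a broader set of search queries."""
    aliases = aliases or {}
    options = {}
    for field, value in attributes.items():
        variants = [value]
        # Known alternative spellings supplied by the caller.
        variants.extend(aliases.get(field, []))
        # Hypothetical wildcard form: first three characters followed by '*'.
        if field in wildcard_fields and len(value) > 3:
            variants.append(value[:3] + "*")
        options[field] = variants
    fields = list(options)
    # One query per combination of variants across all fields.
    return [dict(zip(fields, combo)) for combo in product(*(options[f] for f in fields))]


queries = generate_query_variations(
    {"name": "Jonathan Doe", "ssn": "123-45-6789"},
    aliases={"name": ["Jon Doe"]},
)
```

The number of queries grows multiplicatively with the number of variants per field, so a real system would likely cap or prioritize the combinations.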
- the multiple databases are parts of a de-centralized database system where a systemically designed database is partitioned into multiple portions each of which can be hosted at a different location.
- at least some of the databases are autonomous and thereby constitute part of a federated database system.
- a federated database system maps multiple autonomous and disparate databases into a single federated database.
- the disparate databases can be interconnected via a computer network and may be geographically decentralized. In some embodiments, there is no data integration between the disparate databases.
- the multiple databases are independently hosted and managed, and may have different access control.
- the multiple databases may be databases owned or managed by different banks, companies, or government agencies.
- the present technology, in some embodiments, is configured to interface with each of the disparate or independent databases to identify information that may be related to the entity.
- the searches will produce a number of records as potentially relating to the entity, e.g., records r1 through r7, as shown in FIG. 1.
- Upon retrieval of these records, which can be optionally saved in a computer medium or temporarily kept in the memory 120, the system can conduct certain basic and automated analysis of the records.
- records r1, r2 and r3 have close-to-perfect matches to the attributes of the entity, e.g., with total match of name, date of birth and social security number. Such records are considered to match the attributes of the entity used in the search query with a substantial confidence.
- “Matching with a substantial confidence” as used herein means that the similarity between one or more attributes (e.g., name, address, social security number) of a record in a search result and one or more attributes used in the search query is statistically significant.
- matching with substantial confidence requires a perfect match of at least one attribute.
- matching with substantial confidence requires a perfect match of at least two attributes.
- matching with substantial confidence requires a perfect match of at least one or two attributes and a partial match of another attribute with a mismatch of no more than one character (e.g., letter or digit), or no more than two characters.
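The matching rules above can be expressed as a small configurable check. This is a sketch under stated assumptions: attributes are compared as plain strings, a "partial match" is taken to mean equal length with at most `max_mismatch` differing characters, and the names `char_mismatches` and `matches_with_substantial_confidence` are hypothetical.

```python
def char_mismatches(a, b):
    """Count differing positions; values of unequal length are not comparable."""
    if len(a) != len(b):
        return None
    return sum(1 for x, y in zip(a, b) if x != y)


def matches_with_substantial_confidence(query, record, min_exact=2,
                                        min_partial=0, max_mismatch=1):
    """Return True if enough attributes match exactly and, optionally, partially."""
    exact = 0
    partial = 0
    for field, wanted in query.items():
        found = record.get(field)
        if found is None:
            continue  # attribute missing from the record: neither match nor mismatch
        if found == wanted:
            exact += 1
        else:
            diff = char_mismatches(found, wanted)
            if diff is not None and diff <= max_mismatch:
                partial += 1
    return exact >= min_exact and partial >= min_partial


query = {"name": "Jane Roe", "ssn": "123-45-6789", "dob": "1980-01-02"}
close = {"name": "Jane Roe", "ssn": "123-45-6789", "dob": "1980-01-03"}
weak = {"name": "Jane Roe", "ssn": "999-99-9999"}

close_ok = matches_with_substantial_confidence(query, close, min_exact=2, min_partial=1)
weak_ok = matches_with_substantial_confidence(query, weak, min_exact=2)
```

Setting `min_exact=1` and `min_partial=0` corresponds to the loosest embodiment described (a perfect match of at least one attribute), while `min_exact=2, min_partial=1, max_mismatch=1` corresponds to the strictest.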
- a “unified record” as used herein refers to a record generated by the system by compiling information from two or more records in the search results.
- the compilation can collapse attributes that are identical in all of the two or more records. For instance, if every record has the same social security number, then only one social security number needs to be saved in the unified record. On the other hand, for attributes that have variations, (e.g., different addresses or different spelling of first name), the variants are all saved in the unified record.
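The collapsing behaviour just described can be sketched as follows. The representation is an assumption: each source record is a flat dictionary, and the unified record maps each attribute to the list of distinct values seen, so a value shared by every record appears once while variants are all retained. The function name `compile_unified_record` is hypothetical.

```python
def compile_unified_record(records):
    """Merge records into one, keeping each distinct attribute value once."""
    unified = {}
    for record in records:
        for field, value in record.items():
            values = unified.setdefault(field, [])
            if value not in values:
                values.append(value)
    return unified


r1 = {"name": "Jane Roe", "ssn": "123-45-6789", "address": "1 Main St"}
r2 = {"name": "Jane Roe", "ssn": "123-45-6789", "address": "9 Oak Ave"}
r3 = {"name": "Jayne Roe", "ssn": "123-45-6789"}

unified = compile_unified_record([r1, r2, r3])
# unified["ssn"] holds a single value; unified["name"] and unified["address"] hold variants.
```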
- the system conducts automatic compilation for records r1-r3.
- the compiled record likely includes additional information about the entity that was not apparent before the search. For instance, the search by social security number may return aliases or secondary addresses of the entity. The search by name may return fraudulent passport numbers used by the entity. Such additional information (see underlined words in 122) can then be used for a second round of searches. It is likely that the first round of search will return a large number of records, some important ones of which may be presented late in the list or buried in the list. The second round of search can likely bring such records back to a user's attention. Without limitation, a third, fourth or even more rounds of searches can be carried out to further enrich or refine the information relating to the entity.
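Deriving second-round queries from newly learned information might look like the sketch below. It assumes a unified record that maps each attribute to a list of distinct values and an original query that maps each attribute to a single value; the function names and the one-attribute-per-query strategy are illustrative choices, not taken from the disclosure.

```python
def new_information(unified, original_query):
    """Return attribute values in the unified record that the first query did not use."""
    extra = {}
    for field, values in unified.items():
        known = original_query.get(field)
        unseen = [v for v in values if v != known]
        if unseen:
            extra[field] = unseen
    return extra


def second_round_queries(unified, original_query):
    """Build one follow-up query per newly discovered attribute value."""
    queries = []
    for field, values in new_information(unified, original_query).items():
        for value in values:
            queries.append({field: value})
    return queries


original = {"name": "Jane Roe", "ssn": "123-45-6789"}
unified = {
    "name": ["Jane Roe", "J. Roe"],
    "ssn": ["123-45-6789"],
    "address": ["1 Main St", "9 Oak Ave"],
}
follow_ups = second_round_queries(unified, original)
```

The same two functions can be applied again after each round, which gives the third and later rounds of searching mentioned above.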
- the system now has collected information relating to the entity, with each record providing potentially relevant connection to activities of the entity, such as suspicious financial activities.
- the entity is now represented by all the information compiled from the search results relating to the entity.
- Each record represents a “potential connection” between the entity and the activity.
- the system can present the records on a user interface (e.g., 131) to a user.
- the system can rank the records before presenting them on the user interface. The ranking method may be dependent upon the type of the entity. For instance, for a system that is set up to detect suspicious activities, a record that includes an activity will be ranked higher than a record that only includes basic information about the entity.
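The ranking example given above (activity records ahead of basic-information records) can be sketched with a stable sort. The `activity` field and the function name `rank_records` are assumptions for illustration; a real ranking would presumably weigh more signals.

```python
def rank_records(records):
    """Place records that describe an activity ahead of basic-information records."""
    # sorted() is stable, so records keep their original order within each group.
    return sorted(records, key=lambda r: 0 if r.get("activity") else 1)


records = [
    {"id": "r4"},                                # basic information only
    {"id": "r5", "activity": "wire transfer"},
    {"id": "r7"},
    {"id": "r6", "activity": "cash deposit"},
]
ranked_ids = [r["id"] for r in rank_records(records)]
```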
- the interface can optionally further enable the user to mark or annotate the records (as illustrated in FIG. 1).
- the user can mark a record, say r6, as not relevant to the entity by checking the content of the record, thereby allowing the record to be deleted from the system.
- the user may also mark a record, say r11, as highly relevant to the entity and as including important information for further investigation. Such marking also confirms the record as a potential connection to the entity.
- this technology provides an efficient approach to build a comprehensive repository of information relating to an entity of interest, and establish potential connections between the entity and activities or transactions of value for further investigation.
- the system can optionally record the annotation in a library for future use or to be shared with other users.
- the annotations can also serve as feedback for the search and be used to improve the search algorithm.
- the annotation can further trigger another round of search with information identified by the user as highly important or relevant.
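A minimal sketch of an annotation library follows, assuming annotations are keyed by a record identifier and serialized as JSON for sharing. The labels `not_relevant` and `highly_relevant`, the `id` field, and all function names are hypothetical.

```python
import json


def annotate(library, record_id, label, note=""):
    """Store or overwrite the annotation for one record."""
    library[record_id] = {"label": label, "note": note}
    return library


def drop_irrelevant(records, library):
    """Remove records that a user has marked as not relevant."""
    return [r for r in records
            if library.get(r["id"], {}).get("label") != "not_relevant"]


def export_library(library):
    """Serialize the library so it can be archived or shared."""
    return json.dumps(library, sort_keys=True)


library = {}
annotate(library, "r6", "not_relevant")
annotate(library, "r11", "highly_relevant", note="follow up on this transaction")

records = [{"id": "r5"}, {"id": "r6"}, {"id": "r11"}]
kept_ids = [r["id"] for r in drop_irrelevant(records, library)]
shared = json.loads(export_library(library))
```

Records labeled as highly relevant could also be fed into a further round of searching, as the preceding passage suggests.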
- Search alerts can be generated automatically or upon user request, in some embodiments.
- upon completion of a search for an entity, the user can request to save the search as an alert.
- the search will be automated by the system at a default schedule (e.g., daily or weekly) or at a schedule set by the user. If the scheduled search returns information that has a timestamp newer than the previous search time, then an alert is sent (e.g., by email) to the user with the new information. Alternatively, in another example, the new search result is compared to the previous one and any new information is included in the alert.
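The two ways of detecting new information for an alert, by timestamp and by comparison with the previous result set, can be sketched as follows. The `id` and `timestamp` fields and both function names are assumptions made for illustration.

```python
from datetime import datetime


def new_by_timestamp(results, last_search_time):
    """Keep results stamped later than the previous search."""
    return [r for r in results if r["timestamp"] > last_search_time]


def new_by_comparison(results, previous_results):
    """Keep results whose identifier did not appear in the previous search."""
    seen = {r["id"] for r in previous_results}
    return [r for r in results if r["id"] not in seen]


previous = [{"id": "r1", "timestamp": datetime(2019, 1, 5)}]
current = [
    {"id": "r1", "timestamp": datetime(2019, 1, 5)},
    {"id": "r8", "timestamp": datetime(2019, 1, 12)},
    {"id": "r9", "timestamp": datetime(2018, 12, 30)},  # old record, newly discovered
]

by_time = [r["id"] for r in new_by_timestamp(current, datetime(2019, 1, 7))]
by_diff = [r["id"] for r in new_by_comparison(current, previous)]
```

The comparison approach also surfaces older records that only became discoverable after the previous search, which the timestamp approach would miss.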
- the search can be automatically updated, after each search, to include newly discovered information relating to the entity, such as information with a high confidence and/or relevance level.
- the update requires confirmation or optimization by the user.
- a search can be requested based on a complex subject.
- complex subject refers to a collection of different types of entities, such as a case report, a transaction record, or security log.
- the transaction record may include identifying information of multiple persons (e.g., name and SSN), multiple accounts (e.g., account type and number), and locations of transactions (e.g., address, zip code, and branch name). Each of these entities can be subject to a search.
- when a user enters such a complex subject for the search, the system is configured to identify and extract some or all of the entities included in the complex subject and conduct a search for each of the entities. Upon completion of all the searches, the system can compile the search results and present them to the user, optionally in a single feed. In some embodiments, the system can use information from the complex subject and/or the search results to understand the relationships between the entities and thereby compile and/or present the search results taking advantage of the knowledge of such relationships.
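Handling a complex subject can be sketched as entity extraction followed by one search per entity, with the results gathered into a single feed. The sketch assumes the complex subject has already been parsed into a dictionary with `persons`, `accounts`, and `locations` lists; that structure, the function names, and the caller-supplied `search_fn` are all hypothetical.

```python
def extract_entities(subject):
    """List every entity in a structured complex subject with its type."""
    entities = []
    for kind in ("persons", "accounts", "locations"):
        for attrs in subject.get(kind, []):
            entities.append({"type": kind, "attributes": attrs})
    return entities


def search_complex_subject(subject, search_fn):
    """Search each extracted entity and combine the results into one feed."""
    feed = []
    for entity in extract_entities(subject):
        for record in search_fn(entity["attributes"]):
            # Keep the originating entity so relationships can be shown later.
            feed.append({"entity": entity, "record": record})
    return feed


transaction = {
    "persons": [{"name": "Jane Roe", "ssn": "123-45-6789"}],
    "accounts": [{"type": "checking", "number": "0042"}],
    "locations": [{"zip": "10001"}],
}


def fake_search(attributes):
    """Stand-in for a real data-source search; returns one record per query."""
    return [{"matched_on": sorted(attributes)}]


feed = search_complex_subject(transaction, fake_search)
```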
- FIG. 2 illustrates a flowchart of an example method 200 for identifying and compiling information relating to an entity for investigative analysis, according to various embodiments of the present disclosure.
- the method 200 may be implemented in various environments including, for example, the system of FIG. 3 .
- the operations of method 200 presented below are intended to be illustrative. Depending on the implementation, the example method 200 may include additional, fewer, or alternative steps performed in various orders or in parallel.
- the example method 200 may be implemented in various computing systems or devices including one or more processors.
- a computer system receives name, identification or another basic characteristic of an entity as keywords for a search for information relating to the entity.
- the system generates one or more search queries, optionally with variations of the keywords, and then at block 205, the system conducts searches in one or more data sources with the search queries.
- some search results would have a near-perfect match to the basic information of the entity. Such matches are identified and compiled to form a compiled record representing the entity (block 207).
- additional information from the search results is selected to be used for a second round of searches, followed by the actual searches (block 211). With the second round of searches, all the search results can be presented to a user for further investigation and analysis.
- the search results are ranked to facilitate the user analysis (block 213).
- the system can update the search results with respect to potential connection to the entity (block 215).
- the techniques described herein are implemented by one or more special-purpose computing devices.
- the special-purpose computing devices may be hard-wired to perform the techniques, or may include circuitry or digital electronic devices such as one or more application-specific integrated circuits (ASICs) or field programmable gate arrays (FPGAs) that are persistently programmed to perform the techniques, or may include one or more hardware processors programmed to perform the techniques pursuant to program instructions in firmware, memory, other storage, or a combination.
- Such special-purpose computing devices may also combine custom hard-wired logic, ASICs, or FPGAs with custom programming to accomplish the techniques.
- the special-purpose computing devices may be desktop computer systems, server computer systems, portable computer systems, handheld devices, networking devices or any other device or combination of devices that incorporate hard-wired and/or program logic to implement the techniques.
- Computing device(s) are generally controlled and coordinated by operating system software, such as iOS, Android, Chrome OS, Windows XP, Windows Vista, Windows 7, Windows 8, Windows Server, Windows CE, Unix, Linux, SunOS, Solaris, Blackberry OS, VxWorks, or other compatible operating systems.
- the computing device may be controlled by a proprietary operating system.
- Conventional operating systems control and schedule computer processes for execution, perform memory management, provide file system, networking, I/O services, and provide a user interface functionality, such as a graphical user interface (“GUI”), among other things.
- FIG. 3 is a block diagram that illustrates a computer system 300 upon which any of the embodiments described herein may be implemented.
- the computer system 300 includes a bus 302 or other communication mechanism for communicating information, and one or more hardware processors 304 coupled with bus 302 for processing information.
- Hardware processor(s) 304 may be, for example, one or more general purpose microprocessors.
- the computer system 300 also includes a main memory 306, such as a random access memory (RAM), cache and/or other dynamic storage devices, coupled to bus 302 for storing information and instructions to be executed by processor 304.
- Main memory 306 also may be used for storing temporary variables or other intermediate information during execution of instructions to be executed by processor 304.
- Such instructions, when stored in storage media accessible to processor 304, render computer system 300 into a special-purpose machine that is customized to perform the operations specified in the instructions.
- the computer system 300 further includes a read only memory (ROM) 308 or other static storage device coupled to bus 302 for storing static information and instructions for processor 304.
- a storage device 310, such as a magnetic disk, optical disk, or USB thumb drive (Flash drive), etc., is provided and coupled to bus 302 for storing information and instructions.
- the computer system 300 may be coupled via bus 302 to a display 312, such as a cathode ray tube (CRT) or LCD display (or touch screen), for displaying information to a computer user.
- An input device 314 is coupled to bus 302 for communicating information and command selections to processor 304.
- Another type of user input device is cursor control 316, such as a mouse, a trackball, or cursor direction keys, for communicating direction information and command selections to processor 304 and for controlling cursor movement on display 312.
- This input device typically has two degrees of freedom in two axes, a first axis (e.g., x) and a second axis (e.g., y), that allows the device to specify positions in a plane.
- the same direction information and command selections as cursor control may be implemented via receiving touches on a touch screen without a cursor.
- the computing system 300 may include a user interface module to implement a GUI that may be stored in a mass storage device as executable software codes that are executed by the computing device(s).
- This and other modules may include, by way of example, components, such as software components, object-oriented software components, class components and task components, processes, functions, attributes, procedures, subroutines, segments of program code, drivers, firmware, microcode, circuitry, data, databases, data structures, tables, arrays, and variables.
- module refers to logic embodied in hardware or firmware, or to a collection of software instructions, possibly having entry and exit points, written in a programming language, such as, for example, Java, C or C++.
- a software module may be compiled and linked into an executable program, installed in a dynamic link library, or may be written in an interpreted programming language such as, for example, BASIC, Perl, or Python. It will be appreciated that software modules may be callable from other modules or from themselves, and/or may be invoked in response to detected events or interrupts.
- Software modules configured for execution on computing devices may be provided on a computer readable medium, such as a compact disc, digital video disc, flash drive, magnetic disc, or any other tangible medium, or as a digital download (and may be originally stored in a compressed or installable format that requires installation, decompression or decryption prior to execution).
- Such software code may be stored, partially or fully, on a memory device of the executing computing device, for execution by the computing device.
- Software instructions may be embedded in firmware, such as an EPROM.
- hardware modules may be comprised of connected logic units, such as gates and flip-flops, and/or may be comprised of programmable units, such as programmable gate arrays or processors.
- the modules or computing device functionality described herein are preferably implemented as software modules, but may be represented in hardware or firmware. Generally, the modules described herein refer to logical modules that may be combined with other modules or divided into sub-modules despite their physical organization or storage.
- the computer system 300 may implement the techniques described herein using customized hard-wired logic, one or more ASICs or FPGAs, firmware and/or program logic which in combination with the computer system causes or programs computer system 300 to be a special-purpose machine. According to one embodiment, the techniques herein are performed by computer system 300 in response to processor(s) 304 executing one or more sequences of one or more instructions contained in main memory 306. Such instructions may be read into main memory 306 from another storage medium, such as storage device 310. Execution of the sequences of instructions contained in main memory 306 causes processor(s) 304 to perform the process steps described herein. In alternative embodiments, hard-wired circuitry may be used in place of or in combination with software instructions.
- non-transitory media refers to any media that store data and/or instructions that cause a machine to operate in a specific fashion. Such non-transitory media may comprise non-volatile media and/or volatile media.
- Non-volatile media includes, for example, optical or magnetic disks, such as storage device 310.
- Volatile media includes dynamic memory, such as main memory 306.
- non-transitory media include, for example, a floppy disk, a flexible disk, hard disk, solid state drive, magnetic tape, or any other magnetic data storage medium, a CD-ROM, any other optical data storage medium, any physical medium with patterns of holes, a RAM, a PROM, an EPROM, a FLASH-EPROM, NVRAM, any other memory chip or cartridge, and networked versions of the same.
- Non-transitory media is distinct from but may be used in conjunction with transmission media.
- Transmission media participates in transferring information between non-transitory media.
- transmission media includes coaxial cables, copper wire and fiber optics, including the wires that comprise bus 302.
- transmission media can also take the form of acoustic or light waves, such as those generated during radio-wave and infra-red data communications.
- Various forms of media may be involved in carrying one or more sequences of one or more instructions to processor 304 for execution.
- the instructions may initially be carried on a magnetic disk or solid state drive of a remote computer.
- the remote computer can load the instructions into its dynamic memory and send the instructions over a telephone line using a modem.
- a modem local to computer system 300 can receive the data on the telephone line and use an infra-red transmitter to convert the data to an infra-red signal.
- An infra-red detector can receive the data carried in the infra-red signal and appropriate circuitry can place the data on bus 302.
- Bus 302 carries the data to main memory 306, from which processor 304 retrieves and executes the instructions.
- the instructions received by main memory 306 may optionally be stored on storage device 310 either before or after execution by processor 304.
- the computer system 300 also includes a communication interface 318 coupled to bus 302 .
- Communication interface 318 provides a two-way data communication coupling to one or more network links that are connected to one or more local networks.
- communication interface 318 may be an integrated services digital network (ISDN) card, cable modem, satellite modem, or a modem to provide a data communication connection to a corresponding type of telephone line.
- communication interface 318 may be a local area network (LAN) card to provide a data communication connection to a compatible LAN (or WAN component to communicate with a WAN).
- Wireless links may also be implemented.
- Communication interface 318 sends and receives electrical, electromagnetic or optical signals that carry digital data streams representing various types of information.
- A network link typically provides data communication through one or more networks to other data devices.
- A network link may provide a connection through a local network to a host computer or to data equipment operated by an Internet Service Provider (ISP).
- The ISP in turn provides data communication services through the worldwide packet data communication network now commonly referred to as the “Internet”.
- Local network and Internet both use electrical, electromagnetic or optical signals that carry digital data streams.
- The signals through the various networks and the signals on the network link and through communication interface 318, which carry the digital data to and from computer system 300, are example forms of transmission media.
- The computer system 300 can send messages and receive data, including program code, through the network(s), network link and communication interface 318.
- A server might transmit a requested code for an application program through the Internet, the ISP, the local network and the communication interface 318.
- The received code may be executed by processor 304 as it is received, and/or stored in storage device 310, or other non-volatile storage for later execution.
- Conditional language such as, among others, “can,” “could,” “might,” or “may,” unless specifically stated otherwise, or otherwise understood within the context as used, is generally intended to convey that certain embodiments include, while other embodiments do not include, certain features, elements and/or steps. Thus, such conditional language is not generally intended to imply that features, elements and/or steps are in any way required for one or more embodiments or that one or more embodiments necessarily include logic for deciding, with or without user input or prompting, whether these features, elements and/or steps are included or are to be performed in any particular embodiment.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Databases & Information Systems (AREA)
- Data Mining & Analysis (AREA)
- General Engineering & Computer Science (AREA)
- Computational Linguistics (AREA)
- Business, Economics & Management (AREA)
- Accounting & Taxation (AREA)
- Development Economics (AREA)
- Finance (AREA)
- Strategic Management (AREA)
- Entrepreneurship & Innovation (AREA)
- Game Theory and Decision Science (AREA)
- Economics (AREA)
- Marketing (AREA)
- General Business, Economics & Management (AREA)
- Library & Information Science (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Stored Programmes (AREA)
Abstract
Description
- This application is a continuation of U.S. application Ser. No. 15/590,956, filed on May 9, 2017, which claims the benefit under 35 U.S.C. § 119(e) of the United States Provisional Application Ser. No. 62/434,936 filed Dec. 15, 2016, the content of which is hereby incorporated by reference in its entirety.
- This disclosure relates to approaches for identifying and compiling information relating to an entity for investigative analysis.
- Collection of all available digital records of an entity is useful for investigation, such as by the police department or a potential employer as part of a background check. There is no centralized database that includes all of the relevant records. Further, each record may not be associated with the complete or correct identifying information for the entity, and different databases may have entirely different structure or ontology, making collection of such records challenging.
- Various embodiments of the present disclosure can include systems, methods, and non-transitory computer readable media configured to identify and compile information relating to an entity for investigative analysis. In some embodiments, the systems, methods, and non-transitory computer readable media are configured to implement a method that entails searching, in one or more data sources, with a plurality of known characteristics of an entity to obtain a first plurality of records, identifying, from the first plurality of records, a subset of records that match the known characteristics with a substantial confidence, compiling the subset of records to form a unified record representing the entity, and conducting a second search with information from the unified record to obtain a second plurality of search results.
- In some embodiments, the method further comprises presenting, on an interface, at least part of the records from the first plurality and the second plurality, wherein the interface is configured to allow a user to annotate the records. In some embodiments, the annotation comprises confirmation that a record is associated with the entity. In some embodiments, the method further comprises storing the annotation in a library in a non-transitory medium. In some embodiments, the method further comprises ranking the records before presenting the records on the interface.
- In some embodiments, the records of the subset are those that have a perfect match to the known characteristics. In some embodiments, the method further comprises generating variations of the known characteristics as additional queries for the search.
- These and other features of the systems, methods, and non-transitory computer readable media disclosed herein, as well as the methods of operation and functions of the related elements of structure and the combination of parts and economies of manufacture, will become more apparent upon consideration of the following description and the appended claims with reference to the accompanying drawings, all of which form a part of this specification, wherein like reference numerals designate corresponding parts in the various figures. It is to be expressly understood, however, that the drawings are for purposes of illustration and description only and are not intended as a definition of the limits of the invention.
- Certain features of various embodiments of the present technology are set forth with particularity in the appended claims. A better understanding of the features and advantages of the technology will be obtained by reference to the following detailed description that sets forth illustrative embodiments, in which the principles of the invention are utilized, and the accompanying drawings of which:
- FIG. 1 illustrates a procedure for obtaining, compiling and presenting information relating to an entity for user analysis.
- FIG. 2 illustrates a flowchart of an example method for obtaining, compiling and presenting information relating to an entity for user analysis.
- FIG. 3 is a block diagram that illustrates a computer system upon which any of the embodiments described herein may be implemented.
- The figures depict various embodiments of the disclosed technology for purposes of illustration only, wherein the figures use like reference numerals to identify like elements. One skilled in the art will readily recognize from the following discussion that alternative embodiments of the structures and methods illustrated in the figures can be employed without departing from the principles of the disclosed technology described herein.
- Information relating to entities, such as a subject of an investigation, is scattered in different databases. Different records of an entity, such as financial transactions, are often stored individually rather than collectively, which makes the retrieval, visualization and analysis difficult for end users. Moreover, the entities in each record may be identified with different identifications or characteristics of the entity. This further presents a challenge for identifying all relevant information for effective investigation of the entity. Also, redundant or duplicative information about the entity can present challenges for data management or even skew the analysis. A solution is needed for identifying and compiling all available information relating to the entity and enabling an investigator to conduct investigation with the information.
- A claimed solution rooted in computer technology overcomes problems specifically arising in the realm of computer technology. In various implementations, a method entails collecting basic information (e.g., name, phone number, date of birth, social security number, email address and address) of an entity and generating one or more search queries. The search queries are used to search against a plurality of data sources for exact as well as approximate matches. The different data sources may be de-centralized, or federated where no master data management systems or defined standards are employed to manage the data sources. All of the matches are collected, and those that most likely relate to the entity (e.g., having perfect matches on name and social security number) are combined into a single record. Optionally, identical records can be merged to remove redundancy.
- In some embodiments, additional searches can be formulated based on the initial search results. The search query can include the basic information of the entity as used in the previous step and can also include relevant information returned from the last search. All of the search results can be combined and presented, on a user interface, to an investigator. The search results can be ranked based on significance or relevance, facilitating analysis by the investigator. The interface can also enable the investigator to annotate the search results, and delete certain results as needed. Any annotation or change that the investigator makes can be optionally saved to a library, which can be shared with other investigators or archived for future use.
- In some embodiments, all or part of the search results can be added back to a data repository serving to enrich the knowledge of the entity. The addition and accumulation of such added information can improve further searches of the entity. In some embodiments, an alert can be set up by a user such that a search can be carried out on a predetermined schedule and the search results can be presented to the user. In some embodiments, only new results are returned to the user. In some embodiments, the searches are automatically updated to incorporate additional information relating to the entity after such information becomes available from the searches.
- The information identified and compiled as described represents a comprehensive collection of information relating to the entity and all the records of the final results represent potential connections between the entity and activities which may be worth further investigation. The present technology, therefore, provides a fast, automated, convenient, and comprehensive method to compile information from different data sources relating to an entity, and to present to an investigator potential connections between data records for investigation.
- The term “entity” refers to any real world object that has attributes useful for identifying the object. An entity can be a person or an organization, and can also be an account, a place, or an event. Attributes for the entity include, for example, names, identification number, characteristics and address, without limitation.
- The term “database” may refer to any data structure for storing and/or organizing data, including, but not limited to, relational databases (Oracle database, MySQL database, Cassandra database, etc.), spreadsheets, XML files, and text files, among others. In some embodiments, a database schema of a database system is its structure described in a formal language supported by the database management system.
- FIG. 1 illustrates a process for identifying and compiling information relating to an entity that is implemented by a computing system. The system receives one or more search queries relating to an entity 101. If the entity is a person, the search queries include certain basic attributes of the person, such as name, social security number, date of birth, email address, address, or passport number, or their combinations.
- The searches can be carried out on one or multiple platforms or multiple databases 110. Each database may have a different schema, structure, or content of information. Nevertheless, each entry in a database that can be retrieved as relating to the entity can be commonly referred to as a “record”. A record for the entity may be a historic record of an action, such as a financial transaction, associated with the entity, or simply some basic information about the entity, e.g., being listed as a registered voter. The searches can be carried out asynchronously or synchronously, and in any manner suitable for the queries and the databases. In some embodiments, the search queries can be broadened to maximize the chance of returning potentially relevant records, such as by using variations of the attributes or wild-cards.
- The multiple databases, in some embodiments, are parts of a de-centralized database system where a systemically designed database is partitioned into multiple portions each of which can be hosted at a different location. In some embodiments, at least some of the databases are autonomous, thereby constituting part of a federated database system. A federated database system maps multiple autonomous and disparate databases into a single federated database. The disparate databases can be interconnected via a computer network and may be geographically decentralized. In some embodiments, there is no data integration between the disparate databases.
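- The query broadening described above (variations of the attributes, wild-cards) could be sketched roughly as follows. This is a minimal illustration only, not the disclosed implementation: the function names `name_variants` and `build_queries`, the `*` wildcard syntax, and the particular variations chosen are all assumptions made for the example.

```python
def name_variants(full_name: str) -> set[str]:
    """Return a few simple variations of a personal name (illustrative only)."""
    parts = full_name.split()
    if len(parts) < 2:
        # Single-token names have no obvious reordering or initial form.
        return {full_name}
    first, last = parts[0], parts[-1]
    return {
        full_name,              # as entered
        f"{last}, {first}",     # "Last, First" ordering
        f"{first[0]}. {last}",  # first initial only
        f"{first[0]}* {last}",  # hypothetical wildcard on the first name
    }


def build_queries(attributes: dict[str, str]) -> list[dict[str, str]]:
    """Produce one query per name variant, keeping the other attributes fixed."""
    queries = []
    for variant in sorted(name_variants(attributes["name"])):
        query = dict(attributes)  # copy so the caller's dict is not mutated
        query["name"] = variant
        queries.append(query)
    return queries


if __name__ == "__main__":
    for q in build_queries({"name": "Jane Doe", "ssn": "123-45-6789"}):
        print(q)
```

Each generated query could then be submitted to every data source; how a given source interprets the wildcard is left open here.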
- In some embodiments, at least some of the multiple databases are independently hosted and managed, and may have different access control. For instance, the multiple databases may be databases owned or managed by different banks, companies, or government agencies. The present technology, in some embodiments, is configured to interface with each of the disparate or independent databases to identify information that may be related to the entity.
- The searches will produce a number of records as potentially relating to the entity, e.g., records r1 through r7, as shown in FIG. 1. Upon retrieval of these records, which can be optionally saved in a computer medium or temporarily kept in the memory 120, the system can conduct certain basic and automated analysis of the records. In the example of FIG. 1, records r1, r2 and r3 have close-to-perfect matches to the attributes of the entity, e.g., with total match of name, date of birth and social security number. Such records are considered to match the attributes of the entity used in the search query with a substantial confidence. “Matching with a substantial confidence” as used herein means that the similarity between one or more attributes (e.g., name, address, social security number) of a record in a search result and one or more attributes used in the search query is statistically significant. In one embodiment, matching with substantial confidence requires a perfect match of at least one attribute. In another embodiment, matching with substantial confidence requires a perfect match of at least two attributes. In one embodiment, matching with substantial confidence requires a perfect match of at least one or two attributes and a partial match of another attribute with a mismatch of no more than one character (e.g., letter or digit), or no more than two characters.
- The records that are matched to the entity with a substantial confidence can be considered as belonging to the entity and thus all the information from the records can be combined into a unified record. Optionally, during the compilation, redundant information or records can be merged to reduce redundancy. A “unified record” as used herein refers to a record generated by the system by compiling information from two or more records in the search results. The compilation can collapse attributes that are identical in all of the two or more records. For instance, if every record has the same social security number, then only one social security number needs to be saved in the unified record. On the other hand, for attributes that have variations (e.g., different addresses or different spellings of first name), the variants are all saved in the unified record.
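- As a rough sketch of the two steps just described, the following example tests whether a record matches a query with substantial confidence and then compiles the matching records into a unified record. The specific rule used (at least `min_exact` perfect matches, and every other shared attribute differing by at most `max_typos` characters) is one assumed reading of the embodiments above, and all function names, parameters and field names are hypothetical.

```python
def char_mismatches(a: str, b: str) -> int:
    """Count differing positions; any length difference also counts as mismatches."""
    diff = sum(x != y for x, y in zip(a, b))
    return diff + abs(len(a) - len(b))


def matches_with_confidence(query: dict, record: dict,
                            min_exact: int = 2, max_typos: int = 1) -> bool:
    """Assumed rule: enough perfect matches, and no shared attribute that is
    off by more than `max_typos` characters."""
    shared = [k for k in query if k in record]
    exact = [k for k in shared if query[k] == record[k]]
    near = [k for k in shared
            if k not in exact and char_mismatches(query[k], record[k]) <= max_typos]
    return len(exact) >= min_exact and len(exact) + len(near) == len(shared)


def compile_unified(records: list[dict]) -> dict[str, list]:
    """Collapse identical attribute values and keep every distinct variant."""
    unified: dict[str, list] = {}
    for record in records:
        for key, value in record.items():
            values = unified.setdefault(key, [])
            if value not in values:  # identical values are stored only once
                values.append(value)
    return unified


if __name__ == "__main__":
    query = {"name": "Jane Doe", "dob": "1980-01-02", "ssn": "123-45-6789"}
    results = [
        {"name": "Jane Doe", "dob": "1980-01-02", "ssn": "123-45-6789", "address": "1 Main St"},
        {"name": "Jane Doe", "dob": "1980-01-02", "ssn": "123-45-6780", "address": "9 Oak Ave"},
        {"name": "John Roe", "dob": "1975-05-05", "ssn": "123-45-6789"},
    ]
    confident = [r for r in results if matches_with_confidence(query, r)]
    print(compile_unified(confident))
```

Storing each attribute as a list of distinct values keeps a shared social security number as a single entry while preserving both addresses.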
- As shown in FIG. 1, the system conducts automatic compilation for records r1-r3. The compiled record likely includes additional information about the entity that was not apparent before the search. For instance, the search by social security number may return aliases or secondary addresses of the entity. The search by name may return fraudulent passport numbers used by the entity. Such additional information (see underlined words in 122) can then be used for a second round of searches. It is likely that the first round of search will return a large number of records, some important ones of which may be presented late in the list or buried in the list. The second round of search can likely bring such records back to a user's attention. Without limitation, a third, fourth or even more rounds of searches can be carried out to further enrich or refine the information relating to the entity.
- With the two or more rounds of searches, the system now has collected information relating to the entity, with each record providing potentially relevant connection to activities of the entity, such as suspicious financial activities. In this context, the entity is now represented by all the information compiled from the search results relating to the entity. Each record represents a “potential connection” between the entity and the activity.
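- One way to picture how a second round of searches is seeded is to diff the unified record against the original query and turn each newly learned value into a follow-up query. The sketch below assumes the unified record stores each attribute as a list of distinct values; the function name `follow_up_queries` and the one-attribute-per-query shape are illustrative assumptions.

```python
def follow_up_queries(original: dict, unified: dict[str, list]) -> list[dict]:
    """Build one follow-up query per attribute value that the first
    round of searches revealed and the original query did not contain."""
    queries = []
    for key, values in unified.items():
        for value in values:
            if original.get(key) != value:  # value is new information
                queries.append({key: value})
    return queries


if __name__ == "__main__":
    original = {"name": "Jane Doe", "ssn": "123-45-6789"}
    unified = {
        "name": ["Jane Doe", "J. Smith"],  # an alias surfaced by the SSN search
        "ssn": ["123-45-6789"],
        "address": ["1 Main St"],          # an address not known beforehand
    }
    for q in follow_up_queries(original, unified):
        print(q)
```

The same step can be repeated after each round, so that a third or fourth round is seeded by whatever the previous round added to the unified record.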
- Building and confirming the potential connections can benefit from human input. To this end, the system can present the records on a user interface (e.g., 131) to a user. To further facilitate user analysis, the system can rank the records before presenting them on the user interface. The ranking method may be dependent upon the type of the entity. For instance, for a system that is set up to detect suspicious activities, a record that includes an activity will be ranked higher than a record that only includes basic information about the entity.
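- The ranking idea in the preceding paragraph can be illustrated with a simple sort key that places records describing an activity ahead of records holding only basic information. The field names `activity` and `match_score` are assumptions for the example; the disclosure does not prescribe a record layout or a scoring formula.

```python
def rank_records(records: list[dict]) -> list[dict]:
    """Sort so that activity records come first, then by descending match score."""
    def score(record: dict) -> tuple:
        has_activity = "activity" in record
        # Tuples compare element by element: activity flag first, then score.
        return (has_activity, record.get("match_score", 0.0))
    return sorted(records, key=score, reverse=True)


if __name__ == "__main__":
    records = [
        {"id": "r4", "match_score": 0.9},                               # basic info only
        {"id": "r5", "activity": "wire transfer", "match_score": 0.6},
        {"id": "r7", "activity": "cash deposit", "match_score": 0.8},
    ]
    print([r["id"] for r in rank_records(records)])
```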
- When the records, preferably sorted, are presented on the user interface, the interface can optionally further enable the user to mark or annotate the records (as illustrated in FIG. 1). The user can mark a record, say r6, as not relevant to the entity by checking the content of the record, thereby allowing the record to be deleted from the system. The user may also mark a record, say r11, as highly relevant to the entity and as including important information for further investigation. Such marking also confirms the record as a potential connection to the entity.
- With the automated search process carried out by the system and the further input facilitated by the interface provided by the system, this technology provides an efficient approach to build a comprehensive repository of information relating to an entity of interest, and establish potential connections between the entity and activities or transactions of value for further investigation.
- Further, upon receiving the annotations from the user, the system can optionally record the annotations in a library for future use or to be shared with other users. The annotations can also serve as feedback for the search and be used to improve the search algorithm. The annotations can also trigger another round of search with information identified by the user as highly important or relevant.
- Search alerts can be generated automatically or upon user request, in some embodiments. In one example, upon completion of a search for an entity, the user can request to save the search as an alert. Accordingly, in some embodiments, the search will be automated by the system at a default schedule (e.g., daily or weekly) or at a schedule set by the user. If the scheduled search returns information that has a timestamp newer than the previous search time, then an alert is sent (e.g., by email) to the user with the new information. Alternatively, in another example, the new search result is compared to the previous one and any new information is included in the alert.
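- The two alerting strategies described above (filtering by timestamp, or comparing against the previous result set) might look like the following sketch. The record fields `id` and `timestamp` and both function names are assumptions for illustration; scheduling and email delivery are deliberately left out.

```python
from datetime import datetime


def new_by_timestamp(results: list[dict], last_search_time: datetime) -> list[dict]:
    """Keep only results stamped after the previous search ran."""
    return [r for r in results if r["timestamp"] > last_search_time]


def new_by_diff(results: list[dict], previous_ids: set) -> list[dict]:
    """Keep only results whose identifier was absent from the previous search."""
    return [r for r in results if r["id"] not in previous_ids]


if __name__ == "__main__":
    last_run = datetime(2024, 1, 1)
    results = [
        {"id": "a", "timestamp": datetime(2023, 12, 31)},
        {"id": "b", "timestamp": datetime(2024, 1, 2)},
    ]
    print(new_by_timestamp(results, last_run))  # strategy 1: newer timestamp
    print(new_by_diff(results, {"a"}))          # strategy 2: not seen before
```

The diff-based variant also catches records that were back-dated or lack a reliable timestamp, at the cost of remembering the identifiers returned by the previous search.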
- In some embodiments, the search can be automatically updated, after each search, to include newly discovered information relating to the entity, such as information with a high confidence and/or relevance level. In some embodiments, the update requires confirmation or optimization by the user.
- In some embodiments, a search can be requested based on a complex subject. The term “complex subject” as used herein refers to a collection of different types of entities, such as a case report, a transaction record, or security log. Taking a transaction record as an example, the transaction record may include identifying information of multiple persons (e.g., name and SSN), multiple accounts (e.g., account type and number), and locations of transactions (e.g., address, zip code, and branch name). Each of these entities can be subject to a search.
- In one embodiment, when a user enters such a complex subject for the search, the system is configured to identify and extract some or all of the entities included in the complex subject and conduct a search for each of the entities. Upon completion of all the searches, the system can compile the search results and present them to the user optionally in a single feed. In some embodiments, the system can use information from the complex subject and/or the search results to understand the relationship and thereby compile and/or present the search results taking advantage of the knowledge of such relationship.
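- A minimal sketch of handling a complex subject is given below, using a transaction record as the example. The dictionary layout (`persons`, `accounts`, `locations`), the function names, and the pluggable `search` callable are all assumptions; the actual search backend is represented by a stub.

```python
from typing import Callable


def extract_entities(transaction: dict) -> list[dict]:
    """Flatten a transaction record into a list of typed entities."""
    entities = []
    for kind, key in (("person", "persons"),
                      ("account", "accounts"),
                      ("location", "locations")):
        for item in transaction.get(key, []):
            entities.append({"type": kind, **item})
    return entities


def search_complex_subject(transaction: dict,
                           search: Callable[[dict], list]) -> list[dict]:
    """Run one search per extracted entity and merge the results into a single feed."""
    feed = []
    for entity in extract_entities(transaction):
        for record in search(entity):
            # Keep the originating entity so the feed can show the relationship.
            feed.append({"entity": entity, "record": record})
    return feed


if __name__ == "__main__":
    transaction = {
        "persons": [{"name": "Jane Doe", "ssn": "123-45-6789"}],
        "accounts": [{"account_type": "checking", "number": "0001"}],
        "locations": [{"zip_code": "94301", "branch_name": "Main"}],
    }

    def stub_search(entity: dict) -> list:
        # Stand-in for a real data-source query.
        return [f"record for {entity['type']}"]

    for item in search_complex_subject(transaction, stub_search):
        print(item["record"])
```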
- FIG. 2 illustrates a flowchart of an example method 200 for identifying and compiling information relating to an entity for investigative analysis, according to various embodiments of the present disclosure. The method 200 may be implemented in various environments including, for example, the system of FIG. 3. The operations of method 200 presented below are intended to be illustrative. Depending on the implementation, the example method 200 may include additional, fewer, or alternative steps performed in various orders or in parallel. The example method 200 may be implemented in various computing systems or devices including one or more processors.
- At block 201, a computer system receives name, identification or another basic characteristic of an entity as keywords for a search for information relating to the entity. At block 203, the system generates one or more search queries optionally with variations of the keywords, and then at block 205, the system conducts searches in one or more data sources with the search queries.
- Some of the search results would have near-perfect match to the basic information of the entity. Such matches are identified and compiled to form a compiled record representing the entity (block 207). At block 209, additional information from the search results is selected to be used for a second round of searches, followed by the actual searches (block 211). With the second round of searches, all the search results can be presented to a user for further investigation and analysis. Optionally, the search results are ranked to facilitate the user analysis (block 213). Upon receiving user input, the system can update the search results with respect to potential connection to the entity (block 215).
- The techniques described herein are implemented by one or more special-purpose computing devices. The special-purpose computing devices may be hard-wired to perform the techniques, or may include circuitry or digital electronic devices such as one or more application-specific integrated circuits (ASICs) or field programmable gate arrays (FPGAs) that are persistently programmed to perform the techniques, or may include one or more hardware processors programmed to perform the techniques pursuant to program instructions in firmware, memory, other storage, or a combination. Such special-purpose computing devices may also combine custom hard-wired logic, ASICs, or FPGAs with custom programming to accomplish the techniques. The special-purpose computing devices may be desktop computer systems, server computer systems, portable computer systems, handheld devices, networking devices or any other device or combination of devices that incorporate hard-wired and/or program logic to implement the techniques.
- Computing device(s) are generally controlled and coordinated by operating system software, such as iOS, Android, Chrome OS, Windows XP, Windows Vista, Windows 7, Windows 8, Windows Server, Windows CE, Unix, Linux, SunOS, Solaris, Blackberry OS, VxWorks, or other compatible operating systems. In other embodiments, the computing device may be controlled by a proprietary operating system. Conventional operating systems control and schedule computer processes for execution, perform memory management, provide file system, networking, I/O services, and provide a user interface functionality, such as a graphical user interface (“GUI”), among other things.
- FIG. 3 is a block diagram that illustrates a computer system 300 upon which any of the embodiments described herein may be implemented. The computer system 300 includes a bus 302 or other communication mechanism for communicating information, one or more hardware processors 304 coupled with bus 302 for processing information. Hardware processor(s) 304 may be, for example, one or more general purpose microprocessors.
- The computer system 300 also includes a main memory 306, such as a random access memory (RAM), cache and/or other dynamic storage devices, coupled to bus 302 for storing information and instructions to be executed by processor 304. Main memory 306 also may be used for storing temporary variables or other intermediate information during execution of instructions to be executed by processor 304. Such instructions, when stored in storage media accessible to processor 304, render computer system 300 into a special-purpose machine that is customized to perform the operations specified in the instructions.
- The computer system 300 further includes a read only memory (ROM) 308 or other static storage device coupled to bus 302 for storing static information and instructions for processor 304. A storage device 310, such as a magnetic disk, optical disk, or USB thumb drive (Flash drive), etc., is provided and coupled to bus 302 for storing information and instructions.
- The computer system 300 may be coupled via bus 302 to a display 312, such as a cathode ray tube (CRT) or LCD display (or touch screen), for displaying information to a computer user. An input device 314, including alphanumeric and other keys, is coupled to bus 302 for communicating information and command selections to processor 304. Another type of user input device is cursor control 316, such as a mouse, a trackball, or cursor direction keys for communicating direction information and command selections to processor 304 and for controlling cursor movement on display 312. This input device typically has two degrees of freedom in two axes, a first axis (e.g., x) and a second axis (e.g., y), that allows the device to specify positions in a plane. In some embodiments, the same direction information and command selections as cursor control may be implemented via receiving touches on a touch screen without a cursor.
- The computing system 300 may include a user interface module to implement a GUI that may be stored in a mass storage device as executable software codes that are executed by the computing device(s). This and other modules may include, by way of example, components, such as software components, object-oriented software components, class components and task components, processes, functions, attributes, procedures, subroutines, segments of program code, drivers, firmware, microcode, circuitry, data, databases, data structures, tables, arrays, and variables.
- In general, the word “module,” as used herein, refers to logic embodied in hardware or firmware, or to a collection of software instructions, possibly having entry and exit points, written in a programming language, such as, for example, Java, C or C++. A software module may be compiled and linked into an executable program, installed in a dynamic link library, or may be written in an interpreted programming language such as, for example, BASIC, Perl, or Python. It will be appreciated that software modules may be callable from other modules or from themselves, and/or may be invoked in response to detected events or interrupts. Software modules configured for execution on computing devices may be provided on a computer readable medium, such as a compact disc, digital video disc, flash drive, magnetic disc, or any other tangible medium, or as a digital download (and may be originally stored in a compressed or installable format that requires installation, decompression or decryption prior to execution). Such software code may be stored, partially or fully, on a memory device of the executing computing device, for execution by the computing device. Software instructions may be embedded in firmware, such as an EPROM. It will be further appreciated that hardware modules may be comprised of connected logic units, such as gates and flip-flops, and/or may be comprised of programmable units, such as programmable gate arrays or processors. The modules or computing device functionality described herein are preferably implemented as software modules, but may be represented in hardware or firmware. Generally, the modules described herein refer to logical modules that may be combined with other modules or divided into sub-modules despite their physical organization or storage.
- The computer system 300 may implement the techniques described herein using customized hard-wired logic, one or more ASICs or FPGAs, firmware and/or program logic which in combination with the computer system causes or programs computer system 300 to be a special-purpose machine. According to one embodiment, the techniques herein are performed by computer system 300 in response to processor(s) 304 executing one or more sequences of one or more instructions contained in main memory 306. Such instructions may be read into main memory 306 from another storage medium, such as storage device 310. Execution of the sequences of instructions contained in main memory 306 causes processor(s) 304 to perform the process steps described herein. In alternative embodiments, hard-wired circuitry may be used in place of or in combination with software instructions.
- The term “non-transitory media,” and similar terms, as used herein refers to any media that store data and/or instructions that cause a machine to operate in a specific fashion. Such non-transitory media may comprise non-volatile media and/or volatile media. Non-volatile media includes, for example, optical or magnetic disks, such as storage device 310. Volatile media includes dynamic memory, such as main memory 306. Common forms of non-transitory media include, for example, a floppy disk, a flexible disk, hard disk, solid state drive, magnetic tape, or any other magnetic data storage medium, a CD-ROM, any other optical data storage medium, any physical medium with patterns of holes, a RAM, a PROM, an EPROM, a FLASH-EPROM, NVRAM, any other memory chip or cartridge, and networked versions of the same.
- Non-transitory media is distinct from but may be used in conjunction with transmission media. Transmission media participates in transferring information between non-transitory media. For example, transmission media includes coaxial cables, copper wire and fiber optics, including the wires that comprise bus 302. Transmission media can also take the form of acoustic or light waves, such as those generated during radio-wave and infra-red data communications.
- Various forms of media may be involved in carrying one or more sequences of one or more instructions to
processor 304 for execution. For example, the instructions may initially be carried on a magnetic disk or solid state drive of a remote computer. The remote computer can load the instructions into its dynamic memory and send the instructions over a telephone line using a modem. A modem local tocomputer system 300 can receive the data on the telephone line and use an infra-red transmitter to convert the data to an infra-red signal. An infra-red detector can receive the data carried in the infra-red signal and appropriate circuitry can place the data on bus 302. Bus 302 carries the data tomain memory 306, from whichprocessor 304 retrieves and executes the instructions. The instructions received bymain memory 306 may retrieve and execute the instructions. The instructions received bymain memory 306 may optionally be stored onstorage device 310 either before or after execution byprocessor 304. - The
computer system 300 also includes acommunication interface 318 coupled to bus 302.Communication interface 318 provides a two-way data communication coupling to one or more network links that are connected to one or more local networks. For example,communication interface 318 may be an integrated services digital network (ISDN) card, cable modem, satellite modem, or a modem to provide a data communication connection to a corresponding type of telephone line. As another example,communication interface 318 may be a local area network (LAN) card to provide a data communication connection to a compatible LAN (or WAN component to communicated with a WAN). Wireless links may also be implemented. In any such implementation,communication interface 318 sends and receives electrical, electromagnetic or optical signals that carry digital data streams representing various types of information. - A network link typically provides data communication through one or more networks to other data devices. For example, a network link may provide a connection through local network to a host computer or to data equipment operated by an Internet Service Provider (ISP). The ISP in turn provides data communication services through the world wide packet data communication network now commonly referred to as the “Internet”. Local network and Internet both use electrical, electromagnetic or optical signals that carry digital data streams. The signals through the various networks and the signals on network link and through
communication interface 318, which carry the digital data to and fromcomputer system 300, are example forms of transmission media. - The
computer system 300 can send messages and receive data, including program code, through the network(s), network link andcommunication interface 318. In the Internet example, a server might transmit a requested code for an application program through the Internet, the ISP, the local network and thecommunication interface 318. - The received code may be executed by
processor 304 as it is received, and/or stored instorage device 310, or other non-volatile storage for later execution. - Each of the processes, methods, and algorithms described in the preceding sections may be embodied in, and fully or partially automated by, code modules executed by one or more computer systems or computer processors comprising computer hardware. The processes and algorithms may be implemented partially or wholly in application-specific circuitry.
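The statement above that received code may be executed as it is received and/or stored for later execution can be illustrated with a minimal Python sketch. This is not taken from the disclosure: the function names (`run_now`, `store_for_later`, `run_stored`), the file name `received_module.py`, and the sample payload `RECEIVED` are hypothetical, and the sketch assumes the code comes from a trusted source.

```python
import tempfile
from pathlib import Path


def run_now(source: str) -> dict:
    """Execute received source text immediately and return its namespace."""
    namespace: dict = {}
    # compile() turns the text into a code object; exec() runs it in `namespace`.
    exec(compile(source, "<received>", "exec"), namespace)
    return namespace


def store_for_later(source: str, directory: Path) -> Path:
    """Persist received source text to non-volatile storage (a file here)."""
    path = directory / "received_module.py"  # hypothetical file name
    path.write_text(source, encoding="utf-8")
    return path


def run_stored(path: Path) -> dict:
    """Load previously stored source text and execute it."""
    return run_now(path.read_text(encoding="utf-8"))


# Hypothetical payload standing in for code received over a network.
RECEIVED = "result = sum(range(5))\n"

if __name__ == "__main__":
    print(run_now(RECEIVED)["result"])
    with tempfile.TemporaryDirectory() as tmp:
        stored = store_for_later(RECEIVED, Path(tmp))
        print(run_stored(stored)["result"])
```

Both paths run the same code; the only difference is whether it passes through storage before execution.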
- The various features and processes described above may be used independently of one another, or may be combined in various ways. All possible combinations and sub-combinations are intended to fall within the scope of this disclosure. In addition, certain method or process blocks may be omitted in some implementations. The methods and processes described herein are also not limited to any particular sequence, and the blocks or states relating thereto can be performed in other sequences that are appropriate. For example, described blocks or states may be performed in an order other than that specifically disclosed, or multiple blocks or states may be combined in a single block or state. The example blocks or states may be performed in serial, in parallel, or in some other manner. Blocks or states may be added to or removed from the disclosed example embodiments. The example systems and components described herein may be configured differently than described. For example, elements may be added to, removed from, or rearranged compared to the disclosed example embodiments.
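As a rough illustration of blocks being performed in serial or in parallel, the following Python sketch runs the same two independent blocks both ways and obtains the same results. The block functions (`normalize_name`, `count_fields`) and the sample record are hypothetical and are not part of the disclosed embodiments; the sketch assumes the blocks do not depend on each other's output.

```python
from concurrent.futures import ThreadPoolExecutor


def normalize_name(record: dict) -> str:
    """Hypothetical block: clean up the record's name field."""
    return record["name"].strip().lower()


def count_fields(record: dict) -> int:
    """Hypothetical block: count the fields present in the record."""
    return len(record)


# Independent blocks: neither reads the other's output.
BLOCKS = [normalize_name, count_fields]


def run_serial(record: dict) -> list:
    """Perform the blocks one after another."""
    return [block(record) for block in BLOCKS]


def run_parallel(record: dict) -> list:
    """Perform the blocks concurrently; map() keeps results in block order."""
    with ThreadPoolExecutor() as pool:
        return list(pool.map(lambda block: block(record), BLOCKS))


record = {"name": "  Example Entity ", "source": "registry"}

if __name__ == "__main__":
    print(run_serial(record))
    print(run_parallel(record))
```

Because the blocks are independent, the order or concurrency of execution does not change the outcome, which is the property the paragraph above relies on.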
- Conditional language, such as, among others, “can,” “could,” “might,” or “may,” unless specifically stated otherwise, or otherwise understood within the context as used, is generally intended to convey that certain embodiments include, while other embodiments do not include, certain features, elements and/or steps. Thus, such conditional language is not generally intended to imply that features, elements and/or steps are in any way required for one or more embodiments or that one or more embodiments necessarily include logic for deciding, with or without user input or prompting, whether these features, elements and/or steps are included or are to be performed in any particular embodiment.
- Any process descriptions, elements, or blocks in the flow diagrams described herein and/or depicted in the attached figures should be understood as potentially representing modules, segments, or portions of code which include one or more executable instructions for implementing specific logical functions or steps in the process. Alternate implementations are included within the scope of the embodiments described herein in which elements or functions may be deleted, executed out of order from that shown or discussed, including substantially concurrently or in reverse order, depending on the functionality involved, as would be understood by those skilled in the art.
- It should be emphasized that many variations and modifications may be made to the above-described embodiments, the elements of which are to be understood as being among other acceptable examples. All such modifications and variations are intended to be included herein within the scope of this disclosure. The foregoing description details certain embodiments of the invention. It will be appreciated, however, that no matter how detailed the foregoing appears in text, the invention can be practiced in many ways. As is also stated above, it should be noted that the use of particular terminology when describing certain features or aspects of the invention should not be taken to imply that the terminology is being re-defined herein to be restricted to including any specific characteristics of the features or aspects of the invention with which that terminology is associated. The scope of the invention should therefore be construed in accordance with the appended claims and any equivalents thereof.
Claims (20)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US16/389,300 US11475031B2 (en) | 2016-12-15 | 2019-04-19 | Identification and compiling of information relating to an entity |
Applications Claiming Priority (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US201662434936P | 2016-12-15 | 2016-12-15 | |
US15/590,956 US10311074B1 (en) | 2016-12-15 | 2017-05-09 | Identification and compiling of information relating to an entity |
US16/389,300 US11475031B2 (en) | 2016-12-15 | 2019-04-19 | Identification and compiling of information relating to an entity |
Related Parent Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US15/590,956 Continuation US10311074B1 (en) | 2016-12-15 | 2017-05-09 | Identification and compiling of information relating to an entity |
Publications (2)
Publication Number | Publication Date |
---|---|
US20190243840A1 (en) | 2019-08-08 |
US11475031B2 (en) | 2022-10-18 |
Family
ID=66673571
Family Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US15/590,956 Active US10311074B1 (en) | 2016-12-15 | 2017-05-09 | Identification and compiling of information relating to an entity |
US16/389,300 Active 2039-02-19 US11475031B2 (en) | 2016-12-15 | 2019-04-19 | Identification and compiling of information relating to an entity |
Family Applications Before (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US15/590,956 Active US10311074B1 (en) | 2016-12-15 | 2017-05-09 | Identification and compiling of information relating to an entity |
Country Status (1)
Country | Link |
---|---|
US (2) | US10311074B1 (en) |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10311074B1 (en) * | 2016-12-15 | 2019-06-04 | Palantir Technologies Inc. | Identification and compiling of information relating to an entity |
US10713329B2 (en) * | 2018-10-30 | 2020-07-14 | Longsand Limited | Deriving links to online resources based on implicit references |
US11657100B2 (en) * | 2020-10-29 | 2023-05-23 | Kyndryl, Inc. | Cognitively rendered event timeline display |
Family Cites Families (45)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5875446A (en) | 1997-02-24 | 1999-02-23 | International Business Machines Corporation | System and method for hierarchically grouping and ranking a set of objects in a query context based on one or more relationships |
US5995973A (en) | 1997-08-29 | 1999-11-30 | International Business Machines Corporation | Storing relationship tables identifying object relationships |
US7062483B2 (en) | 2000-05-18 | 2006-06-13 | Endeca Technologies, Inc. | Hierarchical data-driven search and navigation system and method for information retrieval |
US20100281364A1 (en) | 2005-01-11 | 2010-11-04 | David Sidman | Apparatuses, Methods and Systems For Portable Universal Profile |
US6980984B1 (en) | 2001-05-16 | 2005-12-27 | Kanisa, Inc. | Content provider systems and methods using structured data |
AU2003298616A1 (en) | 2002-11-06 | 2004-06-03 | International Business Machines Corporation | Confidential data sharing and anonymous entity resolution |
US7657540B1 (en) | 2003-02-04 | 2010-02-02 | Seisint, Inc. | Method and system for linking and delinking data records |
US20040243613A1 (en) | 2003-05-30 | 2004-12-02 | Mohammad Pourheidari | System and method for creating a custom view from information in a managed data store |
KR101312190B1 (en) * | 2004-03-15 | 2013-09-27 | 야후! 인크. | Search systems and methods with integration of user annotations |
US7899796B1 (en) | 2004-11-23 | 2011-03-01 | Andrew Borthwick | Batch automated blocking and record matching |
US20070130206A1 (en) | 2005-08-05 | 2007-06-07 | Siemens Corporate Research Inc | System and Method For Integrating Heterogeneous Biomedical Information |
US20080215557A1 (en) | 2005-11-05 | 2008-09-04 | Jorey Ramer | Methods and systems of mobile query classification |
US7672833B2 (en) | 2005-09-22 | 2010-03-02 | Fair Isaac Corporation | Method and apparatus for automatic entity disambiguation |
CN101145152B (en) | 2006-09-14 | 2010-08-11 | 国际商业机器公司 | System and method for automatically refining reality in specific context |
US20080162544A1 (en) | 2006-12-27 | 2008-07-03 | Salesforce.Com, Inc. | Systems and methods for implementing many object to object relationships in a multi-tenant environment |
US7917489B2 (en) | 2007-03-14 | 2011-03-29 | Yahoo! Inc. | Implicit name searching |
US7996210B2 (en) | 2007-04-24 | 2011-08-09 | The Research Foundation Of The State University Of New York | Large-scale sentiment analysis |
US8271477B2 (en) | 2007-07-20 | 2012-09-18 | Informatica Corporation | Methods and systems for accessing data |
US8117208B2 (en) | 2007-09-21 | 2012-02-14 | The Board Of Trustees Of The University Of Illinois | System for entity search and a method for entity scoring in a linked document database |
US8566327B2 (en) | 2007-12-19 | 2013-10-22 | Match.Com, L.L.C. | Matching process system and method |
WO2009089487A1 (en) * | 2008-01-11 | 2009-07-16 | Drubner Jeffrey M | Method and system for uniquely identifying a person to the exclusion of all others |
US8972463B2 (en) * | 2008-07-25 | 2015-03-03 | International Business Machines Corporation | Method and apparatus for functional integration of metadata |
US20100114887A1 (en) | 2008-10-17 | 2010-05-06 | Google Inc. | Textual Disambiguation Using Social Connections |
US9454606B2 (en) * | 2009-09-11 | 2016-09-27 | Lexisnexis Risk & Information Analytics Group Inc. | Technique for providing supplemental internet search criteria |
US8719267B2 (en) | 2010-04-19 | 2014-05-06 | Alcatel Lucent | Spectral neighborhood blocking for entity resolution |
US8417696B2 (en) | 2010-06-10 | 2013-04-09 | Microsoft Corporation | Contact information merger and duplicate resolution |
US9336184B2 (en) | 2010-12-17 | 2016-05-10 | Microsoft Technology Licensing, Llc | Representation of an interactive document as a graph of entities |
US9245049B2 (en) | 2011-02-16 | 2016-01-26 | Empire Technology Development Llc | Performing queries using semantically restricted relations |
US20120246154A1 (en) | 2011-03-23 | 2012-09-27 | International Business Machines Corporation | Aggregating search results based on associating data instances with knowledge base entities |
US9317584B2 (en) | 2011-12-30 | 2016-04-19 | Certona Corporation | Keyword index pruning |
US8972336B2 (en) | 2012-05-03 | 2015-03-03 | Salesforce.Com, Inc. | System and method for mapping source columns to target columns |
EP2662782A1 (en) | 2012-05-10 | 2013-11-13 | Siemens Aktiengesellschaft | Method and system for storing data in a database |
US9183310B2 (en) | 2012-06-12 | 2015-11-10 | Microsoft Technology Licensing, Llc | Disambiguating intents within search engine result pages |
US20140258014A1 (en) | 2013-03-05 | 2014-09-11 | Google Inc. | Entity-based searching with content selection |
US8818892B1 (en) * | 2013-03-15 | 2014-08-26 | Palantir Technologies, Inc. | Prioritizing data clusters with customizable scoring strategies |
IN2013CH01237A (en) | 2013-03-21 | 2015-08-14 | Infosys Ltd | |
US9767127B2 (en) | 2013-05-02 | 2017-09-19 | Outseeker Corp. | Method for record linkage from multiple sources |
US10019519B2 (en) * | 2013-10-30 | 2018-07-10 | Gordon E. Seay | Methods and systems for utilizing global entities in software applications |
US9245057B1 (en) | 2014-10-09 | 2016-01-26 | Splunk Inc. | Presenting a graphical visualization along a time-based graph lane using key performance indicators derived from machine data |
US20160171507A1 (en) | 2014-12-11 | 2016-06-16 | Connectivity, Inc. | Systems and Methods for Identifying Customers of Businesses Through Gathered Named Entity Data |
US9348920B1 (en) | 2014-12-22 | 2016-05-24 | Palantir Technologies Inc. | Concept indexing among database of documents using machine learning techniques |
US9537504B1 (en) | 2015-09-25 | 2017-01-03 | Intel Corporation | Heterogeneous compression architecture for optimized compression ratio |
US10311074B1 (en) * | 2016-12-15 | 2019-06-04 | Palantir Technologies Inc. | Identification and compiling of information relating to an entity |
US10216811B1 (en) | 2017-01-05 | 2019-02-26 | Palantir Technologies Inc. | Collaborating using different object models |
US10235461B2 (en) | 2017-05-02 | 2019-03-19 | Palantir Technologies Inc. | Automated assistance for generating relevant and valuable search results for an entity of interest |
Also Published As
Publication number | Publication date |
---|---|
US10311074B1 (en) | 2019-06-04 |
US11475031B2 (en) | 2022-10-18 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US9449074B1 (en) | | Determining and extracting changed data from a data source |
US11475031B2 (en) | | Identification and compiling of information relating to an entity |
US11714869B2 (en) | | Automated assistance for generating relevant and valuable search results for an entity of interest |
US20210382885A1 (en) | | Collaborating using different object models |
US12099509B2 (en) | | Systems and methods for constraint driven database searching |
US20230020057A1 (en) | | Systems and methods for context-based keyword searching |
US11803532B2 (en) | | Integrated data analysis |
US11860831B2 (en) | | Systems and methods for data entry |
US11301499B2 (en) | | Systems and methods for providing an object platform for datasets |
US11874849B2 (en) | | Systems and methods for creating a data layer based on content from data sources |
US11615071B2 (en) | | Methods and systems for data synchronization |
US11494454B1 (en) | | Systems and methods for searching a schema to identify and visualize corresponding data |
US11829380B2 (en) | | Ontological mapping of data |
Legal Events
Code | Title | Description |
---|---|---|
FEPP | Fee payment procedure | Free format text: ENTITY STATUS SET TO UNDISCOUNTED (ORIGINAL EVENT CODE: BIG.); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY |
AS | Assignment | Owner name: PALANTIR TECHNOLOGIES INC., CALIFORNIA Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:BEARD, MITCHELL;CHANG, ALLEN;HAMMETT, CHRIS;AND OTHERS;SIGNING DATES FROM 20170816 TO 20180720;REEL/FRAME:050764/0523 |
AS | Assignment | Owner name: ROYAL BANK OF CANADA, AS ADMINISTRATIVE AGENT, CANADA Free format text: SECURITY INTEREST;ASSIGNOR:PALANTIR TECHNOLOGIES INC.;REEL/FRAME:051709/0471 Effective date: 20200127 |
AS | Assignment | Owner name: MORGAN STANLEY SENIOR FUNDING, INC., AS ADMINISTRATIVE AGENT, NEW YORK Free format text: SECURITY INTEREST;ASSIGNOR:PALANTIR TECHNOLOGIES INC.;REEL/FRAME:051713/0149 Effective date: 20200127 |
AS | Assignment | Owner name: PALANTIR TECHNOLOGIES INC., CALIFORNIA Free format text: RELEASE BY SECURED PARTY;ASSIGNOR:ROYAL BANK OF CANADA;REEL/FRAME:052856/0382 Effective date: 20200604 |
AS | Assignment | Owner name: MORGAN STANLEY SENIOR FUNDING, INC., NEW YORK Free format text: SECURITY INTEREST;ASSIGNOR:PALANTIR TECHNOLOGIES INC.;REEL/FRAME:052856/0817 Effective date: 20200604 |
STPP | Information on status: patent application and granting procedure in general | Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION |
AS | Assignment | Owner name: PALANTIR TECHNOLOGIES INC., CALIFORNIA Free format text: CORRECTIVE ASSIGNMENT TO CORRECT THE ERRONEOUSLY LISTED PATENT BY REMOVING APPLICATION NO. 16/832267 FROM THE RELEASE OF SECURITY INTEREST PREVIOUSLY RECORDED ON REEL 052856 FRAME 0382. ASSIGNOR(S) HEREBY CONFIRMS THE RELEASE OF SECURITY INTEREST;ASSIGNOR:ROYAL BANK OF CANADA;REEL/FRAME:057335/0753 Effective date: 20200604 |
STPP | Information on status: patent application and granting procedure in general | Free format text: PRE-INTERVIEW COMMUNICATION MAILED |
STPP | Information on status: patent application and granting procedure in general | Free format text: RESPONSE TO NON-FINAL OFFICE ACTION ENTERED AND FORWARDED TO EXAMINER |
STPP | Information on status: patent application and granting procedure in general | Free format text: NON FINAL ACTION MAILED |
STPP | Information on status: patent application and granting procedure in general | Free format text: RESPONSE TO NON-FINAL OFFICE ACTION ENTERED AND FORWARDED TO EXAMINER |
STPP | Information on status: patent application and granting procedure in general | Free format text: FINAL REJECTION MAILED |
STPP | Information on status: patent application and granting procedure in general | Free format text: RESPONSE AFTER FINAL ACTION FORWARDED TO EXAMINER |
STPP | Information on status: patent application and granting procedure in general | Free format text: NOTICE OF ALLOWANCE MAILED -- APPLICATION RECEIVED IN OFFICE OF PUBLICATIONS |
AS | Assignment | Owner name: WELLS FARGO BANK, N.A., NORTH CAROLINA Free format text: ASSIGNMENT OF INTELLECTUAL PROPERTY SECURITY AGREEMENTS;ASSIGNOR:MORGAN STANLEY SENIOR FUNDING, INC.;REEL/FRAME:060572/0640 Effective date: 20220701 |
AS | Assignment | Owner name: WELLS FARGO BANK, N.A., NORTH CAROLINA Free format text: SECURITY INTEREST;ASSIGNOR:PALANTIR TECHNOLOGIES INC.;REEL/FRAME:060572/0506 Effective date: 20220701 |
STPP | Information on status: patent application and granting procedure in general | Free format text: PUBLICATIONS -- ISSUE FEE PAYMENT VERIFIED |
STCF | Information on status: patent grant | Free format text: PATENTED CASE |