US20190243897A1 - Analysis of large bodies of textual data - Google Patents

Analysis of large bodies of textual data

Info

Publication number
US20190243897A1
Authority
US
United States
Prior art keywords
textual
interest
elements
data
corpora
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
US16/354,688
Inventor
Maxim Kesin
Paul Gribelyuk
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Palantir Technologies Inc
Original Assignee
Palantir Technologies Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Palantir Technologies Inc
Priority to US16/354,688
Assigned to Palantir Technologies Inc.: ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: KESIN, MAXIM; GRIBELYUK, PAUL
Publication of US20190243897A1
Assigned to MORGAN STANLEY SENIOR FUNDING, INC., AS ADMINISTRATIVE AGENT: SECURITY INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: Palantir Technologies Inc.
Assigned to ROYAL BANK OF CANADA, AS ADMINISTRATIVE AGENT: SECURITY INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: Palantir Technologies Inc.
Assigned to MORGAN STANLEY SENIOR FUNDING, INC.: SECURITY INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: Palantir Technologies Inc.
Assigned to Palantir Technologies Inc.: RELEASE BY SECURED PARTY (SEE DOCUMENT FOR DETAILS). Assignors: ROYAL BANK OF CANADA
Assigned to Palantir Technologies Inc.: CORRECTIVE ASSIGNMENT TO CORRECT THE ERRONEOUSLY LISTED PATENT BY REMOVING APPLICATION NO. 16/832267 FROM THE RELEASE OF SECURITY INTEREST PREVIOUSLY RECORDED ON REEL 052856 FRAME 0382. ASSIGNOR(S) HEREBY CONFIRMS THE RELEASE OF SECURITY INTEREST. Assignors: ROYAL BANK OF CANADA
Assigned to WELLS FARGO BANK, N.A.: ASSIGNMENT OF INTELLECTUAL PROPERTY SECURITY AGREEMENTS. Assignors: MORGAN STANLEY SENIOR FUNDING, INC.
Assigned to WELLS FARGO BANK, N.A.: SECURITY INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: Palantir Technologies Inc.

Classifications

    • G06F17/2715
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/205Parsing
    • G06F40/216Parsing using statistical methods
    • G06F17/2241
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/011Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
    • G06F3/013Eye tracking input arrangements
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/011Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
    • G06F3/015Input arrangements based on nervous system activity detection, e.g. brain waves [EEG] detection, electromyograms [EMG] detection, electrodermal response detection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/017Gesture based interaction, e.g. based on a set of recognized hand gestures
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/03Arrangements for converting the position or the displacement of a member into a coded form
    • G06F3/033Pointing devices displaced or positioned by the user, e.g. mice, trackballs, pens or joysticks; Accessories therefor
    • G06F3/0346Pointing devices displaced or positioned by the user, e.g. mice, trackballs, pens or joysticks; Accessories therefor with detection of the device orientation or free movement in a 3D space, e.g. 3D mice, 6-DOF [six degrees of freedom] pointers using gyroscopes, accelerometers or tilt-sensors
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/10Text processing
    • G06F40/12Use of codes for handling textual entities
    • G06F40/137Hierarchical processing, e.g. outlines
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/279Recognition of textual entities
    • G06F40/284Lexical analysis, e.g. tokenisation or collocates
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/279Recognition of textual entities
    • G06F40/289Phrasal analysis, e.g. finite state techniques or chunking
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/048Interaction techniques based on graphical user interfaces [GUI]

Definitions

  • The subject matter disclosed herein generally relates to the technical field of special-purpose machines that facilitate analysis of large bodies of textual data, including computerized variants of such special-purpose machines and improvements to such variants, and to the technologies by which such special-purpose machines become improved compared to other special-purpose machines that facilitate analysis of large bodies of textual data.
  • Embodiments of the present disclosure relate generally to searching large sets of data and, more particularly, but not by way of limitation, to a system and method of identifying documents and additional elements of interest based on search terms.
  • Machine learning processes are often useful in making predictions based on data sets. Users may want to explore a large quantity of text or documents as part of a data set. Typically, an individual performs a series of searches, with the help of a search engine or search tool, to target individual specified aspects, things, entities, or people referenced in the documents. The series of searches may provide separate lists of results from which the user manually identifies relevant documents. However, manual review of the results is often time-consuming, and becomes prohibitive where the list of results is large.
  • FIG. 1 is a block diagram illustrating a networked system, according to some example embodiments.
  • FIG. 2 is a block diagram illustrating various modules of a textual data identification system, according to various example embodiments.
  • FIG. 3 is a flowchart illustrating individual operations of a method for processing and identifying elements of interest from content within a set of retrieved text sets, according to various example embodiments.
  • FIG. 4 is a graphical user interface displaying textual identifications and elements of interest in differing portions based on received search terms, according to various example embodiments.
  • FIG. 5 is a flowchart illustrating operations of a method of processing and identifying elements of interest from content within a set of retrieved text sets, according to various example embodiments.
  • FIG. 6 is a flowchart illustrating operations of a method of processing and identifying elements of interest from content within a set of retrieved text sets, according to various example embodiments.
  • FIG. 7 is a flowchart illustrating operations of a method for processing and identifying elements of interest from content within a set of retrieved text sets, according to various example embodiments.
  • FIG. 8 is a block diagram illustrating an example of a software architecture that may be installed on a machine, according to some example embodiments.
  • FIG. 9 illustrates a diagrammatic representation of a machine in the form of a computer system within which a set of instructions may be executed for causing the machine to perform any one or more of the methodologies discussed herein, according to an example embodiment.
  • Example embodiments described herein disclose a textual identification system configured to identify texts within a set of textual data and elements of interest from within the identified texts.
  • the textual identification system provides unique and semantically meaningful elements of interest from within the textual data to expand or focus searches performed on the set of textual data.
  • the identification of elements of interest may eliminate or consolidate deviations within usage, context, and spelling of the elements of interest to improve types, accuracy, and semantically related content of the elements of interest with respect to an initial set of search terms.
  • the textual identification system may initially present a graphical user interface at a client device.
  • Given search terms (e.g., selections from predetermined terms or freely entered terms), the textual identification system identifies texts (e.g., text documents, video documents, audio documents, publications, or multimedia documents) from textual data accessible by the textual identification system.
  • the textual identification system identifies and presents elements of interest (e.g., additional terms) associated with, or included in, the identified texts.
  • the textual identification system parses the texts within the set of textual data to identify terms contained within the texts, the context in which the terms are used, deviations among usage and form of the terms, and meaningful semantic relationships among two or more terms within the texts. Based on the context, deviations, and meaningful semantic relationships of terms within the identified texts, the textual identification system generates a list of elements of interest and presents the elements of interest along with identifications of the identified texts.
  • the textual identification system provides technical improvements to previous search suggestion systems by identifying multiple disparate contextual uses and semantically meaningful combinations of terms within identified texts and with respect to the search terms used to identify the texts.
  • Use of the indices, matrices, and data structures described herein may also increase the speed and precision with which additional terms are identified.
  • the textual identification system may better identify additional terms by removing extraneous terms, merging deviant uses of terms, and merging or separating terms based on contextual or semantically meaningful usage, thereby improving on previous search suggestion systems.
  • A networked system 102, in the example form of a network-based recommendation system, provides server-side functionality via a network 104 (e.g., the Internet or a wide area network (WAN)) to one or more client devices 110.
  • FIG. 1 illustrates, for example, a web client 112 (e.g., a browser, such as the Internet Explorer® browser developed by Microsoft® Corporation of Redmond, Washington), a client application 114, and a programmatic client 116 executing on the client device 110.
  • the client device 110 may comprise, but is not limited to, a mobile phone, desktop computer, laptop, portable digital assistant (PDA), smart phone, tablet, ultrabook, netbook, multi-processor system, microprocessor-based or programmable consumer electronics, game console, set-top box, or any other communication device that a user may utilize to access the networked system 102.
  • the client device 110 may comprise a display module (not shown) to display information (e.g., in the form of user interfaces).
  • the client device 110 may comprise one or more of touch screens, accelerometers, gyroscopes, cameras, microphones, global positioning system (GPS) devices, and so forth.
  • the client device 110 may be a device of a user that is used to perform a transaction involving digital items within the networked system 102 .
  • One or more users 106 may be a person, a machine, or other means of interacting with client device 110 .
  • the user 106 is not part of the network architecture 100 , but may interact with the network architecture 100 via client device 110 or another means.
  • one or more portions of network 104 may be an ad hoc network, an intranet, an extranet, a virtual private network (VPN), a local area network (LAN), a wireless LAN (WLAN), a wide area network (WAN), a wireless WAN (WWAN), a metropolitan area network (MAN), a portion of the Internet, a portion of the Public Switched Telephone Network (PSTN), a cellular telephone network, a wireless network, a WiFi network, a WiMax network, another type of network, or a combination of two or more such networks.
  • Each client device 110 may include one or more applications (also referred to as “apps”) such as, but not limited to, a web browser, messaging application, electronic mail (email) application, and the like.
  • the user provides input (e.g., touch screen input or alphanumeric input) to the client device 110 and the input is communicated to the networked system 102 via the network 104 .
  • the networked system 102 in response to receiving the input from the user, communicates information to the client device 110 via the network 104 to be presented to the user. In this way, the user can interact with the networked system 102 using the client device 110 .
  • An application program interface (API) server 120 and a web server 122 are coupled to, and provide programmatic and web interfaces respectively to, one or more application servers 140 .
  • the application servers 140 may host one or more publication systems comprising a textual identification system 150 , which may comprise one or more modules or applications and which may be embodied as hardware, software, firmware, or any combination thereof.
  • the application servers 140 are, in turn, shown to be coupled to one or more database servers 124 that facilitate access to one or more information storage repositories or database(s) 126 .
  • the databases 126 are storage devices that store information to be posted (e.g., publications or listings) to the networked system 102 .
  • the databases 126 may also store digital item information in accordance with example embodiments.
  • a third-party application 132 executing on third-party server(s) 130 , is shown as having programmatic access to the networked system 102 via the programmatic interface provided by the API server 120 .
  • the third-party application 132 utilizing information retrieved from the networked system 102 , supports one or more features or functions on a website hosted by the third party.
  • the third-party website for example, provides one or more functions that are supported by the relevant systems or servers of the networked system 102 .
  • the textual identification system 150 provides functionality operable to identify and retrieve documents or data and elements of interest in response to receiving search terms. For example, the textual identification system 150 may access sets of data (e.g., document corpora) stored in a structured format from the databases 126 , the third-party servers 130 , the client device 110 , and other sources. In some example embodiments, the textual identification system 150 analyzes the set of data in order to determine portions of the data associated with the search terms and additional terms (e.g., elements of interest).
  • While the network architecture 100 shown in FIG. 1 employs a client-server architecture, the present inventive subject matter is not limited to such an architecture, and could equally well find application in a distributed or peer-to-peer architecture system, for example.
  • the textual identification system 150 could also be implemented as standalone software programs, which do not necessarily have networking capabilities.
  • a third-party application 132 executing on a third-party server(s) 130 , is shown as having programmatic access to the networked system 102 via the programmatic interface provided by the API server 120 .
  • the third-party application 132, utilizing information retrieved from the networked system 102, may support one or more features or functions on a website hosted by the third party.
  • the third-party website may, for example, provide one or more promotional, marketplace, or payment functions that are supported by the relevant applications of the networked system 102 .
  • FIG. 2 is a block diagram illustrating components of the textual identification system 150 .
  • Components of the textual identification system 150 configure the textual identification system 150 to access sets of textual data to identify texts or text sets within the textual data, identify data elements within the texts, and identify elements of interest based on data structures generated from the set of textual data.
  • the components configure the textual identification system 150 to generate initial data structures and modify the data structures to process the data within the data structure; increase accuracy and efficacy of elements of interest identified from the texts; and increase the speed with which a machine forming all or part of the textual identification system 150 identifies and presents the elements of interest at the machine.
  • the textual identification system 150 comprises an access component 210, a document component 220, a database component 230, an element component 240, a presentation component 250, a context component 260, and a normalization component 270. Any one or more of these components may be implemented using one or more processors and hence may include one or more processors (e.g., by configuring such one or more processors to perform functions described for that component).
  • the access component 210 accesses or otherwise receives selections of at least one document corpus from a set of document corpora.
  • the access component 210 may access the set of textual corpora by accessing a set of metadata identifying the set of textual corpora.
  • the access component 210 accesses the set of textual corpora by accessing one or more databases directly or via a network connection.
  • the access component 210 may also access or otherwise receive one or more search terms within a graphical user interface.
  • the access component 210 retrieves a data structure including textual identifications for a set of textual data and an indication of one or more data elements within one or more texts included in the set of textual data.
  • the document component 220 identifies a set of textual data based on one or more search terms.
  • the document component 220 may use search algorithms to identify the set of textual data based on an index of keywords associated with the content and metadata of the document.
  • the document component 220 dynamically partitions the set of textual data to identify a text, a textual component, a text or text set within a textual corpus, or a textual corpus containing the set of documents associated with the one or more search terms.
  • the document component 220 partitions a set of textual corpora or merges two or more textual corpora based on search terms received from the access component 210 .
  • the database component 230 generates data structures and modified data structures.
  • the database component 230 generates data structures including textual identifications for texts within the set of textual data.
  • the data structures may also include indications of one or more data elements within the texts.
  • the data elements may be words, titles, names, addresses, numbers, or any other suitable information contained within a text.
  • the database component 230 may generate modified data structures from the data structures generated to represent texts within the textual data.
  • the database component 230 generates modified data structures by assigning index numbers to each element, term, combinations of elements, or combination of terms within a data structure. A full index (e.g., the data structure) may be reduced to include texts within the set of textual data identified based on the search terms.
  • the database component 230 may sum rows within the modified data structure.
  • the database component 230 also processes counts for the terms using one or more processes to transform the modified data structure and remove or discount popular or common entries adding little value to analysis based on high frequency of occurrence.
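The data-structure steps described for the database component 230 (assigning index numbers to terms, reducing the full index to the texts matched by a search, summing rows, and discounting common entries) can be sketched roughly as follows. This is a minimal illustration, not the disclosed implementation: the function names, the whitespace tokenization, the choice of one row per term, and the logarithmic (IDF-style) discount are all assumptions, since the text does not name a specific transformation or matrix orientation.

```python
import math

def build_index(texts):
    """Assign an index number to each distinct term and count it per text.

    Returns (term_ids, matrix) where matrix maps term index -> {text id: count}.
    One row per term is an assumption; the source does not fix the orientation.
    """
    term_ids = {}
    matrix = {}
    for text_id, body in texts.items():
        for token in body.lower().split():  # naive whitespace tokenization (assumption)
            idx = term_ids.setdefault(token, len(term_ids))
            row = matrix.setdefault(idx, {})
            row[text_id] = row.get(text_id, 0) + 1
    return term_ids, matrix

def reduce_index(matrix, keep):
    """Reduce the full index to the texts identified by the search terms."""
    keep = set(keep)
    reduced = {}
    for idx, row in matrix.items():
        kept = {t: c for t, c in row.items() if t in keep}
        if kept:
            reduced[idx] = kept
    return reduced

def discounted_row_sums(matrix, reduced, n_texts):
    """Sum each term's row, discounting terms that appear in many texts.

    The log(n_texts / document frequency) factor is an IDF-style stand-in
    for the unspecified transformation that discounts common entries.
    """
    sums = {}
    for idx, row in reduced.items():
        doc_freq = len(matrix[idx])
        sums[idx] = sum(row.values()) * math.log(n_texts / doc_freq)
    return sums

texts = {
    "d1": "the bridge project budget",
    "d2": "the bridge inspection",
    "d3": "the weather report",
}
term_ids, matrix = build_index(texts)
reduced = reduce_index(matrix, ["d1", "d2"])
sums = discounted_row_sums(matrix, reduced, len(texts))
```

In this toy corpus the term "the" occurs in every text, so its discounted sum falls to zero, while "project" (present in one matched text only) keeps a positive score.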
  • the element component 240 identifies elements of interest within modified data structures generated by the database component 230 .
  • the elements of interest are identified, at least in part, based on the summed rows of the modified data structures.
  • the element component 240 may map textual identifications of sets of textual data to rows in transformed or modified data structures generated by the database component 230 .
  • the element component 240 selects elements of interest by summing values from transformed matrices based on comparison of values associated with the elements to an interest threshold.
  • the element component 240 may also identify element types for each element of interest.
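Selecting elements of interest against an interest threshold, as described for the element component 240, might look like the sketch below. The function names, the score dictionary, and the crude type heuristic are hypothetical; the source does not say how element types are determined, so `element_type` is only a placeholder for a real entity-typing step.

```python
def select_elements_of_interest(scores, threshold):
    """Return (term, score) pairs whose score meets the interest threshold.

    Results are ordered by descending score, then alphabetically.
    """
    picked = [(term, s) for term, s in scores.items() if s >= threshold]
    return sorted(picked, key=lambda pair: (-pair[1], pair[0]))

def element_type(term):
    """Guess a coarse element type (placeholder heuristic, not from the source)."""
    if term.isdigit():
        return "number"
    if term.istitle():
        return "name"
    return "word"

scores = {"bridge": 0.8, "the": 0.0, "Acme": 1.1, "2019": 0.6}
selected = select_elements_of_interest(scores, threshold=0.5)
typed = [(term, element_type(term)) for term, _ in selected]
```

A fixed threshold is the simplest reading of "interest threshold"; a deployed system could just as well derive the cutoff from the score distribution.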
  • the presentation component 250 causes presentations of graphical user interfaces, visual indicators, portions of texts, and other elements described herein.
  • the presentation component 250 causes presentation of a graphical user interface including selectable interface elements configured to receive search terms or provide search terms for selection and subsequent query of the set of textual data.
  • the presentation component 250 may cause presentation of elements of interest within the graphical user interface as well as portions of texts accessed or retrieved from the set of textual data based on the search terms provided to the access component 210 .
  • the presentation component 250 causes presentation of unique and tailored graphical user interfaces based on a combination of the texts, the search terms, and the elements of interest.
  • the tailored graphical user interfaces may be presented differently to different users based on the information retrieved by the textual identification system 150 , the user performing the search, element relationships or collocations, combinations thereof, and other suitable information.
  • portions of the graphical user interface are dynamically generated, such that a portion of the graphical user interface may only appear when information relevant to the portion is retrieved, identified, or generated by the textual identification system 150 .
  • the graphical user interface may automatically resize, reorient, repartition, or otherwise adjust one or more initially presented portions of the graphical user interface to accommodate addition of a new portion based on the inclusion of additional information from the textual identification system 150 .
  • the context component 260 determines context occurrences for elements of interest within texts of the set of textual data. In some instances, the context component 260 tokenizes the context to provide an index number for terms included in a text proximate to another term for which context is being determined. The context component 260 may associate index numbers for terms surrounding a specified term and may link instances of a term surrounding a specified term that have a lexical similarity.
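As a rough illustration of the context handling described for the context component 260, the sketch below collects the tokens surrounding a specified term, assigns index numbers to those context terms, and links context terms that are lexically similar. The window width, the function names, and the use of Python's `difflib.SequenceMatcher` ratio as the similarity measure are assumptions; the source does not state how proximity or lexical similarity is computed.

```python
import difflib

def context_windows(tokens, target, width=2):
    """Collect the tokens within `width` positions of each occurrence of target."""
    windows = []
    for i, tok in enumerate(tokens):
        if tok == target:
            left = tokens[max(0, i - width):i]
            right = tokens[i + 1:i + 1 + width]
            windows.append(left + right)
    return windows

def index_context(windows):
    """Assign an index number to each distinct context term."""
    ids = {}
    indexed = [[ids.setdefault(t, len(ids)) for t in w] for w in windows]
    return ids, indexed

def link_similar(terms, cutoff=0.8):
    """Pair terms whose spelling similarity ratio is at least `cutoff`."""
    terms = sorted(terms)
    links = []
    for i, a in enumerate(terms):
        for b in terms[i + 1:]:
            if difflib.SequenceMatcher(None, a, b).ratio() >= cutoff:
                links.append((a, b))
    return links

tokens = "the old bridge was closed and the new bridges were opened".split()
windows = context_windows(tokens, "bridge")
ids, indexed = index_context(windows)
links = link_similar(["bridge", "bridges", "closed"])
```

The pairwise comparison in `link_similar` is quadratic in the number of terms, which is acceptable for a small context vocabulary but would need blocking or hashing at corpus scale.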
  • the normalization component 270 normalizes elements of interest by removing redundant elements of interest based on the context occurrence of two or more elements of interest.
  • the normalization component 270 may generate a normalized set of elements of interest by identifying deviations among the instances. In some embodiments, normalization of the elements of interest occurs without removing or merging instances of the terms within the data structures described herein.
  • the normalization component 270 may pass the normalized set of elements of interest to the presentation component 250 , such that the presentation component 250 presents the elements of interest without duplication of elements of interest having deviating instances.
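The normalization behaviour described for the normalization component 270 (presenting one element for several deviating instances, without altering the underlying data structures) can be approximated as below. This sketch groups variants only by a case- and punctuation-insensitive spelling key, which is an assumption; the source also refers to context occurrence, which is not modeled here. The function name and the rule for choosing the displayed form are hypothetical.

```python
def normalize_elements(elements):
    """Merge deviating surface forms into one displayed element.

    `elements` maps surface form -> occurrence count and is left unmodified.
    Forms that match after lowercasing and dropping non-alphanumeric
    characters are treated as deviations of the same element (assumption).
    """
    groups = {}
    for form, count in elements.items():
        key = "".join(ch for ch in form.lower() if ch.isalnum())
        groups.setdefault(key, []).append((form, count))
    normalized = {}
    for variants in groups.values():
        # Display the most frequent variant; ties broken by spelling.
        canonical = max(variants, key=lambda v: (v[1], v[0]))[0]
        normalized[canonical] = sum(c for _, c in variants)
    return normalized

elements = {"Wi-Fi": 5, "wifi": 2, "WiFi": 1, "bridge": 3}
normalized = normalize_elements(elements)
```

Because a new dictionary is returned, the original counts per surface form remain available, in line with the statement that normalization may occur without removing or merging instances in the data structures.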
  • any one or more of the components described may be implemented using hardware alone (e.g., one or more of the processors of a machine) or a combination of hardware and software.
  • any component described in the textual identification system 150 may physically include an arrangement of one or more processors (e.g., a subset of or among the one or more processors of the machine) configured to perform the operations described herein for that component.
  • any component of the textual identification system 150 may include software, hardware, or both, that configure an arrangement of one or more processors (e.g., among the one or more processors of the machine) to perform the operations described herein for that component.
  • different components of the textual identification system 150 may include and configure different arrangements of such processors or a single arrangement of such processors at different points in time. Moreover, any two or more components of the textual identification system 150 may be logically or physically combined into a single component, and the functions described herein for a single component may be subdivided among multiple components. Furthermore, according to various example embodiments, components described herein as being implemented within a single machine, database, or device may be distributed across multiple machines, databases, or devices.
  • FIG. 3 is a flowchart illustrating operations of the textual identification system 150 in performing a method 300 of processing and identifying elements of interest from content within a set of retrieved documents, according to some example embodiments. Operations of the method 300 may be performed by the modules described above with respect to FIG. 2 .
  • the access component 210 receives a selection of a textual corpus from a set of textual corpora.
  • Each textual corpus of the set of textual corpora contains one or more texts.
  • the texts or text sets may include documents of varying types.
  • the document types may include text documents, video documents, audio documents, multimedia documents, and other suitable documents.
  • As used herein, “textual” is used interchangeably with a broad range of document types and publications (e.g., documents published or otherwise accessible directly or by a network connection).
  • the set of texts (e.g., set of documents) may be identified from the selected textual corpus (e.g., document corpus).
  • the presentation component 250 is activated by a selection of a graphical interface element to initiate presentation of a graphical user interface, as shown in FIG. 4 .
  • the graphical user interface includes one or more graphical interface elements representing available selections within the graphical user interface.
  • the selections made available by the graphical interface elements include search term input, document corpus selection, search result selection, document selection, element of interest selection, and search type selection or entry.
  • the access component 210 may receive the selection of the document corpus from an input device receiving a selection from a set of document corpora represented as discrete selectable graphical interface elements.
  • the set of document corpora may be presented on a display device of a client device accessing one or more document servers.
  • the access component 210 receives or otherwise accesses one or more search terms displayed within the graphical user interface.
  • the access component 210 receives the search terms from an input device of a client device on which the graphical user interface is presented.
  • the access component 210 receives the one or more search terms as one or more differing types of user input through the input device.
  • the one or more search terms may be received within a text input field (e.g., a text box presented in the graphical user interface), as a selection from a set of radio buttons, as a selection from a drop down menu, as a selection from a scroll menu, or any other suitable input type.
  • the document component 220 identifies a set of textual data (e.g., a set of documents or set of publication data) based on the one or more search terms.
  • the set of documents is identified based on a presence of the one or more search terms within a document or within metadata associated with the document.
  • a content of the document and metadata associated with the document may be parsed and indexed to identify keywords. Keywords may include words, named individuals, named entities (e.g., a city name, a project name, an organization name), titles, authors, fields (e.g., From, To, Carbon Copy, and Blind Carbon Copy fields), dates, and other suitable terms.
  • the keywords and the metadata may be extracted from the documents and accompanying data using information extraction and machine learning algorithms.
  • the document component 220 uses one or more search engine algorithms to identify the set of documents based on the index of the keywords associated with the content and metadata of the document.
  • the access component 210 retrieves a data structure including textual identifications (e.g., document identifications) for the set of textual data and an indication of the one or more data elements within the documents (e.g., texts within the textual data).
  • the data structure may be the index of keywords in the content and metadata of the documents identified based on the search terms.
  • the index may include semantically meaningful collocations as well as the keywords from the content and the metadata.
  • the index is generated as a table having counts of the terms and semantically meaningful collocations within the document content and metadata. In some embodiments, the counts are a number of instances that a given term or semantically meaningful collocation occurs within the document content or the metadata for the document.
  • the indexes for the documents within the document corpus include collocations of semantically meaningful n-grams.
  • Semantically meaningful collocations may include frequently occurring compositions of words having semantic meaning. For example, “strong” and “coffee” may occur together more often than a predetermined instance threshold and, when occurring together within a predefined distance, contain a semantic meaning, “strong coffee,” which may not occur in collocations of synonyms of the two terms.
  • the semantically meaningful n-grams or collocations may be determined heuristically.
  • the semantically meaningful n-grams or collocations may also be identified using semantic analysis, stochastic semantic analysis, natural language processing, natural language understanding, or any other suitable algorithmic identification of the meaningful semantic relation between collocated terms.
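  • A minimal sketch of the heuristic case, assuming whitespace tokenization and ordered word pairs. The function name `find_collocations` and the parameters `max_distance` (the predefined distance) and `instance_threshold` (the predetermined instance threshold) are hypothetical and chosen for illustration only; the semantic check described above is not modeled here.

```python
from collections import Counter

def find_collocations(docs, max_distance=2, instance_threshold=1):
    """Return ordered word pairs that co-occur within max_distance tokens
    more often than instance_threshold times across all docs."""
    pair_counts = Counter()
    for text in docs:
        tokens = text.lower().split()
        for i, left in enumerate(tokens):
            # Pair the current token with the next max_distance tokens.
            for right in tokens[i + 1 : i + 1 + max_distance]:
                pair_counts[(left, right)] += 1
    return {pair: n for pair, n in pair_counts.items() if n > instance_threshold}

docs = [
    "she ordered strong coffee",
    "strong coffee keeps me awake",
    "a powerful engine",
]
collocations = find_collocations(docs)
```

  • In this toy input only the pair ("strong", "coffee") exceeds the threshold, so it is the only collocation retained.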
  • the database component 230 processes the data structure to generate a modified data structure.
  • the database component 230 assigns an index number to each term or semantically meaningful n-gram.
  • the index number may be obtained by sorting the textual representations of the terms and semantically meaningful n-grams and associating each with a position in a sort order. For example, “Aardvark” may receive an index number of zero and “Zena” may receive an index number of one thousand.
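  • The sort-order assignment can be sketched as follows. The helper name `assign_index_numbers` is hypothetical, and a case-insensitive sort is assumed because the description does not specify a collation.

```python
def assign_index_numbers(terms):
    """Map each distinct term or n-gram to its position in sorted order."""
    ordered = sorted(set(terms), key=str.lower)
    return {term: position for position, term in enumerate(ordered)}

index_numbers = assign_index_numbers(["Zena", "strong coffee", "Aardvark"])
```

  • With this three-entry vocabulary, “Aardvark” receives index number zero and “Zena” receives the last position; in a larger vocabulary the positions grow accordingly.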
  • the modified data structure may be generated by reducing the full index to documents included in the set of documents identified based on the one or more search terms.
  • the terms, semantically meaningful n-grams, entity names, and the like are provided values within the modified data structure to construct a count matrix.
  • the count matrix may include documents (e.g., documents identified within a specified document corpus or set of document corpora) as rows and elements of interest (e.g., terms and semantically meaningful n-grams) as columns.
  • the documents in the rows may be represented by a document identification (e.g., a numerical value, an alphanumeric combination, or a set of characters).
  • the terms and semantically meaningful collocations may be represented within a cell of the columns by the term or terms and an indication of a term type.
  • the term type may indicate a category for the term.
  • the intersections between the rows and columns may include a value for a number of occurrences of the specified element of interest within the specified document.
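  • A small sketch of such a count matrix, assuming single-word elements of interest and whitespace tokenization. The function name `build_count_matrix` and the sample document identifications are illustrative assumptions, not part of the described system.

```python
from collections import Counter

def build_count_matrix(documents, elements):
    """documents: dict of document identification -> text.
    elements: list of single-word elements of interest (the columns).
    Returns the row labels and a list-of-lists matrix of occurrence counts."""
    doc_ids = sorted(documents)
    matrix = []
    for doc_id in doc_ids:
        counts = Counter(documents[doc_id].lower().split())
        # One row per document, one cell per element of interest.
        matrix.append([counts[element] for element in elements])
    return doc_ids, matrix

documents = {"doc-1": "cheese and wine and cheese", "doc-2": "wine cellar"}
doc_ids, count_matrix = build_count_matrix(documents, ["cheese", "wine"])
```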
  • the database component 230 sums rows within the modified data structure.
  • the rows include values for data elements included in each of the identified set of documents.
  • the counts (e.g., values at the intersections of specified rows and columns) are processed using a Term Frequency-Inverse Document Frequency (TF-IDF) transformation.
  • the TF-IDF transformation may discount popular items as less interesting.
  • the TF-IDF transformation is a two-step process. First, the database component 230 sums the number of documents in which each item occurs. Second, the database component 230 divides the entries in each of the table rows of the modified data structure by the sum. The database component 230 thereby decreases weights of less informative but more popular or frequent terms.
  • the TF-IDF transformation may generate a transformed data structure. In some embodiments, the transformed data structure is used as the basis for identifying potentially interesting elements or terms.
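  • The two-step process described above can be sketched as follows. This follows the description literally (each count is divided by the number of documents containing the item) rather than the logarithmic inverse document frequency used in many textbook definitions; the function name `tfidf_transform` is a hypothetical label.

```python
def tfidf_transform(count_matrix):
    """Rows are documents, columns are items (terms or collocations)."""
    n_cols = len(count_matrix[0]) if count_matrix else 0
    # Step 1: number of documents in which each item occurs.
    doc_freq = [sum(1 for row in count_matrix if row[j] > 0) for j in range(n_cols)]
    # Step 2: divide every entry by the document count for its column.
    return [
        [row[j] / doc_freq[j] if doc_freq[j] else 0.0 for j in range(n_cols)]
        for row in count_matrix
    ]

counts = [[2, 1], [0, 1], [0, 3]]
transformed = tfidf_transform(counts)
```

  • Here the second item appears in all three documents, so its weights are divided by three, while the first item appears in only one document and keeps its full count.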
  • the element component 240 identifies one or more elements of interest based on the summed rows of the modified data structure.
  • the element component 240 maps document identifications of the set of documents identified in operation 330 to rows of the transformed data structure. Using the mapping, the element component 240 creates a smaller matrix (e.g., an element matrix) composed of the document rows returned as query results.
  • the element component 240 selects terms of interest by summing the values from the transformed matrix and identifying the terms having a summed value above an interest threshold.
  • the interest threshold is predetermined. In some instances, the interest threshold is dynamic.
  • the dynamic interest threshold may be set as a function of the summed values for the terms.
  • the dynamic interest threshold may be set, at the time of summing the values for the terms, to select terms and to return a set number of terms (e.g., elements of interest) for each document of the set of documents identified in operation 330 .
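  • One way to sketch the selection step, assuming the transformed matrix is a list of rows and the query results are given as row positions. The name `select_elements` and the parameters `threshold` and `top_k` are assumptions; `top_k` stands in for a dynamic threshold that returns a set number of terms.

```python
def select_elements(transformed, elements, result_rows, threshold=0.0, top_k=None):
    """Sum transformed values over the rows returned by the query and keep
    the elements whose summed value passes the interest threshold."""
    element_matrix = [transformed[r] for r in result_rows]  # smaller matrix
    sums = [sum(column) for column in zip(*element_matrix)]
    scored = sorted(zip(elements, sums), key=lambda pair: pair[1], reverse=True)
    if top_k is not None:
        # Dynamic threshold: keep a set number of highest-scoring elements.
        return [element for element, _ in scored[:top_k]]
    return [element for element, score in scored if score > threshold]

transformed = [[2.0, 0.5, 0.0], [0.0, 0.5, 1.0], [1.0, 0.5, 0.0]]
elements = ["cheese", "wine", "cellar"]
static_pick = select_elements(transformed, elements, [0, 2], threshold=0.9)
dynamic_pick = select_elements(transformed, elements, [0, 2], top_k=1)
```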
  • the presentation component 250 causes presentation of the elements of interest in a first portion of the graphical user interface and the textual identifications for the set of documents in a second portion of the graphical user interface.
  • the presentation component 250 causes presentation of the elements of interest and the document identifications in the graphical user interface depicted in FIG. 4 .
  • a first portion 410 depicted in FIG. 4 , displays the elements of interest (e.g., the terms from the transformed matrix).
  • a second portion 420 displays the textual identifications.
  • the document identifications include one or more of the values from the transformed matrix, a title of the document, an identifying subset of content of the document, a selectable representation (e.g., a graphical or textual representation) of the document, or any other suitable identifying information for the documents of the set of documents.
  • the elements of interest in the first portion 410 and the document identifications in the second portion 420 are presented distinctly from one another and without an indication of a relationship between the items included in the first portion 410 and those included in the second portion 420 .
  • the presentation component 250 generates and presents the elements of interest and the document identifications to indicate a relationship between specified elements of interest and specified document identifications.
  • the document identifications may be spaced a distance apart enabling the elements of interest found within each identified document to be presented adjacent to the document identification of the document in which the elements of interest are found.
  • FIG. 5 is a flowchart illustrating operations of the textual identification system 150 in performing a method 500 of processing and identifying elements of interest from content within a set of retrieved documents, according to some example embodiments.
  • Operations of the method 500 may be performed by the modules described above with respect to FIG. 2 .
  • one or more operations of the method 500 are performed as part or sub-operations of one or more operations of the method 300 .
  • the method 500 may include one or more operations of the method 300 .
  • the context component 260 determines a context occurrence for each element of interest within the set of documents.
  • the context occurrence represents a number of times a term occurs in related contexts within a document.
  • a context around each term may be tokenized.
  • the context component 260 in tokenizing the context, may identify an index number for terms included in the document proximate to the term for which context is being determined.
  • the context component 260 may then associate, in a matrix, one or more index numbers for the terms surrounding the specified term for which the context is being identified.
  • the context component 260 associates the index numbers of surrounding terms for each instance of a term for which the context is being identified.
  • for example, where the term “cheese” occurs three times in a document, the context component 260 may identify three sets of terms, one set surrounding each instance of the term “cheese.”
  • the context component 260 may identify the index number for each of the terms within the three sets of terms and associate the index numbers with the instance of the term that they surround.
  • the context component 260 determines the context of an instance of the term by comparing the associated index numbers.
  • the context component 260 may link two or more instances of the term for which the surrounding terms are determined to have a lexical similarity.
  • the lexical similarity of surrounding terms may be identified based on an overlap of terms identified within the surrounding terms. Overlap of terms may be identified where the same term occurs in two or more of the sets of surrounding terms. Lexical similarity may also be identified where terms in sets of surrounding terms are synonyms, have similar definitions, or are otherwise semantically related. In some instances, the lexical similarity may be determined based on Jaccard coefficients for the sets of surrounding terms, where a Jaccard coefficient is the size of the set intersection divided by the size of the set union.
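  • A minimal sketch of the Jaccard comparison between contexts, using token strings in place of index numbers for readability. The helper names `context_windows` and `jaccard` and the window `radius` are illustrative assumptions.

```python
def context_windows(tokens, term, radius=2):
    """Collect the set of terms surrounding each instance of `term`."""
    windows = []
    for i, token in enumerate(tokens):
        if token == term:
            before = tokens[max(0, i - radius) : i]
            after = tokens[i + 1 : i + 1 + radius]
            windows.append(set(before + after))
    return windows

def jaccard(a, b):
    """Size of the set intersection divided by the size of the set union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

tokens = "aged cheese from france and aged cheese from italy".split()
windows = context_windows(tokens, "cheese")
similarity = jaccard(windows[0], windows[1])
```

  • The two contexts share “aged” and “from” out of five distinct surrounding terms, giving a coefficient of 0.4; a linking rule could compare this value to a chosen cutoff.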
  • the normalization component 270 normalizes the elements of interest by removing redundant elements of interest based on the context occurrence of two or more elements of interest.
  • the normalization component 270 generates a normalized set of elements of interest.
  • the normalization component 270 may normalize instances of an element of interest within a document by identifying one or more deviations among the instances. Deviations may include misspellings, different case usage, partial omissions (e.g., omitting a term forming a linked set of terms such as a full name), or other suitable deviations.
  • normalizing the elements of interest removes redundant instances of the same element of interest within a list presented at a client device.
  • the normalization component 270 normalizes the elements of interest for presentation without removing or merging instances of the terms within one or more of the matrices or indices described above. By maintaining separate instances of the element of interest, the normalization component 270 prevents the database component 230 from erroneously reducing a term's likelihood of being deemed important based on overrepresentation due to merged instances.
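  • A rough sketch of presentation-only normalization covering case differences, partial omissions, and simple misspellings. The function name `normalize_for_display`, the `similarity` cutoff, and the use of Python's `difflib.SequenceMatcher` are assumptions for illustration; the input list is left unmodified, mirroring the point that the underlying matrices and indices are not merged.

```python
from difflib import SequenceMatcher

def normalize_for_display(elements, similarity=0.85):
    """Return a de-duplicated list for display; `elements` is not changed."""
    canonical = []
    # Visit longer forms first so full names absorb their partial forms.
    for element in sorted(elements, key=len, reverse=True):
        folded = element.casefold()
        duplicate = False
        for kept in canonical:
            kept_folded = kept.casefold()
            # Case differences and near-identical spellings.
            close = SequenceMatcher(None, folded, kept_folded).ratio() >= similarity
            # Partial omission: every word already appears in a kept element.
            partial = set(folded.split()) <= set(kept_folded.split())
            if close or partial:
                duplicate = True
                break
        if not duplicate:
            canonical.append(element)
    return canonical

raw = ["Jane Smith", "jane smith", "Smith", "Jane Smtih", "Paris"]
shown = normalize_for_display(raw)
```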
  • the presentation component 250 causes presentation of the normalized set of elements of interest in the first portion of the graphical user interface.
  • the presentation component 250 presents the normalized set of elements of interest similarly to or the same as described above with respect to operation 380 .
  • the normalized set of elements may be presented in the first portion of the graphical user interface.
  • the elements of the normalized set of elements are presented in an order according to their association with the documents identified based on the search terms.
  • the normalized set of elements may be presented as an ordered list independent of a relationship to the identified documents presented in the second portion 420 .
  • the element component 240 identifies an element type for each of the elements of interest. In some embodiments, the element component 240 identifies the element type for the elements of interest by determining the elements of interest identified from the set of documents retrieved based on the search terms. The element component 240 may then parse one or more of the matrices or indices described above to identify the element type for each element of interest.
  • the presentation component 250 causes presentation of a visual indicator differentiating the elements of interest based on the element types.
  • the visual indicator may be a graphical indicator or a textual indicator.
  • the visual indicator is coded to indicate the element type without including all of the characters or words for the element type.
  • the presentation component 250 may identify an element type as a city name and abbreviate or otherwise code the element type as “CN.”
  • the presentation component 250 may code the element type in any suitable manner. Further, in some embodiments, the presentation component may generate and cause presentation of key mapping codes and full names for element types.
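  • As one possible coding scheme, an element type may be abbreviated to the initials of its words, with a key mapping each code back to the full name. The function names `code_element_type` and `build_key` are hypothetical, and this scheme does not resolve collisions between types that share initials.

```python
def code_element_type(element_type):
    """Abbreviate an element type to the upper-case initials of its words."""
    return "".join(word[0].upper() for word in element_type.split())

def build_key(element_types):
    """Map each code to the full element type name for display as a key."""
    return {code_element_type(t): t for t in element_types}

key = build_key(["city name", "organization name"])
```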
  • the presentation component 250 causes presentation of at least a portion of a document of the set of documents in a third portion of the graphical user interface.
  • the third portion 430 may be positioned proximate to the second portion 420 .
  • the portion of the document presented in the third portion 430 may include text from a text, text set, publication, or document selected or otherwise specified in the second portion. For example, where a user selects a graphical interface element representing a document identification in the second portion 420 of the graphical user interface, the presentation component 250 generates and causes presentation of the portion of the selected document in the third portion 430 .
  • the third portion 430 of the graphical user interface may include selectable interface elements configured to display or play the video or audio document within the third portion 430 of the graphical user interface.
  • the third portion 430 of the graphical user interface includes an instance of an application configured to display or play the audio or video document.
  • the third portion 430 may also include textual information representing, or included within, the video, audio, or multimedia document.
  • FIG. 6 is a flowchart illustrating operations of the textual identification system 150 in performing a method 600 of processing and identifying elements of interest from content within a set of retrieved documents, according to some example embodiments. Operations of the method 600 may be performed by the modules described above with respect to FIG. 2 . The method 600 may include or be performed as part or sub-operations of one or more operations of the methods 300 or 500 .
  • the context component 260 generates a set of tokens for each element of interest.
  • the set of tokens may represent the context occurrence of a specified element.
  • operation 610 is performed in response to determining the context occurrence for each element of interest, as described above with respect to operation 510 of the method 500 .
  • the context component 260 may tokenize each element of interest using the index numbers described above or may generate a separate set of context tokens.
  • the tokens may be a numerical value or any other suitable value to identify the term and associate the term with the term for which the context is being identified.
  • the context component 260 identifies an overlap of two or more elements of interest based on the set of tokens for the two or more elements of interest.
  • the overlap may be determined based on semantic relatedness.
  • the overlap may be determined based on occurrence of a term within two or more sets of tokens for the two or more elements.
  • the semantic relatedness or lexical similarity may be determined based on Jaccard coefficients determined for the set of tokens.
  • the context component 260 links the two or more elements of interest.
  • the two or more elements of interest may be linked in one or more of the matrices or indices described above.
  • the two or more elements are linked by generating a context matrix for each document within the set of documents identified in relation to the one or more search terms described above with respect to the method 300 .
  • the context matrix may include the terms within a document in both rows and columns. A bit or value at an intersection of two terms may indicate a contextual link between the two terms.
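  • A minimal sketch of a per-document context matrix, assuming the linked pairs have already been determined (for example, by a token overlap test). The function name `build_context_matrix` and the sample terms are illustrative assumptions.

```python
def build_context_matrix(terms, linked_pairs):
    """Terms label both rows and columns; a 1 marks a contextual link."""
    position = {term: i for i, term in enumerate(terms)}
    matrix = [[0] * len(terms) for _ in terms]
    for a, b in linked_pairs:
        i, j = position[a], position[b]
        # Links are treated as symmetric in this sketch.
        matrix[i][j] = 1
        matrix[j][i] = 1
    return matrix

terms = ["cheese", "cheddar", "engine"]
context_matrix = build_context_matrix(terms, [("cheese", "cheddar")])
```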
  • FIG. 7 is a flowchart illustrating operations of the textual identification system 150 in performing a method 700 of processing and identifying elements of interest from content within a set of retrieved documents, according to some example embodiments. Operations of the method 700 may be performed by the modules described above with respect to FIG. 2 . The method 700 may include or be performed as part or sub-operations of one or more operations of the methods 300 , 500 , or 600 .
  • the access component 210 accesses a set of document corpora.
  • the set of document corpora includes the selected document corpus of operation 310 .
  • the set of document corpora is accessed in response to receiving the one or more search terms in operation 320 .
  • operation 710 may occur after operations 310 - 380 .
  • the set of document corpora is accessed without prior selection of a specified document corpus.
  • the access component 210 may access the set of document corpora by accessing one or more databases directly or via a network connection.
  • the access component 210 accesses the set of document corpora by accessing a set of metadata identifying the set of document corpora.
  • the document component 220 dynamically partitions the set of document corpora to identify a document corpus containing the set of documents associated with the one or more search terms. In some instances, the document component 220 identifies the document corpus by identifying the search terms among keywords associated with each document corpus of the set of document corpora. In some embodiments, each document corpus may be associated with a distinct database or data source. For example, each distinct database or data source may be associated with or part of a distinct client device. In identifying the document corpus, the document component 220 may select a client device from which the document component may select documents in response to receiving the one or more search terms.
  • the document component 220 dynamically partitions the set of document corpora regardless of distribution of the document corpora among multiple client devices. In these instances, the document component 220 identifies the one or more search terms. The document component 220 may compare the one or more search terms with an index or matrix identifying terms associated with individual documents within each document corpus of the set of document corpora. The index or matrix may also identify the document corpus with which each of the documents is associated. The document component 220 may identify one or more document corpora from the index or matrix. The document component 220 may then perform a comparative analysis of the one or more document corpora to identify a single document corpus to search using the one or more search terms. In some instances, the comparative analysis identifies the document corpus having the highest number of occurrences of the search terms, and selects that document corpus.
  • the document component 220 combines two or more document corpora to generate a dynamic document corpus. In these embodiments, where several document corpora include a suitable number of instances of occurrences of the search terms, the document component 220 selects the two or more document corpora and searches each of the document corpora for documents including the one or more search terms.
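  • The comparative analysis and the combined (dynamic) corpus can be sketched together, assuming a per-corpus table of term occurrence counts is available. The function name `select_corpora`, the parameter `min_occurrences` (standing in for a “suitable number” of occurrences), and the corpus names are hypothetical.

```python
def select_corpora(term_counts_by_corpus, search_terms, min_occurrences=None):
    """Pick the corpus with the most search-term occurrences, or, when
    min_occurrences is given, every corpus meeting that count."""
    totals = {
        corpus: sum(counts.get(term, 0) for term in search_terms)
        for corpus, counts in term_counts_by_corpus.items()
    }
    if not totals:
        return []
    if min_occurrences is None:
        # Single corpus with the highest number of occurrences.
        return [max(totals, key=totals.get)]
    # Dynamic corpus: combine all corpora with enough occurrences.
    return sorted(c for c, n in totals.items() if n >= min_occurrences)

index = {
    "emails": {"cheese": 5, "wine": 1},
    "reports": {"cheese": 2},
    "memos": {"wine": 4},
}
single = select_corpora(index, ["cheese", "wine"])
combined = select_corpora(index, ["cheese", "wine"], min_occurrences=4)
```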
  • Modules may constitute either software modules (e.g., code embodied on a machine-readable medium) or hardware modules.
  • a “hardware module” is a tangible unit capable of performing certain operations and may be configured or arranged in a certain physical manner.
  • one or more computer systems (e.g., a standalone computer system, a client computer system, or a server computer system) or one or more hardware modules of a computer system (e.g., a processor or a group of processors) may be configured by software (e.g., an application or application portion) as a hardware module that operates to perform certain operations as described herein.
  • a hardware module may be implemented mechanically, electronically, or any suitable combination thereof.
  • a hardware module may include dedicated circuitry or logic that is permanently configured to perform certain operations.
  • a hardware module may be a special-purpose processor, such as a Field-Programmable Gate Array (FPGA) or an Application Specific Integrated Circuit (ASIC).
  • a hardware module may also include programmable logic or circuitry that is temporarily configured by software to perform certain operations.
  • a hardware module may include software executed by a general-purpose processor or other programmable processor. Once configured by such software, hardware modules become specific machines (or specific components of a machine) uniquely tailored to perform the configured functions and are no longer general-purpose processors. It will be appreciated that the decision to implement a hardware module mechanically, in dedicated and permanently configured circuitry, or in temporarily configured circuitry (e.g., configured by software) may be driven by cost and time considerations.
  • the phrase “hardware module” should be understood to encompass a tangible entity, be that an entity that is physically constructed, permanently configured (e.g., hardwired), or temporarily configured (e.g., programmed) to operate in a certain manner or to perform certain operations described herein.
  • “hardware-implemented module” refers to a hardware module. Considering embodiments in which hardware modules are temporarily configured (e.g., programmed), each of the hardware modules need not be configured or instantiated at any one instance in time. For example, where a hardware module comprises a general-purpose processor configured by software to become a special-purpose processor, the general-purpose processor may be configured as respectively different special-purpose processors (e.g., comprising different hardware modules) at different times. Software accordingly configures a particular processor or processors, for example, to constitute a particular hardware module at one instance of time and to constitute a different hardware module at a different instance of time.
  • Hardware modules can provide information to, and receive information from, other hardware modules. Accordingly, the described hardware modules may be regarded as being communicatively coupled. Where multiple hardware modules exist contemporaneously, communications may be achieved through signal transmission (e.g., over appropriate circuits and buses) between or among two or more of the hardware modules. In embodiments in which multiple hardware modules are configured or instantiated at different times, communications between such hardware modules may be achieved, for example, through the storage and retrieval of information in memory structures to which the multiple hardware modules have access. For example, one hardware module may perform an operation and store the output of that operation in a memory device to which it is communicatively coupled. A further hardware module may then, at a later time, access the memory device to retrieve and process the stored output. Hardware modules may also initiate communications with input or output devices, and can operate on a resource (e.g., a collection of information).
  • processors may be temporarily configured (e.g., by software) or permanently configured to perform the relevant operations. Whether temporarily or permanently configured, such processors may constitute processor-implemented modules that operate to perform one or more operations or functions described herein.
  • “processor-implemented module” refers to a hardware module implemented using one or more processors.
  • the methods described herein may be at least partially processor-implemented, with a particular processor or processors being an example of hardware.
  • the operations of a method may be performed by one or more processors or processor-implemented modules.
  • the one or more processors may also operate to support performance of the relevant operations in a “cloud computing” environment or as a “software as a service” (SaaS).
  • at least some of the operations may be performed by a group of computers (as examples of machines including processors), with these operations being accessible via a network (e.g., the Internet) and via one or more appropriate interfaces (e.g., an Application Program Interface (API)).
  • the performance of certain of the operations may be distributed among the processors, not only residing within a single machine, but deployed across a number of machines.
  • the processors or processor-implemented modules may be located in a single geographic location (e.g., within a home environment, an office environment, or a server farm). In other example embodiments, the processors or processor-implemented modules may be distributed across a number of geographic locations.
  • The components, methods, applications, and so forth described in conjunction with FIGS. 1-7 are implemented in some embodiments in the context of a machine and an associated software architecture.
  • the sections below describe representative software architecture(s) and machine (e.g., hardware) architecture that are suitable for use with the disclosed embodiments.
  • Software architectures are used in conjunction with hardware architectures to create devices and machines tailored to particular purposes. For example, a particular hardware architecture coupled with a particular software architecture will create a mobile device, such as a mobile phone, tablet device, or so forth. A slightly different hardware and software architecture may yield a smart device for use in the “internet of things,” while yet another combination produces a server computer for use within a cloud computing architecture. Not all combinations of such software and hardware architectures are presented here as those of skill in the art can readily understand how to implement the subject matter in different contexts from the disclosure contained herein.
  • FIG. 8 is a block diagram 800 illustrating a representative software architecture 802 , which may be used in conjunction with various hardware architectures herein described.
  • FIG. 8 is merely a non-limiting example of a software architecture and it will be appreciated that many other architectures may be implemented to facilitate the functionality described herein.
  • the software architecture 802 may be executing on hardware such as a machine 900 of FIG. 9 that includes, among other things, processors 910 , memory/storage 930 , and I/O components 950 .
  • a representative hardware layer 804 is illustrated and can represent, for example, the machine 900 of FIG. 9 .
  • the representative hardware layer 804 comprises one or more processing units 806 having associated executable instructions 808 .
  • the executable instructions 808 represent the executable instructions of the software architecture 802 , including implementation of the methods, components and so forth of FIG. 2 .
  • Hardware layer 804 also includes memory and/or storage modules 810 , which also have executable instructions 808 .
  • Hardware layer 804 may also comprise other hardware as indicated by 812 which represents any other hardware of the hardware layer 804 , such as the other hardware illustrated as part of machine 900 .
  • the software 802 may be conceptualized as a stack of layers where each layer provides particular functionality.
  • the software 802 may include layers such as an operating system 814 , libraries 816 , frameworks/middleware 818 , applications 820 and presentation layer 844 .
  • the applications 820 and/or other components within the layers may invoke application programming interface (API) calls 824 through the software stack and receive a response, returned values, and so forth illustrated as messages 826 in response to the API calls 824 .
  • the layers illustrated are representative in nature and not all software architectures have all layers. For example, some mobile or special purpose operating systems may not provide a frameworks/middleware layer 818 , while others may provide such a layer. Other software architectures may include additional or different layers.
  • the operating system 814 may manage hardware resources and provide common services.
  • the operating system 814 may include, for example, a kernel 828 , services 830 , and drivers 832 .
  • the kernel 828 may act as an abstraction layer between the hardware and the other software layers.
  • the kernel 828 may be responsible for memory management, processor management (e.g., scheduling), component management, networking, security settings, and so on.
  • the services 830 may provide other common services for the other software layers.
  • the drivers 832 may be responsible for controlling or interfacing with the underlying hardware.
  • the drivers 832 may include display drivers, camera drivers, Bluetooth® drivers, flash memory drivers, serial communication drivers (e.g., Universal Serial Bus (USB) drivers), WiFi® drivers, audio drivers, power management drivers, and so forth depending on the hardware configuration.
  • the libraries 816 may provide a common infrastructure that may be utilized by the applications 820 and/or other components and/or layers.
  • the libraries 816 typically provide functionality that allows other software modules to perform tasks in an easier fashion than to interface directly with the underlying operating system 814 functionality (e.g., kernel 828 , services 830 , and/or drivers 832 ).
  • the libraries 816 may include system libraries 834 (e.g., C standard library) that may provide functions such as memory allocation functions, string manipulation functions, mathematic functions, and the like.
  • libraries 816 may include API libraries 836 such as media libraries (e.g., libraries to support presentation and manipulation of various media formats such as MPEG4, H.264, MP3, AAC, AMR, JPG, PNG), graphics libraries (e.g., an OpenGL framework that may be used to render 2D and 3D information in a graphic content on a display), database libraries (e.g., SQLite that may provide various relational database functions), web libraries (e.g., WebKit that may provide web browsing functionality), and the like.
  • the libraries 816 may also include a wide variety of other libraries 838 to provide many other APIs to the applications 820 and other software components/modules.
  • the frameworks 818 may provide a higher-level common infrastructure that may be utilized by the applications 820 and/or other software components/modules.
  • the frameworks 818 may provide various graphic user interface (GUI) functions, high-level resource management, high-level location services, and so forth.
  • the frameworks 818 may provide a broad spectrum of other APIs that may be utilized by the applications 820 and/or other software components/modules, some of which may be specific to a particular operating system or platform.
  • the applications 820 include built-in applications 840 and/or third-party applications 842.
  • built-in applications 840 may include, but are not limited to, a contacts application, a browser application, a book reader application, a location application, a media application, a messaging application, and/or a game application.
  • the third-party applications 842 may include any of the built-in applications as well as a broad assortment of other applications.
  • the third-party application 842 (e.g., an application developed using the Android™ or iOS™ software development kit (SDK) by an entity other than the vendor of the particular platform) may be mobile software running on a mobile operating system such as iOS™, Android™, Windows® Phone, or other mobile operating systems.
  • the third-party application 842 may invoke the API calls 824 provided by the mobile operating system such as operating system 814 to facilitate functionality described herein.
  • the applications 820 may utilize built in operating system functions (e.g., kernel 828 , services 830 and/or drivers 832 ), libraries (e.g., system libraries 834 , API libraries 836 , and other libraries 838 ), and frameworks/middleware 818 to create user interfaces to interact with users of the system.
  • interactions with a user may occur through a presentation layer, such as the presentation layer 844 .
  • the application/module “logic” can be separated from the aspects of the application/module that interact with a user.
  • a virtual machine creates a software environment where applications/modules can execute as if they were executing on a hardware machine (such as the machine 900 of FIG. 9 , for example).
  • a virtual machine is hosted by a host operating system (e.g., operating system 814 in FIG. 8) and typically, although not always, has a virtual machine monitor 846, which manages the operation of the virtual machine as well as the interface with the host operating system (e.g., operating system 814).
  • a software architecture executes within the virtual machine such as an operating system 850 , libraries 816 , frameworks/middleware 854 , applications 856 and/or presentation layer 858 .
  • These layers of software architecture executing within the virtual machine 848 can be the same as corresponding layers previously described or may be different.
  • FIG. 9 is a block diagram illustrating components of a machine 900 , according to some example embodiments, able to read instructions from a machine-readable medium (e.g., a machine-readable storage medium) and perform any one or more of the methodologies discussed herein.
  • FIG. 9 shows a diagrammatic representation of the machine 900 in the example form of a computer system, within which instructions 916 (e.g., software, a program, an application, an applet, an app, or other executable code) for causing the machine 900 to perform any one or more of the methodologies discussed herein may be executed.
  • the instructions may cause the machine to execute the flow diagrams of FIGS. 3 and 5-7 .
  • the instructions may implement in the components or modules of FIG.
  • the instructions transform the general, non-programmed machine into a particular (e.g., special purpose) machine programmed to carry out the described and illustrated functions in the manner described.
  • the machine 900 operates as a standalone device or may be coupled (e.g., networked) to other machines.
  • the machine 900 may operate in the capacity of a server machine or a client machine in a server-client network environment, or as a peer machine in a peer-to-peer (or distributed) network environment.
  • the machine 900 may comprise, but is not limited to, a server computer, a client computer, a personal computer (PC), a tablet computer, a laptop computer, a netbook, a set-top box (STB), a personal digital assistant (PDA), an entertainment media system, a cellular telephone, a smart phone, a mobile device, a wearable device (e.g., a smart watch), a smart home device (e.g., a smart appliance), other smart devices, a web appliance, a network router, a network switch, a network bridge, or any machine capable of executing the instructions 916 , sequentially or otherwise, that specify actions to be taken by machine 900 .
  • the term “machine” shall also be taken to include a collection of machines 900 that individually or jointly execute the instructions 916 to perform any one or more of the methodologies discussed herein.
  • the machine 900 may include processors 910 , memory 930 , and I/O components 950 , which may be configured to communicate with each other such as via a bus 902 .
  • the processors 910 (e.g., a Central Processing Unit (CPU), a Reduced Instruction Set Computing (RISC) processor, a Complex Instruction Set Computing (CISC) processor, a Graphics Processing Unit (GPU), a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Radio-Frequency Integrated Circuit (RFIC), another processor, or any suitable combination thereof) may include, for example, processor 912 and processor 914 that may execute instructions 916.
  • the term “processor” is intended to include a multi-core processor that may comprise two or more independent processors (sometimes referred to as “cores”) that may execute instructions contemporaneously.
  • although FIG. 9 shows multiple processors, the machine 900 may include a single processor with a single core, a single processor with multiple cores (e.g., a multi-core processor), multiple processors with a single core, multiple processors with multiple cores, or any combination thereof.
  • the memory/storage 930 may include a memory 932 , such as a main memory, or other memory storage, and a storage unit 936 , both accessible to the processors 910 such as via the bus 902 .
  • the storage unit 936 and memory 932 store the instructions 916 embodying any one or more of the methodologies or functions described herein.
  • the instructions 916 may also reside, completely or partially, within the memory 932 , within the storage unit 936 , within at least one of the processors 910 (e.g., within the processor's cache memory), or any suitable combination thereof, during execution thereof by the machine 900 .
  • the memory 932 , the storage unit 936 , and the memory of the processors 910 are examples of machine-readable media.
  • machine-readable medium means a device able to store instructions and data temporarily or permanently and may include, but is not limited to, random-access memory (RAM), read-only memory (ROM), buffer memory, flash memory, optical media, magnetic media, cache memory, other types of storage (e.g., Electrically Erasable Programmable Read-Only Memory (EEPROM)), and/or any suitable combination thereof.
  • machine-readable medium shall also be taken to include any medium, or combination of multiple media, that is capable of storing instructions (e.g., instructions 916 ) for execution by a machine (e.g., machine 900 ), such that the instructions, when executed by one or more processors of the machine 900 (e.g., processors 910 ), cause the machine 900 to perform any one or more of the methodologies described herein.
  • a “machine-readable medium” refers to a single storage apparatus or device, as well as “cloud-based” storage systems or storage networks that include multiple storage apparatus or devices.
  • the term “machine-readable medium” excludes signals per se.
  • the I/O components 950 may include a wide variety of components to receive input, provide output, produce output, transmit information, exchange information, capture measurements, and so on.
  • the specific I/O components 950 that are included in a particular machine will depend on the type of machine. For example, portable machines such as mobile phones will likely include a touch input device or other such input mechanisms, while a headless server machine will likely not include such a touch input device. It will be appreciated that the I/O components 950 may include many other components that are not shown in FIG. 9 .
  • the I/O components 950 are grouped according to functionality merely for simplifying the following discussion and the grouping is in no way limiting. In various example embodiments, the I/O components 950 may include output components 952 and input components 954 .
  • the output components 952 may include visual components (e.g., a display such as a plasma display panel (PDP), a light emitting diode (LED) display, a liquid crystal display (LCD), a projector, or a cathode ray tube (CRT)), acoustic components (e.g., speakers), haptic components (e.g., a vibratory motor, resistance mechanisms), other signal generators, and so forth.
  • the input components 954 may include alphanumeric input components (e.g., a keyboard, a touch screen configured to receive alphanumeric input, a photo-optical keyboard, or other alphanumeric input components), point based input components (e.g., a mouse, a touchpad, a trackball, a joystick, a motion sensor, or other pointing instrument), tactile input components (e.g., a physical button, a touch screen that provides location and/or force of touches or touch gestures, or other tactile input components), audio input components (e.g., a microphone), and the like.
  • the I/O components 950 may include biometric components 956 , motion components 958 , environmental components 960 , or position components 962 among a wide array of other components.
  • the biometric components 956 may include components to detect expressions (e.g., hand expressions, facial expressions, vocal expressions, body gestures, or eye tracking), measure biosignals (e.g., blood pressure, heart rate, body temperature, perspiration, or brain waves), identify a person (e.g., voice identification, retinal identification, facial identification, fingerprint identification, or electroencephalogram based identification), and the like.
  • the motion components 958 may include acceleration sensor components (e.g., accelerometer), gravitation sensor components, rotation sensor components (e.g., gyroscope), and so forth.
  • the environmental components 960 may include, for example, illumination sensor components (e.g., photometer), temperature sensor components (e.g., one or more thermometers that detect ambient temperature), humidity sensor components, pressure sensor components (e.g., barometer), acoustic sensor components (e.g., one or more microphones that detect background noise), proximity sensor components (e.g., infrared sensors that detect nearby objects), gas sensors (e.g., gas detection sensors to detect concentrations of hazardous gases for safety or to measure pollutants in the atmosphere), or other components that may provide indications, measurements, or signals corresponding to a surrounding physical environment.
  • the position components 962 may include location sensor components (e.g., a Global Positioning System (GPS) receiver component), altitude sensor components (e.g., altimeters or barometers that detect air pressure from which altitude may be derived), orientation sensor components (e.g., magnetometers), and the like.
  • the I/O components 950 may include communication components 964 operable to couple the machine 900 to a network 980 or devices 970 via coupling 982 and coupling 972 respectively.
  • the communication components 964 may include a network interface component or other suitable device to interface with the network 980 .
  • communication components 964 may include wired communication components, wireless communication components, cellular communication components, Near Field Communication (NFC) components, Bluetooth® components (e.g., Bluetooth® Low Energy), WiFi® components, and other communication components to provide communication via other modalities.
  • the devices 970 may be another machine or any of a wide variety of peripheral devices (e.g., a peripheral device coupled via a Universal Serial Bus (USB)).
  • the communication components 964 may detect identifiers or include components operable to detect identifiers.
  • the communication components 964 may include Radio Frequency Identification (RFID) tag reader components, NFC smart tag detection components, optical reader components (e.g., an optical sensor to detect one-dimensional bar codes such as Universal Product Code (UPC) bar code, multi-dimensional bar codes such as Quick Response (QR) code, Aztec code, Data Matrix, Dataglyph, MaxiCode, PDF417, Ultra Code, UCC RSS-2D bar code, and other optical codes), or acoustic detection components (e.g., microphones to identify tagged audio signals).
  • IP Internet Protocol
  • Wi-Fi® Wireless Fidelity
  • one or more portions of the network 980 may be an ad hoc network, an intranet, an extranet, a virtual private network (VPN), a local area network (LAN), a wireless LAN (WLAN), a wide area network (WAN), a wireless WAN (WWAN), a metropolitan area network (MAN), the Internet, a portion of the Internet, a portion of the Public Switched Telephone Network (PSTN), a plain old telephone service (POTS) network, a cellular telephone network, a wireless network, a Wi-Fi® network, another type of network, or a combination of two or more such networks.
  • the network 980 or a portion of the network 980 may include a wireless or cellular network and the coupling 982 may be a Code Division Multiple Access (CDMA) connection, a Global System for Mobile communications (GSM) connection, or other type of cellular or wireless coupling.
  • the coupling 982 may implement any of a variety of types of data transfer technology, such as Single Carrier Radio Transmission Technology (1xRTT), Evolution-Data Optimized (EVDO) technology, General Packet Radio Service (GPRS) technology, Enhanced Data rates for GSM Evolution (EDGE) technology, Third Generation Partnership Project (3GPP) including 3G, fourth generation wireless (4G) networks, Universal Mobile Telecommunications System (UMTS), High Speed Packet Access (HSPA), Worldwide Interoperability for Microwave Access (WiMAX), Long Term Evolution (LTE) standard, others defined by various standard setting organizations, other long range protocols, or other data transfer technology.
  • the instructions 916 may be transmitted or received over the network 980 using a transmission medium via a network interface device (e.g., a network interface component included in the communication components 964 ) and utilizing any one of a number of well-known transfer protocols (e.g., hypertext transfer protocol (HTTP)). Similarly, the instructions 916 may be transmitted or received using a transmission medium via the coupling 972 (e.g., a peer-to-peer coupling) to the devices 970 .
  • the term “transmission medium” shall be taken to include any intangible medium that is capable of storing, encoding, or carrying the instructions 916 for execution by the machine 900 , and includes digital or analog communications signals or other intangible media to facilitate communication of such software.
  • although the inventive subject matter has been described with reference to specific example embodiments, various modifications and changes may be made to these embodiments without departing from the broader scope of embodiments of the present disclosure.
  • the inventive subject matter may be referred to herein, individually or collectively, by the term “invention” merely for convenience and without intending to voluntarily limit the scope of this application to any single disclosure or inventive concept if more than one is, in fact, disclosed.
  • the term “or” may be construed in either an inclusive or exclusive sense. Moreover, plural instances may be provided for resources, operations, or structures described herein as a single instance. Additionally, boundaries between various resources, operations, modules, engines, and data stores are somewhat arbitrary, and particular operations are illustrated in a context of specific illustrative configurations. Other allocations of functionality are envisioned and may fall within a scope of various embodiments of the present disclosure. In general, structures and functionality presented as separate resources in the example configurations may be implemented as a combined structure or resource. Similarly, structures and functionality presented as a single resource may be implemented as separate resources. These and other variations, modifications, additions, and improvements fall within a scope of embodiments of the present disclosure as represented by the appended claims. The specification and drawings are, accordingly, to be regarded in an illustrative rather than a restrictive sense.

Abstract

In various example embodiments, a textual identification system is configured to receive a set of search terms and identify a set of textual data based on the search terms. The textual identification system retrieves a data structure including textual identifications for the set of textual data and processes the data structure to generate a modified data structure. The textual identification system sums rows within the modified data structure and identifies one or more elements of interest. The textual identification system then causes presentation of the elements of interest in a first portion of a graphical user interface and the textual identifications for the set of textual data in a second portion of the graphical user interface.

Description

    PRIORITY APPLICATION
  • This application claims priority to, and is a continuation of, U.S. patent application Ser. No. 15/678,874, filed Aug. 16, 2017, which claims priority to U.S. Provisional Application Ser. No. 62/424,844, filed Nov. 21, 2016, the disclosures of which are incorporated herein in their entirety by reference.
  • TECHNICAL FIELD
  • The subject matter disclosed herein generally relates to the technical field of special-purpose machines that facilitate analysis of large bodies of textual data, including computerized variants of such special-purpose machines and improvements to such variants, and to the technologies by which such special-purpose machines become improved compared to other special-purpose machines that facilitate analysis of large bodies of textual data. Embodiments of the present disclosure relate generally to searching large sets of data and, more particularly, but not by way of limitation, to a system and method of identifying documents and additional elements of interest based on search terms.
  • BACKGROUND
  • Machine learning processes are often useful in making predictions based on data sets. Users may want to explore a large quantity of text or documents as part of a data set. Typically, an individual performs a series of searches, with the help of a search engine or search tool, to target individual specified aspects, things, entities, or people referenced in the documents. The series of searches may provide separate lists of results from which the user manually identifies relevant documents. However, manual review of results within a list is often time-consuming and prohibitive where the list of results is large.
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • Various ones of the appended drawings merely illustrate example embodiments of the present disclosure and cannot be considered as limiting its scope.
  • FIG. 1 is a block diagram illustrating a networked system, according to some example embodiments.
  • FIG. 2 is a block diagram illustrating various modules of a textual data identification system, according to various example embodiments.
  • FIG. 3 is a flowchart illustrating individual operations of a method for processing and identifying elements of interest from content within a set of retrieved text sets, according to various example embodiments.
  • FIG. 4 is a graphical user interface displaying textual identifications and elements of interest in differing portions based on received search terms, according to various example embodiments.
  • FIG. 5 is a flowchart illustrating operations of a method of processing and identifying elements of interest from content within a set of retrieved text sets, according to various example embodiments.
  • FIG. 6 is a flowchart illustrating operations of a method of processing and identifying elements of interest from content within a set of retrieved text sets, according to various example embodiments.
  • FIG. 7 is a flowchart illustrating operations of a method for processing and identifying elements of interest from content within a set of retrieved text sets, according to various example embodiments.
  • FIG. 8 is a block diagram illustrating an example of a software architecture that may be installed on a machine, according to some example embodiments.
  • FIG. 9 illustrates a diagrammatic representation of a machine in the form of a computer system within which a set of instructions may be executed for causing the machine to perform any one or more of the methodologies discussed herein, according to an example embodiment.
  • The headings provided herein are merely for convenience and do not necessarily affect the scope or meaning of the terms used.
  • DETAILED DESCRIPTION
  • The description that follows includes systems, methods, techniques, instruction sequences, and computing machine program products that embody illustrative embodiments of the disclosure. In the following description, for the purposes of explanation, numerous specific details are set forth in order to provide an understanding of various embodiments of the inventive subject matter. It will be evident, however, to those skilled in the art, that embodiments of the inventive subject matter may be practiced without these specific details. In general, well-known instruction instances, protocols, structures, and techniques are not necessarily shown in detail.
  • Example embodiments described herein disclose a textual identification system configured to identify texts within a set of textual data and elements of interest from within the identified texts. In some instances, the textual identification system provides unique and semantically meaningful elements of interest from within the textual data to expand or focus searches performed on the set of textual data. The identification of elements of interest may eliminate or consolidate deviations within usage, context, and spelling of the elements of interest to improve types, accuracy, and semantically related content of the elements of interest with respect to an initial set of search terms.
  • For example, in some embodiments, the textual identification system may initially present a graphical user interface at a client device. Upon receiving search terms (e.g., selections from predetermined terms or freely entered terms), the textual identification system identifies texts (e.g., text documents, video documents, audio documents, publications, or multimedia documents) from textual data accessible by the textual identification system. Based on the search terms and the identified texts, the textual identification system identifies and presents elements of interest (e.g., additional terms) associated with, or included in, the identified texts. The textual identification system parses the texts within the set of textual data to identify terms contained within the texts, the context in which the terms are used, deviations among usage and form of the terms, and meaningful semantic relationships among two or more terms within the texts. Based on the context, deviations, and meaningful semantic relationships of terms within the identified texts, the textual identification system generates a list of elements of interest and presents the elements of interest along with identifications of the identified texts.
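  • The flow described above (search terms, then identified texts, then elements of interest) can be illustrated with a minimal Python sketch. It is not the disclosed implementation: it assumes single-word search terms, a plain term-by-text count matrix as the data structure, and row sums as the ranking score (following the row-summing step mentioned in the Abstract). The names `tokenize`, `elements_of_interest`, `STOPWORDS`, and the sample corpus are hypothetical.

```python
import re
from collections import Counter

# Hypothetical stop list; the source does not specify one.
STOPWORDS = {"a", "and", "for", "of", "the", "to"}


def tokenize(text):
    """Lowercase a text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def elements_of_interest(corpus, search_terms, top_n=3):
    """Return (ids of matching texts, top co-occurring terms).

    corpus maps a text identification to the text itself.
    Assumes each search term is a single word.
    """
    query = {term.lower() for term in search_terms}
    tokens = {text_id: tokenize(text) for text_id, text in corpus.items()}

    # Step 1: identify the texts that contain every search term.
    matched_ids = sorted(i for i, toks in tokens.items() if query <= set(toks))

    # Step 2: build a term-by-text count matrix restricted to those texts
    # (rows are terms, columns are matched texts).
    counts = [Counter(tokens[i]) for i in matched_ids]
    vocabulary = sorted(set().union(*counts) - query - STOPWORDS)
    matrix = {term: [c[term] for c in counts] for term in vocabulary}

    # Step 3: sum each row and rank terms by total, breaking ties by name.
    totals = {term: sum(row) for term, row in matrix.items()}
    ranked = sorted(totals, key=lambda term: (-totals[term], term))
    return matched_ids, ranked[:top_n]


corpus = {
    "doc1": "Acme shipped turbines to Borealis",
    "doc2": "Borealis paid Acme for turbines",
    "doc3": "Weather report for Tuesday",
}
print(elements_of_interest(corpus, ["Acme"], top_n=2))
# (['doc1', 'doc2'], ['borealis', 'turbines'])
```

  • In this sketch the first returned value plays the role of the textual identifications and the second the elements of interest; a real system would likely weight or filter terms more carefully than raw counts allow.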
  • The textual identification system provides technical improvements to previous search suggestion systems by identifying multiple disparate contextual uses and semantically meaningful combinations of terms within identified texts and with respect to the search terms used to identify the texts. Use of the indices, matrices, and data structures described herein may also increase the speed and precision with which additional terms are identified. Further, the textual identification system may better identify additional terms by removing extraneous terms, merging deviant uses of terms, and merging or separating terms based on contextual or semantically meaningful usage, thereby improving on previous search suggestion systems.
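  • As a rough illustration of merging deviant uses of a term, the following Python sketch folds surface variants (case, punctuation, spacing) onto one canonical key and adds their counts. The normalization rules and the names `canonical_form` and `merge_variants` are assumptions made for this example; the source does not specify how deviations are detected or merged.

```python
import re
from collections import defaultdict


def canonical_form(term):
    """Map a surface form to a hypothetical canonical key.

    Lowercases, drops punctuation, and collapses runs of whitespace.
    """
    key = re.sub(r"[^a-z0-9 ]", "", term.lower())
    return re.sub(r"\s+", " ", key).strip()


def merge_variants(term_counts):
    """Combine the counts of all surface forms sharing a canonical key."""
    merged = defaultdict(int)
    for term, count in term_counts.items():
        merged[canonical_form(term)] += count
    return dict(merged)


observed = {"Acme Corp.": 3, "ACME corp": 2, "acme  corp": 1, "Borealis": 4}
print(merge_variants(observed))
# {'acme corp': 6, 'borealis': 4}
```

  • String normalization of this kind only handles formatting differences; merging or separating terms by context or meaning, as the paragraph above describes, would need additional signals.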
  • Examples merely typify possible variations. Unless explicitly stated otherwise, components and functions are optional and may be combined or subdivided, and operations may vary in sequence or be combined or subdivided. In the following description, for purposes of explanation, numerous specific details are set forth to provide a thorough understanding of example embodiments. It will be evident to one skilled in the art, however, that the present subject matter may be practiced without these specific details.
  • With reference to FIG. 1, an example embodiment of a high-level client-server-based network architecture 100 is shown. A networked system 102, in the example form of a network-based recommendation system, provides server-side functionality via a network 104 (e.g., the Internet or wide area network (WAN)) to one or more client devices 110. FIG. 1 illustrates, for example, a web client 112 (e.g., a browser, such as the Internet Explorer® browser developed by Microsoft® Corporation of Redmond, Washington State), a client application 114, and a programmatic client 116 executing on client device 110.
  • The client device 110 may comprise, but is not limited to, a mobile phone, desktop computer, laptop, portable digital assistant (PDA), smart phone, tablet, ultrabook, netbook, multi-processor system, microprocessor-based or programmable consumer electronics, game console, set-top box, or any other communication device that a user may utilize to access the networked system 102. In some embodiments, the client device 110 may comprise a display module (not shown) to display information (e.g., in the form of user interfaces). In further embodiments, the client device 110 may comprise one or more touch screens, accelerometers, gyroscopes, cameras, microphones, global positioning system (GPS) devices, and so forth. The client device 110 may be a device of a user that is used to perform a transaction involving digital items within the networked system 102. One or more users 106 may be a person, a machine, or other means of interacting with client device 110. In embodiments, the user 106 is not part of the network architecture 100, but may interact with the network architecture 100 via client device 110 or another means. For example, one or more portions of network 104 may be an ad hoc network, an intranet, an extranet, a virtual private network (VPN), a local area network (LAN), a wireless LAN (WLAN), a wide area network (WAN), a wireless WAN (WWAN), a metropolitan area network (MAN), a portion of the Internet, a portion of the Public Switched Telephone Network (PSTN), a cellular telephone network, a wireless network, a WiFi network, a WiMax network, another type of network, or a combination of two or more such networks.
  • Each of the client devices 110 may include one or more applications (also referred to as “apps”) such as, but not limited to, a web browser, messaging application, electronic mail (email) application, and the like.
  • One or more users 106 may be a person, a machine, or other means of interacting with the client device 110. In example embodiments, the user 106 is not part of the network architecture 100, but may interact with the network architecture 100 via the client device 110 or other means. For instance, the user provides input (e.g., touch screen input or alphanumeric input) to the client device 110 and the input is communicated to the networked system 102 via the network 104. In this instance, the networked system 102, in response to receiving the input from the user, communicates information to the client device 110 via the network 104 to be presented to the user. In this way, the user can interact with the networked system 102 using the client device 110.
  • An application program interface (API) server 120 and a web server 122 are coupled to, and provide programmatic and web interfaces respectively to, one or more application servers 140. The application servers 140 may host one or more publication systems comprising a textual identification system 150, which may comprise one or more modules or applications and which may be embodied as hardware, software, firmware, or any combination thereof. The application servers 140 are, in turn, shown to be coupled to one or more database servers 124 that facilitate access to one or more information storage repositories or database(s) 126. In an example embodiment, the databases 126 are storage devices that store information to be posted (e.g., publications or listings) to the networked system 102. The databases 126 may also store digital item information in accordance with example embodiments.
  • Additionally, a third-party application 132, executing on third-party server(s) 130, is shown as having programmatic access to the networked system 102 via the programmatic interface provided by the API server 120. For example, the third-party application 132, utilizing information retrieved from the networked system 102, supports one or more features or functions on a website hosted by the third party. The third-party website, for example, provides one or more functions that are supported by the relevant systems or servers of the networked system 102.
  • The textual identification system 150 provides functionality operable to identify and retrieve documents or data and elements of interest in response to receiving search terms. For example, the textual identification system 150 may access sets of data (e.g., document corpora) stored in a structured format from the databases 126, the third-party servers 130, the client device 110, and other sources. In some example embodiments, the textual identification system 150 analyzes the set of data in order to determine portions of the data associated with the search terms and additional terms (e.g., elements of interest).
  • Further, while the network architecture 100 shown in FIG. 1 employs a client-server architecture, the present inventive subject matter is of course not limited to such an architecture, and could equally well find application in a distributed, or peer-to-peer, architecture system, for example. The textual identification system 150 could also be implemented as standalone software programs, which do not necessarily have networking capabilities.
  • Additionally, a third-party application 132, executing on third-party server(s) 130, is shown as having programmatic access to the networked system 102 via the programmatic interface provided by the API server 120. For example, the third-party application 132, utilizing information retrieved from the networked system 102, may support one or more features or functions on a website hosted by the third party. The third-party website may, for example, provide one or more promotional, marketplace, or payment functions that are supported by the relevant applications of the networked system 102.
  • FIG. 2 is a block diagram illustrating components of the textual identification system 150. Components of the textual identification system 150 configure the textual identification system 150 to access sets of textual data to identify texts or text sets within the textual data, identify data elements within the texts, and identify elements of interest based on data structures generated from the set of textual data. In some embodiments, the components configure the textual identification system 150 to generate initial data structures and modify the data structures to process the data within the data structure; increase accuracy and efficacy of elements of interest identified from the texts; and increase the speed with which a machine forming all or part of the textual identification system 150 identifies and presents the elements of interest at the machine. In order to perform these operations, the textual identification system 150 comprises an access component 210, a document component 220, a database component 230, an element component 240, a presentation component 250, a context component 260, and a normalization component 270. Any one or more of these components may be implemented using one or more processors and hence may include one or more processors (e.g., by configuring such one or more processors to perform functions described for that component).
  • The access component 210 accesses or otherwise receives selections of at least one document corpus from a set of document corpora. The access component 210 may access the set of textual corpora by accessing a set of metadata identifying the set of textual corpora. In some instances, the access component 210 accesses the set of textual corpora by accessing one or more databases directly or via a network connection. The access component 210 may also access or otherwise receive one or more search terms within a graphical user interface. In some embodiments, the access component 210 retrieves a data structure including textual identifications for a set of textual data and an indication of one or more data elements within one or more texts included in the set of textual data.
  • The document component 220 identifies a set of textual data based on one or more search terms. The document component 220 may use search algorithms to identify the set of textual data based on an index of keywords associated with the content and metadata of the document. In some embodiments, the document component 220 dynamically partitions the set of textual data to identify a text, a textual component, a text set within a textual corpus, or a textual corpus containing the set of documents associated with the one or more search terms. In some instances, the document component 220 partitions a set of textual corpora or merges two or more textual corpora based on search terms received from the access component 210.
  • The database component 230 generates data structures and modified data structures. In some embodiments, the database component 230 generates data structures including textual identifications for texts within the set of textual data. The data structures may also include indications of one or more data elements within the texts. The data elements may be words, titles, names, addresses, numbers, or any other suitable information contained within a text. The database component 230 may generate modified data structures from the data structures generated to represent texts within the textual data. In some embodiments, the database component 230 generates modified data structures by assigning index numbers to each element, term, combinations of elements, or combination of terms within a data structure. A full index (e.g., the data structure) may be reduced to include texts within the set of textual data identified based on the search terms. The database component 230 may sum rows within the modified data structure. In some instances, the database component 230 also processes counts for the terms using one or more processes to transform the modified data structure and remove or discount popular or common entries adding little value to analysis based on high frequency of occurrence.
  • The element component 240 identifies elements of interest within modified data structures generated by the database component 230. In some embodiments, the elements of interest are identified, at least in part, based on the summed rows of the modified data structures. The element component 240 may map textual identifications of sets of textual data to rows in transformed or modified data structures generated by the database component 230. In some instances, the element component 240 selects elements of interest by summing values from transformed matrices based on comparison of values associated with the elements to an interest threshold. The element component 240 may also identify element types for each element of interest.
  • The presentation component 250 causes presentations of graphical user interfaces, visual indicators, portions of texts, and other elements described herein. In some embodiments, the presentation component 250 causes presentation of a graphical user interface including selectable interface elements configured to receive search terms or provide search terms for selection and subsequent query of the set of textual data. The presentation component 250 may cause presentation of elements of interest within the graphical user interface as well as portions of texts accessed or retrieved from the set of textual data based on the search terms provided to the access component 210. In some embodiments, the presentation component 250 causes presentation of unique and tailored graphical user interfaces based on a combination of the texts, the search terms, and the elements of interest. The tailored graphical user interfaces may be presented differently to different users based on the information retrieved by the textual identification system 150, the user performing the search, element relationships or collocations, combinations thereof, and other suitable information. In some instances, portions of the graphical user interface are dynamically generated, such that a portion of the graphical user interface may only appear when information relevant to the portion is retrieved, identified, or generated by the textual identification system 150. In these instances, the graphical user interface may automatically resize, reorient, repartition, or otherwise adjust one or more initially presented portions of the graphical user interface to accommodate addition of a new portion based on the inclusion of additional information from the textual identification system 150.
  • The context component 260 determines context occurrences for elements of interest within texts of the set of textual data. In some instances, the context component 260 tokenizes the context to provide an index number for terms included in a text proximate to another term for which context is being determined. The context component 260 may associate index numbers for terms surrounding a specified term and may link instances of a term surrounding a specified term that have a lexical similarity.
  • The normalization component 270 normalizes elements of interest by removing redundant elements of interest based on the context occurrence of two or more elements of interest. The normalization component 270 may generate a normalized set of elements of interest by identifying deviations among the instances. In some embodiments, normalization of the elements of interest occurs without removing or merging instances of the terms within the data structures described herein. The normalization component 270 may pass the normalized set of elements of interest to the presentation component 250, such that the presentation component 250 presents the elements of interest without duplication of elements of interest having deviating instances.
  • Any one or more of the components described may be implemented using hardware alone (e.g., one or more of the processors of a machine) or a combination of hardware and software. For example, any component described in the textual identification system 150 may physically include an arrangement of one or more processors (e.g., a subset of or among the one or more processors of the machine) configured to perform the operations described herein for that component. As another example, any component of the textual identification system 150 may include software, hardware, or both, that configure an arrangement of one or more processors (e.g., among the one or more processors of the machine) to perform the operations described herein for that component. Accordingly, different components of the textual identification system 150 may include and configure different arrangements of such processors or a single arrangement of such processors at different points in time. Moreover, any two or more components of the textual identification system 150 may be logically or physically combined into a single component, and the functions described herein for a single component may be subdivided among multiple components. Furthermore, according to various example embodiments, components described herein as being implemented within a single machine, database, or device may be distributed across multiple machines, databases, or devices.
  • FIG. 3 is a flowchart illustrating operations of the textual identification system 150 in performing a method 300 of processing and identifying elements of interest from content within a set of retrieved documents, according to some example embodiments. Operations of the method 300 may be performed by the modules described above with respect to FIG. 2.
  • In operation 310, the access component 210 receives a selection of a textual corpus from a set of textual corpora. Each textual corpus of the set of textual corpora contains one or more texts. The texts or text sets may include documents of varying types. For example, the document types may include text documents, video documents, audio documents, multimedia documents, and other suitable documents. In the present disclosure, “textual” is used interchangeably with a broad range of document types and publications (e.g., documents published or otherwise accessible directly or by a network connection). Further, although described as texts, text sets, sets of textual data, or textual corpora, it should be understood that one or more terms, such as publication, may be used interchangeably in the present disclosure or embodiments disclosed herein. The set of texts (e.g., set of documents) may be identified from the selected textual corpus (e.g., document corpus).
  • In some embodiments, the presentation component 250 is activated by a selection of a graphical interface element to initiate presentation of a graphical user interface, as shown in FIG. 4. The graphical user interface includes one or more graphical interface elements representing available selections within the graphical user interface. In some embodiments, the selections made available by the graphical interface elements include search term input, document corpus selection, search result selection, document selection, element of interest selection, and search type selection or entry. The access component 210 may receive the selection of the document corpus from an input device receiving a selection from a set of document corpora represented as discrete selectable graphical interface elements. The set of document corpora may be presented on a display device of a client device accessing one or more document servers.
  • In operation 320, the access component 210 receives or otherwise accesses one or more search terms displayed within the graphical user interface. In some embodiments, the access component 210 receives the search terms from an input device of a client device on which the graphical user interface is presented. The access component 210 receives the one or more search terms as one or more differing types of user input through the input device. For example, the one or more search terms may be received within a text input field (e.g., a text box presented in the graphical user interface), as a selection from a set of radio buttons, as a selection from a drop down menu, as a selection from a scroll menu, or any other suitable input type.
  • In operation 330, the document component 220 identifies a set of textual data (e.g., a set of documents or set of publication data) based on the one or more search terms. In some embodiments, the set of documents are identified based on a presence of the one or more search terms within the document or within metadata associated with the document. In some embodiments, once the documents are incorporated into a document corpus, a content of the document and metadata associated with the document may be parsed and indexed to identify keywords. Keywords may include words, named individuals, named entities (e.g., a city name, a project name, an organization name), titles, authors, fields (e.g., From, To, Carbon Copy, and Blind Carbon Copy fields), dates, and other suitable terms. The keywords and the metadata may be extracted from the documents and accompanying data using information extraction and machine learning algorithms. In some embodiments, the document component 220 uses one or more search engine algorithms to identify the set of documents based on the index of the keywords associated with the content and metadata of the document.
  • In operation 340, the access component 210 retrieves a data structure including textual identifications (e.g., document identifications) for the set of textual data and an indication of the one or more data elements within the documents (e.g., texts within the textual data). In some embodiments, the data structure may be the index of keywords in the content and metadata of the documents identified based on the search terms. The index may include semantically meaningful collocations as well as the keywords from the content and the metadata. In some instances, the index is generated as a table having counts of the terms and semantically meaningful collocations within the document content and metadata. In some embodiments, the counts are a number of instances that a given term or semantically meaningful collocation occurs within the document content or the metadata for the document.
  • The indexes for the documents within the document corpus include collocations of semantically meaningful n-grams. Semantically meaningful collocations may include frequently occurring compositions of words having semantic meaning. For example, “strong” and “coffee” may occur together more often than a predetermined instance threshold and, when occurring together within a predefined distance, contain a semantic meaning, “strong coffee,” which may not occur in collocations of synonyms of the two terms. In some instances, the semantically meaningful n-grams or collocations may be determined heuristically. The semantically meaningful n-grams or collocations may also be identified using semantic analysis, stochastic semantic analysis, natural language processing, natural language understanding, or any other suitable algorithmic identification of the meaningful semantic relation between collocated terms.
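  • As a non-limiting illustration of the heuristic case, the following minimal Python sketch counts ordered word pairs that occur within a predefined distance of one another and keeps those pairs whose count exceeds an instance threshold. The function name `find_collocations` and the parameters `max_distance` and `instance_threshold` are assumptions introduced for this sketch and do not appear in the disclosure.

```python
from collections import Counter


def find_collocations(tokens, max_distance=1, instance_threshold=1):
    """Return ordered word pairs that co-occur within max_distance tokens
    more often than instance_threshold (both values are illustrative)."""
    pair_counts = Counter()
    for i, left in enumerate(tokens):
        # Look ahead at most max_distance positions from the current token.
        for right in tokens[i + 1 : i + 1 + max_distance]:
            pair_counts[(left, right)] += 1
    return {pair: n for pair, n in pair_counts.items() if n > instance_threshold}


text = "strong coffee helps and strong coffee wakes but weak tea calms"
collocations = find_collocations(text.split(), max_distance=1, instance_threshold=1)
print(collocations)
```

  • A purely frequency-based filter of this kind does not test for semantic meaning; in practice it might serve only as a candidate generator ahead of the semantic analysis mentioned above.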
  • In operation 350, the database component 230 processes the data structure to generate a modified data structure. In some embodiments, to generate the modified data structure, the database component 230 assigns an index number to each term or semantically meaningful n-gram. The index number may be obtained by sorting the textual representations of the terms and semantically meaningful n-grams and associating each with a position in a sort order. For example, “Aardvark” may receive an index number of zero and “Zena” may receive an index number of one thousand.
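  • One possible way to obtain such index numbers is sketched below in Python. The helper name `assign_index_numbers` and the case-insensitive sort key are assumptions made for illustration; the disclosure states only that the index number corresponds to a position in a sort order.

```python
def assign_index_numbers(terms):
    """Map each distinct term or n-gram to its position in a sort order.

    Sorting case-insensitively (with the raw string as a tie-breaker) is an
    assumption of this sketch, not a requirement of the disclosure.
    """
    ordered = sorted(set(terms), key=lambda term: (term.casefold(), term))
    return {term: position for position, term in enumerate(ordered)}


index_numbers = assign_index_numbers(["Zena", "strong coffee", "Aardvark", "cheese"])
print(index_numbers)
```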
  • The modified data structure may be generated by reducing the full index to documents included in the set of documents identified based on the one or more search terms. In some embodiments, the terms, semantically meaningful n-grams, entity names, and the like are provided values within the modified data structure to construct a count matrix. The count matrix may include documents (e.g., documents identified within a specified document corpus or set of document corpora) as rows and elements of interest (e.g., terms and semantically meaningful n-grams) as columns. The documents in the rows may be represented by a document identification (e.g., a numerical value, an alphanumeric combination, or a set of characters). The terms and semantically meaningful collocations may be represented within a cell of the columns by the term or terms and an indication of a term type. The term type may indicate a category for the term. The intersections between the rows and columns may include a value for a number of occurrences of the specified element of interest within the specified document.
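  • By way of example only, a count matrix of this shape may be sketched in Python as follows. The function name `build_count_matrix`, the document identifications, and the term type labels `WORD` and `CITY` are hypothetical values chosen for the sketch.

```python
from collections import Counter


def build_count_matrix(documents, columns):
    """Build a documents-by-elements count matrix.

    documents: dict mapping a document identification to its list of terms.
    columns:   ordered list of (term, term_type) pairs, one per column.
    Returns (doc_ids, matrix) where matrix[r][c] is the number of times the
    term of column c occurs in the document of row r.
    """
    doc_ids = sorted(documents)
    matrix = []
    for doc_id in doc_ids:
        counts = Counter(documents[doc_id])
        matrix.append([counts[term] for term, _term_type in columns])
    return doc_ids, matrix


documents = {
    "doc-1": ["cheese", "cheese", "paris"],
    "doc-2": ["paris", "coffee"],
}
columns = [("cheese", "WORD"), ("coffee", "WORD"), ("paris", "CITY")]
doc_ids, count_matrix = build_count_matrix(documents, columns)
print(doc_ids, count_matrix)
```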
  • In operation 360, the database component 230 sums rows within the modified data structure. The rows include values for data elements included in each of the identified set of documents. In some embodiments, the counts (e.g., values at the intersections of specified rows and columns) are processed using a Term Frequency-Inverse Document Frequency (TF-IDF) transformation. The TF-IDF transformation may discount popular items as less interesting. In some instances, the TF-IDF transformation is a two-step process. First, the database component 230 sums the number of documents in which each item occurs. Second, the database component 230 divides the entries in each of the table rows of the modified data structure by the sum. The database component 230 thereby decreases weights of less informative but more popular or frequent terms. The TF-IDF transformation may generate a transformed data structure. In some embodiments, the transformed data structure is used as the basis for identifying potentially interesting elements or terms.
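  • The two-step transformation described above may be sketched in Python as shown below. The sketch follows the simplified description in this paragraph (division of each entry by the number of documents containing the item) and omits the logarithmic scaling commonly used in other TF-IDF formulations. The function name `discount_popular_terms` is an assumption of the sketch.

```python
def discount_popular_terms(count_matrix):
    """Divide each count by the number of documents containing that item."""
    n_cols = len(count_matrix[0]) if count_matrix else 0
    # Step 1: for each column, count the documents in which the item occurs.
    doc_freq = [
        sum(1 for row in count_matrix if row[c] > 0) for c in range(n_cols)
    ]
    # Step 2: divide the entries of each row by that per-column sum.
    return [
        [row[c] / doc_freq[c] if doc_freq[c] else 0.0 for c in range(n_cols)]
        for row in count_matrix
    ]


counts = [[2, 0, 1], [0, 1, 1]]
transformed = discount_popular_terms(counts)
print(transformed)
```

  • In this example, the third item occurs in both documents, so its weight in each row is halved, while items occurring in a single document keep their raw counts.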
  • In operation 370, the element component 240 identifies one or more elements of interest based on the summed rows of the modified data structure. In some embodiments, to extract the one or more elements of interest, the element component 240 maps document identifications of the set of documents identified in operation 330 to rows of the transformed data structure. Using the mapping, the element component 240 creates a smaller matrix (e.g., an element matrix) composed of the document rows returned as query results. In the element matrix, the element component 240 selects terms of interest by summing the values from the transformed matrix and identifying the terms having a summed value above an interest threshold. In some embodiments, the interest threshold is predetermined. In some instances, the interest threshold is dynamic. In these embodiments, the dynamic interest threshold may be set as a function of the summed values for the terms. For example, the dynamic interest threshold may be set, at the time of summing the values for the terms, to select terms and to return a set number of terms (e.g., elements of interest) for each document of the set of documents identified in operation 330.
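  • A minimal Python sketch of this selection is given below, assuming the transformed data structure is available as a list of rows. The function names, the example values, and the way the dynamic threshold is derived (the score of the first term that should be excluded) are assumptions of the sketch. The sketch also selects a set number of terms across the whole result set rather than per document.

```python
def summed_scores(transformed, doc_ids, terms, result_ids):
    """Sum transformed values per term over the rows returned by the query."""
    row_of = {doc_id: r for r, doc_id in enumerate(doc_ids)}
    # The element matrix holds only the rows of documents in the result set.
    element_matrix = [transformed[row_of[doc_id]] for doc_id in result_ids]
    return {
        term: sum(row[c] for row in element_matrix)
        for c, term in enumerate(terms)
    }


def select_by_threshold(scores, interest_threshold):
    """Return terms whose summed value is above the interest threshold."""
    ranked = sorted(scores.items(), key=lambda item: -item[1])
    return [term for term, score in ranked if score > interest_threshold]


def dynamic_threshold(scores, wanted):
    """Pick a threshold that lets roughly `wanted` terms through.

    Ties at the boundary may yield fewer than `wanted` terms.
    """
    ranked = sorted(scores.values(), reverse=True)
    return ranked[wanted] if wanted < len(ranked) else float("-inf")


transformed = [[2.0, 0.0, 0.5], [0.0, 1.0, 0.5], [1.0, 0.0, 0.0]]
doc_ids = ["doc-1", "doc-2", "doc-3"]
terms = ["cheese", "coffee", "paris"]
scores = summed_scores(transformed, doc_ids, terms, ["doc-1", "doc-2"])
fixed = select_by_threshold(scores, 1.5)
dynamic = select_by_threshold(scores, dynamic_threshold(scores, 1))
print(scores, fixed, dynamic)
```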
  • In operation 380, the presentation component 250 causes presentation of the elements of interest in a first portion of the graphical user interface and the textual identifications for the set of documents in a second portion of the graphical user interface. In some embodiments, the presentation component 250 causes presentation of the elements of interest and the document identifications in the graphical user interface depicted in FIG. 4. A first portion 410, depicted in FIG. 4, displays the elements of interest (e.g., the terms from the transformed matrix). A second portion 420 displays the textual identifications. In some instances, the document identifications include one or more of the values from the transformed matrix, a title of the document, an identifying subset of content of the document, a selectable representation (e.g., a graphical or textual representation) of the document, or any other suitable identifying information for the documents of the set of documents. In some instances, as shown in FIG. 4, the elements of interest in the first portion 410 and the document identifications in the second portion 420 are presented distinctly from one another and without an indication of a relationship between the items included in the first portion 410 and those included in the second portion 420. In some embodiments, the presentation component 250 generates and presents the elements of interest and the document identifications to indicate a relationship between specified elements of interest and specified document identifications. For example, the document identifications may be spaced a distance apart enabling the elements of interest found within each identified document to be presented adjacent to the document identification of the document in which the elements of interest are found.
  • FIG. 5 is a flowchart illustrating operations of the textual identification system 150 in performing a method 500 of processing and identifying elements of interest from content within a set of retrieved documents, according to some example embodiments. Operations of the method 500 may be performed by the modules described above with respect to FIG. 2. In some example embodiments, one or more operations of the method 500 are performed as part or sub-operations of one or more operations of the method 300. In some instances, the method 500 may include one or more operations of the method 300.
  • In operation 510, the context component 260 determines a context occurrence for each element of interest within the set of documents. The context occurrence represents a number of related times a term occurs in a document. In some instances, a context around each term may be tokenized. The context component 260, in tokenizing the context, may identify an index number for terms included in the document proximate to the term for which context is being determined. The context component 260 may then associate, in a matrix, one or more index numbers for the terms surrounding the specified term for which the context is being identified. In some embodiments, the context component 260 associates the index numbers of surrounding terms for each instance of a term for which the context is being identified. For example, where the context component 260 is determining context for three instances of the term “cheese,” the context component 260 may identify three sets of terms, with a set of terms surrounding each of the instances of the term “cheese.” The context component 260 may identify the index number for each of the terms within the three sets of terms and associate the index numbers with the instance of the term that they surround.
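  • The “cheese” example above may be sketched in Python as follows. The window size of two terms on either side, the function name `context_index_sets`, and the sample sentence are assumptions introduced for illustration.

```python
def context_index_sets(tokens, target, index_numbers, window=2):
    """For each instance of target, collect index numbers of nearby terms."""
    contexts = []
    for i, token in enumerate(tokens):
        if token != target:
            continue
        # Terms up to `window` positions before and after this instance.
        neighbours = tokens[max(0, i - window) : i] + tokens[i + 1 : i + 1 + window]
        contexts.append({index_numbers[t] for t in neighbours if t in index_numbers})
    return contexts


tokens = "aged cheese from france and fresh cheese from italy with melted cheese".split()
# Index numbers taken from a sort order of the distinct terms.
index_numbers = {term: i for i, term in enumerate(sorted(set(tokens)))}
contexts = context_index_sets(tokens, "cheese", index_numbers)
print(contexts)
```

  • Each of the three resulting sets is associated with one instance of the term, so the sets can later be compared against one another.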
  • After the context component 260 identifies and associates the index numbers with instances of the term, the context component 260 determines the context of an instance of the term by comparing the associated index numbers. The context component 260 may link two or more instances of the term for which the surrounding terms are determined to have a lexical similarity. The lexical similarity of surrounding terms may be identified based on an overlap of terms identified within the surrounding terms. Overlap of terms may be identified where the same term occurs in two or more of the surrounding terms. Lexical similarity may also be identified where terms in sets of surrounding terms are synonyms, have similar definitions, or are otherwise semantically related. In some instances, the lexical similarity may be determined based on Jaccard coefficients determined for the sets of surrounding terms defined by a size of set intersection divided by a size of a set union.
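  • The Jaccard coefficient described above, namely the size of the set intersection divided by the size of the set union, may be computed as in the following Python sketch. The linking threshold `min_similarity` and the function names are assumptions of the sketch; the disclosure does not specify a threshold value.

```python
def jaccard(a, b):
    """Size of the intersection divided by the size of the union."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def link_similar_contexts(contexts, min_similarity=0.15):
    """Return index pairs of instances whose surrounding terms overlap enough."""
    links = []
    for i in range(len(contexts)):
        for j in range(i + 1, len(contexts)):
            if jaccard(contexts[i], contexts[j]) >= min_similarity:
                links.append((i, j))
    return links


contexts = [{0, 3, 5}, {1, 4, 5, 6}, {7, 8}]
links = link_similar_contexts(contexts)
print(links)
```

  • This sketch captures only literal overlap of index numbers; similarity based on synonyms or related definitions would require an additional lexical resource.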
  • In operation 520, the normalization component 270 normalizes the elements of interest by removing redundant elements of interest based on the context occurrence of two or more elements of interest. The normalization component 270 generates a normalized set of elements of interest. The normalization component 270 may normalize instances of an element of interest within a document by identifying one or more deviations among the instances. Deviations may include misspellings, different case usage, partial omissions (e.g., omitting a term forming a linked set of terms such as a full name), or other suitable deviations. In some embodiments, normalizing the elements of interest removes redundant instances of the same element of interest within a list presented at a client device. Removal of the redundant instances may free attention space within the list and remove confusion between similar instances of a term that refer to the same entity. In some instances, the normalization component 270 normalizes the elements of interest for presentation without removing or merging instances of the terms within one or more of the matrices or indices described above. By maintaining separate instances of the element of interest, the normalization component 270 prevents the database component 230 from erroneously reducing a term's likelihood of being deemed important based on overrepresentation due to merged instances.
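  • One way to build a deduplicated presentation list while leaving the underlying instances untouched is sketched below in Python. The similarity cutoff `min_ratio`, the use of the standard-library `difflib.SequenceMatcher` as a misspelling detector, and the rule of keeping the longest variant are all assumptions of the sketch rather than features of the disclosure.

```python
from difflib import SequenceMatcher


def is_deviation(a, b, min_ratio=0.85):
    """Heuristically decide whether two strings are variants of one element."""
    x, y = a.casefold(), b.casefold()
    if x == y:  # different case usage
        return True
    x_words, y_words = set(x.split()), set(y.split())
    if x_words <= y_words or y_words <= x_words:  # partial omission
        return True
    return SequenceMatcher(None, x, y).ratio() >= min_ratio  # likely misspelling


def normalize_for_display(elements):
    """Return a list without redundant variants; the input is not modified."""
    display = []
    for element in elements:
        for i, kept in enumerate(display):
            if is_deviation(element, kept):
                # Keep the longest variant as the displayed form.
                if len(element) > len(kept):
                    display[i] = element
                break
        else:
            display.append(element)
    return display


elements = ["John Smith", "john smith", "Smith", "Jon Smith", "Paris"]
display = normalize_for_display(elements)
print(display)
```

  • The partial-omission rule in this sketch is deliberately loose and would also merge unrelated names sharing a surname; the context occurrence described above could be used to restrict such merges.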
  • In operation 530, the presentation component 250 causes presentation of the normalized set of elements of interest in the first portion of the graphical user interface. In some embodiments, the presentation component 250 presents the normalized set of elements of interest similarly to or the same as described above with respect to operation 380. The normalized set of elements may be presented in the first portion of the graphical user interface. In some instances, the elements of the normalized set of elements are presented in an order according to their association with the documents identified based on the search terms. In some embodiments, the normalized set of elements may be presented as an ordered list independent of a relationship to the identified documents presented in the second portion 420.
  • In operation 540, the element component 240 identifies an element type for each of the elements of interest. In some embodiments, the element component 240 identifies the element type for the elements of interest by determining the elements of interest identified from the set of documents retrieved based on the search terms. The element component 240 may then parse one or more of the matrices or indices described above to identify the element type for each element of interest.
  • In operation 550, the presentation component 250 causes presentation of a visual indicator differentiating the elements of interest based on the element types. The visual indicator may be a graphical indicator or a textual indicator. In some instances, the visual indicator is coded to indicate the element type without including all of the characters or words for the element type. For example, the presentation component 250 may identify an element type as a city name and abbreviate or otherwise code the element type as “CN.” Although the coding of the visual indicator has been described given a specific example of an abbreviation, it should be understood that the presentation component 250 may code the element type in any suitable manner. Further, in some embodiments, the presentation component may generate and cause presentation of key mapping codes and full names for element types.
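  • As one simple illustration of such coding, the Python sketch below abbreviates an element type to the initials of its words and builds a key mapping each code back to its full name. The function names are hypothetical, and the initials scheme is only one of many possible codings; distinct element types may collide on the same code.

```python
def code_element_type(element_type):
    """Abbreviate an element type to the upper-case initials of its words."""
    return "".join(word[0].upper() for word in element_type.split())


def build_key(element_types):
    """Map each code to the full element type name for display as a key."""
    return {code_element_type(t): t for t in element_types}


key = build_key(["city name", "project name", "organization name"])
print(key)
```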
  • In operation 560, the presentation component 250 causes presentation of at least a portion of a document of the set of documents in a third portion of the graphical user interface. In some embodiments, as shown in FIG. 4, the third portion 430 may be positioned proximate to the second portion 420. The portion of the document presented in the third portion 430 may include text from a text, text set, publication, or document selected or otherwise specified in the second portion. For example, where a user selects a graphical interface element representing a document identification in the second portion 420 of the graphical user interface, the presentation component 250 generates and causes presentation of the portion of the selected document in the third portion 430. In some embodiments, where the document or publication retrieved is a video document or an audio document, the third portion 430 of the graphical user interface may include selectable interface elements configured to display or play the video or audio document within the third portion 430 of the graphical user interface. In some instances, the third portion 430 of the graphical user interface includes an instance of an application configured to display or play the audio or video document. In addition to an interface element or an instance of an application, the third portion 430 may also include textual information representing, or included within, the video, audio, or multimedia document.
  • FIG. 6 is a flowchart illustrating operations of the textual identification system 150 in performing a method 600 of processing and identifying elements of interest from content within a set of retrieved documents, according to some example embodiments. Operations of the method 600 may be performed by the modules described above with respect to FIG. 2. The method 600 may include or be performed as part or sub-operations of one or more operations of the methods 300 or 500.
  • In operation 610, the context component 260 generates a set of tokens for each element of interest. The set of tokens may represent the context of occurrence of a specified element. In some embodiments, operation 610 is performed in response to determining the context of occurrence for each element of interest, as described above with respect to operation 510 of the method 500. The context component 260 may tokenize each element of interest using the index numbers described above or may generate a separate set of context tokens. Each token may be a numerical value or any other suitable value that identifies a term and associates the term with the element of interest for which the context is being identified.
  • In operation 620, the context component 260 identifies an overlap of two or more elements of interest based on the set of tokens for the two or more elements of interest. The overlap may be determined based on semantic relatedness. For example, the overlap may be determined based on occurrence of a term within two or more sets of tokens for the two or more elements. As described above, with respect to operation 510, the semantic relatedness or lexical similarity may be determined based on Jaccard coefficients determined for the set of tokens.
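  • The overlap determination of operation 620 can be sketched as follows, assuming that each element of interest has already been reduced to a set of context tokens and that two elements are treated as overlapping when the Jaccard coefficient of their token sets meets a threshold. The threshold value, the sample data, and the names `jaccard` and `overlapping_pairs` are assumptions made for illustration; the disclosure does not specify them.

```python
from itertools import combinations


def jaccard(tokens_a, tokens_b):
    """Jaccard coefficient |A ∩ B| / |A ∪ B| of two token collections."""
    a, b = set(tokens_a), set(tokens_b)
    union = a | b
    if not union:
        return 0.0  # two empty contexts share nothing by convention
    return len(a & b) / len(union)


def overlapping_pairs(context_tokens, threshold=0.25):
    """Return (element, element, score) for each pair meeting the threshold.

    context_tokens maps an element of interest to its context tokens.
    The threshold is a hypothetical tuning parameter.
    """
    pairs = []
    for first, second in combinations(sorted(context_tokens), 2):
        score = jaccard(context_tokens[first], context_tokens[second])
        if score >= threshold:
            pairs.append((first, second, score))
    return pairs


# Hypothetical context tokens for three elements of interest.
context = {
    "Paris": ["capital", "france", "seine", "city"],
    "Lyon": ["france", "city", "rhone"],
    "Turing": ["computing", "machine"],
}
print(overlapping_pairs(context))
```

  • In this sample, “Paris” and “Lyon” share two of five distinct tokens, so they are the only pair reported; the tokens could equally be the numerical index values described above.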
  • In operation 630, the context component 260 links the two or more elements of interest. The two or more elements of interest may be linked in one or more of the matrices or indices described above. In some instances, the two or more elements are linked by generating a context matrix for each document within the set of documents identified in relation to the one or more search terms described above with respect to the method 300. The context matrix may include the terms within a document in both rows and columns. A bit or value at an intersection of two terms may indicate a contextual link between the two terms. Although the linking of elements of interest has been described with respect to a matrix, it should be understood that the elements of interest may be linked using metadata, data tables, or any other suitable method.
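  • One way to picture the context matrix of operation 630 is a square matrix whose rows and columns are the terms of a single document, with a 1 at an intersection marking a contextual link. The sketch below assumes that links are symmetric and are supplied as pairs of terms; the names `build_context_matrix` and `linked_terms` are hypothetical.

```python
def build_context_matrix(terms, linked_pairs):
    """Build a term-by-term matrix with 1 where two terms are linked.

    Links are stored symmetrically, which is an assumption of this sketch.
    Returns the term-to-index mapping and the matrix as a list of rows.
    """
    index = {term: position for position, term in enumerate(terms)}
    matrix = [[0] * len(terms) for _ in terms]
    for first, second in linked_pairs:
        i, j = index[first], index[second]
        matrix[i][j] = 1
        matrix[j][i] = 1
    return index, matrix


def linked_terms(term, index, matrix):
    """List the terms whose intersection with the given term is set."""
    row = matrix[index[term]]
    return [other for other, column in index.items() if row[column]]


terms = ["Paris", "Lyon", "Turing"]
index, matrix = build_context_matrix(terms, [("Paris", "Lyon")])
print(matrix)
print(linked_terms("Paris", index, matrix))
```

  • A dense list of lists is used here only for readability; for documents with many terms, a sparse structure or the metadata and data tables mentioned above may be more practical.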
  • FIG. 7 is a flowchart illustrating operations of the textual identification system 150 in performing a method 700 of processing and identifying elements of interest from content within a set of retrieved documents, according to some example embodiments. Operations of the method 700 may be performed by the modules described above with respect to FIG. 2. The method 700 may include or be performed as part or sub-operations of one or more operations of the methods 300, 500, or 600.
  • In operation 710, the access component 210 accesses a set of document corpora. In some embodiments, the set of document corpora includes the selected document corpus of operation 310. In some instances, the set of document corpora is accessed in response to receiving the one or more search terms in operation 320. For example, as shown, operation 710 may occur after operations 310-380. As shown in FIG. 7, in these instances, the set of document corpora is accessed without prior selection of a specified document corpus. The access component 210 may access the set of document corpora by accessing one or more databases directly or via a network connection. In some embodiments, the access component 210 accesses the set of document corpora by accessing a set of metadata identifying the set of document corpora.
  • In operation 720, the document component 220 dynamically partitions the set of document corpora to identify a document corpus containing the set of documents associated with the one or more search terms. In some instances, the document component 220 identifies the document corpus by identifying the search terms among keywords associated with each document corpus of the set of document corpora. In some embodiments, each document corpus may be associated with a distinct database or data source. For example, each distinct database or data source may be associated with or part of a distinct client device. In identifying the document corpus, the document component 220 may select a client device from which the document component may select documents in response to receiving the one or more search terms.
  • In some embodiments, the document component 220 dynamically partitions the set of document corpora regardless of distribution of the document corpora among multiple client devices. In these instances, the document component 220 identifies the one or more search terms. The document component 220 may compare the one or more search terms with an index or matrix identifying terms associated with individual documents within each document corpus of the set of document corpora. The index or matrix may also identify the document corpus with which each of the documents is associated. The document component 220 may identify one or more document corpora from the index or matrix. The document component 220 may then perform a comparative analysis of the one or more document corpora to identify a single document corpus to search using the one or more search terms. In some instances, the comparative analysis identifies the document corpus having the highest number of occurrences of the search terms, and selects that document corpus.
  • In some instances, the document component 220 combines two or more document corpora to generate a dynamic document corpus. In these embodiments, where several document corpora include a suitable number of occurrences of the search terms, the document component 220 selects the two or more document corpora and searches each of the document corpora for documents including the one or more search terms.
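  • The comparative analysis and the dynamic document corpus described above can be sketched as follows. The sketch assumes the index is available as rows of (corpus, document, term, occurrence count) and that a “suitable number” of occurrences is expressed as a minimum count; the row layout, the sample data, and the names `count_occurrences`, `select_corpus`, and `select_dynamic_corpus` are all assumptions for illustration.

```python
from collections import Counter


def count_occurrences(index_rows, search_terms):
    """Total the occurrences of the search terms per document corpus.

    index_rows is an iterable of (corpus_id, document_id, term, count).
    """
    wanted = {term.lower() for term in search_terms}
    totals = Counter()
    for corpus_id, _document_id, term, count in index_rows:
        if term.lower() in wanted:
            totals[corpus_id] += count
    return totals


def select_corpus(totals):
    """Pick the single corpus with the highest number of occurrences."""
    return max(totals, key=totals.get) if totals else None


def select_dynamic_corpus(totals, min_occurrences):
    """Pick every corpus meeting a hypothetical minimum occurrence count."""
    return sorted(c for c, n in totals.items() if n >= min_occurrences)


# Hypothetical index rows spanning three corpora.
rows = [
    ("news", "d1", "merger", 3),
    ("news", "d2", "acquisition", 2),
    ("filings", "d3", "merger", 4),
    ("blogs", "d4", "merger", 1),
    ("blogs", "d5", "weather", 9),
]
totals = count_occurrences(rows, ["merger", "acquisition"])
print(select_corpus(totals))
print(select_dynamic_corpus(totals, min_occurrences=4))
```

  • With the sample rows, the single-corpus selection returns the corpus with five matching occurrences, while the dynamic selection returns both corpora that reach the assumed minimum of four; each selected corpus would then be searched for documents including the search terms.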
  • Modules, Components, and Logic
  • Certain embodiments are described herein as including logic or a number of components, modules, or mechanisms. Modules may constitute either software modules (e.g., code embodied on a machine-readable medium) or hardware modules. A “hardware module” is a tangible unit capable of performing certain operations and may be configured or arranged in a certain physical manner. In various example embodiments, one or more computer systems (e.g., a standalone computer system, a client computer system, or a server computer system) or one or more hardware modules of a computer system (e.g., a processor or a group of processors) may be configured by software (e.g., an application or application portion) as a hardware module that operates to perform certain operations as described herein.
  • In some embodiments, a hardware module may be implemented mechanically, electronically, or any suitable combination thereof. For example, a hardware module may include dedicated circuitry or logic that is permanently configured to perform certain operations. For example, a hardware module may be a special-purpose processor, such as a Field-Programmable Gate Array (FPGA) or an Application Specific Integrated Circuit (ASIC). A hardware module may also include programmable logic or circuitry that is temporarily configured by software to perform certain operations. For example, a hardware module may include software executed by a general-purpose processor or other programmable processor. Once configured by such software, hardware modules become specific machines (or specific components of a machine) uniquely tailored to perform the configured functions and are no longer general-purpose processors. It will be appreciated that the decision to implement a hardware module mechanically, in dedicated and permanently configured circuitry, or in temporarily configured circuitry (e.g., configured by software) may be driven by cost and time considerations.
  • Accordingly, the phrase “hardware module” should be understood to encompass a tangible entity, be that an entity that is physically constructed, permanently configured (e.g., hardwired), or temporarily configured (e.g., programmed) to operate in a certain manner or to perform certain operations described herein. As used herein, “hardware-implemented module” refers to a hardware module. Considering embodiments in which hardware modules are temporarily configured (e.g., programmed), each of the hardware modules need not be configured or instantiated at any one instance in time. For example, where a hardware module comprises a general-purpose processor configured by software to become a special-purpose processor, the general-purpose processor may be configured as respectively different special-purpose processors (e.g., comprising different hardware modules) at different times. Software accordingly configures a particular processor or processors, for example, to constitute a particular hardware module at one instance of time and to constitute a different hardware module at a different instance of time.
  • Hardware modules can provide information to, and receive information from, other hardware modules. Accordingly, the described hardware modules may be regarded as being communicatively coupled. Where multiple hardware modules exist contemporaneously, communications may be achieved through signal transmission (e.g., over appropriate circuits and buses) between or among two or more of the hardware modules. In embodiments in which multiple hardware modules are configured or instantiated at different times, communications between such hardware modules may be achieved, for example, through the storage and retrieval of information in memory structures to which the multiple hardware modules have access. For example, one hardware module may perform an operation and store the output of that operation in a memory device to which it is communicatively coupled. A further hardware module may then, at a later time, access the memory device to retrieve and process the stored output. Hardware modules may also initiate communications with input or output devices, and can operate on a resource (e.g., a collection of information).
  • The various operations of example methods described herein may be performed, at least partially, by one or more processors that are temporarily configured (e.g., by software) or permanently configured to perform the relevant operations. Whether temporarily or permanently configured, such processors may constitute processor-implemented modules that operate to perform one or more operations or functions described herein. As used herein, “processor-implemented module” refers to a hardware module implemented using one or more processors.
  • Similarly, the methods described herein may be at least partially processor-implemented, with a particular processor or processors being an example of hardware. For example, at least some of the operations of a method may be performed by one or more processors or processor-implemented modules. Moreover, the one or more processors may also operate to support performance of the relevant operations in a “cloud computing” environment or as a “software as a service” (SaaS). For example, at least some of the operations may be performed by a group of computers (as examples of machines including processors), with these operations being accessible via a network (e.g., the Internet) and via one or more appropriate interfaces (e.g., an Application Program Interface (API)).
  • The performance of certain of the operations may be distributed among the processors, not only residing within a single machine, but deployed across a number of machines. In some example embodiments, the processors or processor-implemented modules may be located in a single geographic location (e.g., within a home environment, an office environment, or a server farm). In other example embodiments, the processors or processor-implemented modules may be distributed across a number of geographic locations.
  • Machine and Software Architecture
  • The components, methods, applications and so forth described in conjunction with FIGS. 1-7 are implemented in some embodiments in the context of a machine and an associated software architecture. The sections below describe representative software architecture(s) and machine (e.g., hardware) architecture that are suitable for use with the disclosed embodiments.
  • Software architectures are used in conjunction with hardware architectures to create devices and machines tailored to particular purposes. For example, a particular hardware architecture coupled with a particular software architecture will create a mobile device, such as a mobile phone, tablet device, or so forth. A slightly different hardware and software architecture may yield a smart device for use in the “internet of things,” while yet another combination produces a server computer for use within a cloud computing architecture. Not all combinations of such software and hardware architectures are presented here as those of skill in the art can readily understand how to implement the subject matter in different contexts from the disclosure contained herein.
  • Software Architecture
  • FIG. 8 is a block diagram 800 illustrating a representative software architecture 802, which may be used in conjunction with various hardware architectures herein described. FIG. 8 is merely a non-limiting example of a software architecture and it will be appreciated that many other architectures may be implemented to facilitate the functionality described herein. The software architecture 802 may be executing on hardware such as a machine 900 of FIG. 9 that includes, among other things, processors 910, memory/storage 930, and I/O components 950. A representative hardware layer 804 is illustrated and can represent, for example, the machine 900 of FIG. 9. The representative hardware layer 804 comprises one or more processing units 806 having associated executable instructions 808. The executable instructions 808 represent the executable instructions of the software architecture 802, including implementation of the methods, components and so forth of FIG. 2. Hardware layer 804 also includes memory and/or storage modules 810, which also have executable instructions 808. Hardware layer 804 may also comprise other hardware as indicated by 812 which represents any other hardware of the hardware layer 804, such as the other hardware illustrated as part of machine 900.
  • In the example architecture of FIG. 8, the software 802 may be conceptualized as a stack of layers where each layer provides particular functionality. For example, the software 802 may include layers such as an operating system 814, libraries 816, frameworks/middleware 818, applications 820 and presentation layer 844. Operationally, the applications 820 and/or other components within the layers may invoke application programming interface (API) calls 824 through the software stack and receive a response, returned values, and so forth illustrated as messages 826 in response to the API calls 824. The layers illustrated are representative in nature and not all software architectures have all layers. For example, some mobile or special purpose operating systems may not provide a frameworks/middleware layer 818, while others may provide such a layer. Other software architectures may include additional or different layers.
  • The operating system 814 may manage hardware resources and provide common services. The operating system 814 may include, for example, a kernel 828, services 830, and drivers 832. The kernel 828 may act as an abstraction layer between the hardware and the other software layers. For example, the kernel 828 may be responsible for memory management, processor management (e.g., scheduling), component management, networking, security settings, and so on. The services 830 may provide other common services for the other software layers. The drivers 832 may be responsible for controlling or interfacing with the underlying hardware. For instance, the drivers 832 may include display drivers, camera drivers, Bluetooth® drivers, flash memory drivers, serial communication drivers (e.g., Universal Serial Bus (USB) drivers), WiFi® drivers, audio drivers, power management drivers, and so forth depending on the hardware configuration.
  • The libraries 816 may provide a common infrastructure that may be utilized by the applications 820 and/or other components and/or layers. The libraries 816 typically provide functionality that allows other software modules to perform tasks in an easier fashion than to interface directly with the underlying operating system 814 functionality (e.g., kernel 828, services 830, and/or drivers 832). The libraries 816 may include system libraries 834 (e.g., C standard library) that may provide functions such as memory allocation functions, string manipulation functions, mathematic functions, and the like. In addition, the libraries 816 may include API libraries 836 such as media libraries (e.g., libraries to support presentation and manipulation of various media formats such as MPEG4, H.264, MP3, AAC, AMR, JPG, PNG), graphics libraries (e.g., an OpenGL framework that may be used to render 2D and 3D information in a graphic content on a display), database libraries (e.g., SQLite that may provide various relational database functions), web libraries (e.g., WebKit that may provide web browsing functionality), and the like. The libraries 816 may also include a wide variety of other libraries 838 to provide many other APIs to the applications 820 and other software components/modules.
  • The frameworks 818 (also sometimes referred to as middleware) may provide a higher-level common infrastructure that may be utilized by the applications 820 and/or other software components/modules. For example, the frameworks 818 may provide various graphic user interface (GUI) functions, high-level resource management, high-level location services, and so forth. The frameworks 818 may provide a broad spectrum of other APIs that may be utilized by the applications 820 and/or other software components/modules, some of which may be specific to a particular operating system or platform.
  • The applications 820 include built-in applications 840 and/or third-party applications 842. Examples of representative built-in applications 840 may include, but are not limited to, a contacts application, a browser application, a book reader application, a location application, a media application, a messaging application, and/or a game application. The third-party applications 842 may include any of the built-in applications as well as a broad assortment of other applications. In a specific example, the third-party application 842 (e.g., an application developed using the Android™ or iOS™ software development kit (SDK) by an entity other than the vendor of the particular platform) may be mobile software running on a mobile operating system such as iOS™, Android™, Windows® Phone, or other mobile operating systems. In this example, the third-party application 842 may invoke the API calls 824 provided by the mobile operating system such as operating system 814 to facilitate functionality described herein.
  • The applications 820 may utilize built in operating system functions (e.g., kernel 828, services 830 and/or drivers 832), libraries (e.g., system libraries 834, API libraries 836, and other libraries 838), and frameworks/middleware 818 to create user interfaces to interact with users of the system. Alternatively, or additionally, in some systems interactions with a user may occur through a presentation layer, such as the presentation layer 844. In these systems, the application/module “logic” can be separated from the aspects of the application/module that interact with a user.
  • Some software architectures utilize virtual machines. In the example of FIG. 8, this is illustrated by a virtual machine 848. A virtual machine creates a software environment where applications/modules can execute as if they were executing on a hardware machine (such as the machine 900 of FIG. 9, for example). A virtual machine is hosted by a host operating system (e.g., operating system 814 in FIG. 8) and typically, although not always, has a virtual machine monitor 846, which manages the operation of the virtual machine as well as the interface with the host operating system (e.g., operating system 814). A software architecture executes within the virtual machine such as an operating system 850, libraries 816, frameworks/middleware 854, applications 856 and/or presentation layer 858. These layers of software architecture executing within the virtual machine 848 can be the same as corresponding layers previously described or may be different.
  • Example Machine Architecture and Machine-Readable Medium
  • FIG. 9 is a block diagram illustrating components of a machine 900, according to some example embodiments, able to read instructions from a machine-readable medium (e.g., a machine-readable storage medium) and perform any one or more of the methodologies discussed herein. Specifically, FIG. 9 shows a diagrammatic representation of the machine 900 in the example form of a computer system, within which instructions 916 (e.g., software, a program, an application, an applet, an app, or other executable code) for causing the machine 900 to perform any one or more of the methodologies discussed herein may be executed. For example, the instructions may cause the machine to execute the flow diagrams of FIGS. 3 and 5-7. Additionally, or alternatively, the instructions may implement the components or modules of FIG. 2, and so forth. The instructions transform the general, non-programmed machine into a particular (e.g., special purpose) machine programmed to carry out the described and illustrated functions in the manner described. In alternative embodiments, the machine 900 operates as a standalone device or may be coupled (e.g., networked) to other machines. In a networked deployment, the machine 900 may operate in the capacity of a server machine or a client machine in a server-client network environment, or as a peer machine in a peer-to-peer (or distributed) network environment. The machine 900 may comprise, but is not limited to, a server computer, a client computer, a personal computer (PC), a tablet computer, a laptop computer, a netbook, a set-top box (STB), a personal digital assistant (PDA), an entertainment media system, a cellular telephone, a smart phone, a mobile device, a wearable device (e.g., a smart watch), a smart home device (e.g., a smart appliance), other smart devices, a web appliance, a network router, a network switch, a network bridge, or any machine capable of executing the instructions 916, sequentially or otherwise, that specify actions to be taken by machine 900. Further, while only a single machine 900 is illustrated, the term “machine” shall also be taken to include a collection of machines 900 that individually or jointly execute the instructions 916 to perform any one or more of the methodologies discussed herein.
  • The machine 900 may include processors 910, memory/storage 930, and I/O components 950, which may be configured to communicate with each other such as via a bus 902. In an example embodiment, the processors 910 (e.g., a Central Processing Unit (CPU), a Reduced Instruction Set Computing (RISC) processor, a Complex Instruction Set Computing (CISC) processor, a Graphics Processing Unit (GPU), a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Radio-Frequency Integrated Circuit (RFIC), another processor, or any suitable combination thereof) may include, for example, processor 912 and processor 914 that may execute instructions 916. The term “processor” is intended to include multi-core processors that may comprise two or more independent processors (sometimes referred to as “cores”) that may execute instructions contemporaneously. Although FIG. 9 shows multiple processors, the machine 900 may include a single processor with a single core, a single processor with multiple cores (e.g., a multi-core processor), multiple processors with a single core, multiple processors with multiple cores, or any combination thereof.
  • The memory/storage 930 may include a memory 932, such as a main memory, or other memory storage, and a storage unit 936, both accessible to the processors 910 such as via the bus 902. The storage unit 936 and memory 932 store the instructions 916 embodying any one or more of the methodologies or functions described herein. The instructions 916 may also reside, completely or partially, within the memory 932, within the storage unit 936, within at least one of the processors 910 (e.g., within the processor's cache memory), or any suitable combination thereof, during execution thereof by the machine 900. Accordingly, the memory 932, the storage unit 936, and the memory of the processors 910 are examples of machine-readable media.
  • As used herein, “machine-readable medium” means a device able to store instructions and data temporarily or permanently and may include, but is not limited to, random-access memory (RAM), read-only memory (ROM), buffer memory, flash memory, optical media, magnetic media, cache memory, other types of storage (e.g., Electrically Erasable Programmable Read-Only Memory (EEPROM)), and/or any suitable combination thereof. The term “machine-readable medium” should be taken to include a single medium or multiple media (e.g., a centralized or distributed database, or associated caches and servers) able to store the instructions 916. The term “machine-readable medium” shall also be taken to include any medium, or combination of multiple media, that is capable of storing instructions (e.g., instructions 916) for execution by a machine (e.g., machine 900), such that the instructions, when executed by one or more processors of the machine 900 (e.g., processors 910), cause the machine 900 to perform any one or more of the methodologies described herein. Accordingly, a “machine-readable medium” refers to a single storage apparatus or device, as well as “cloud-based” storage systems or storage networks that include multiple storage apparatus or devices. The term “machine-readable medium” excludes signals per se.
  • The I/O components 950 may include a wide variety of components to receive input, provide output, produce output, transmit information, exchange information, capture measurements, and so on. The specific I/O components 950 that are included in a particular machine will depend on the type of machine. For example, portable machines such as mobile phones will likely include a touch input device or other such input mechanisms, while a headless server machine will likely not include such a touch input device. It will be appreciated that the I/O components 950 may include many other components that are not shown in FIG. 9. The I/O components 950 are grouped according to functionality merely for simplifying the following discussion and the grouping is in no way limiting. In various example embodiments, the I/O components 950 may include output components 952 and input components 954. The output components 952 may include visual components (e.g., a display such as a plasma display panel (PDP), a light emitting diode (LED) display, a liquid crystal display (LCD), a projector, or a cathode ray tube (CRT)), acoustic components (e.g., speakers), haptic components (e.g., a vibratory motor, resistance mechanisms), other signal generators, and so forth. The input components 954 may include alphanumeric input components (e.g., a keyboard, a touch screen configured to receive alphanumeric input, a photo-optical keyboard, or other alphanumeric input components), point based input components (e.g., a mouse, a touchpad, a trackball, a joystick, a motion sensor, or other pointing instrument), tactile input components (e.g., a physical button, a touch screen that provides location and/or force of touches or touch gestures, or other tactile input components), audio input components (e.g., a microphone), and the like.
  • In further example embodiments, the I/O components 950 may include biometric components 956, motion components 958, environmental components 960, or position components 962 among a wide array of other components. For example, the biometric components 956 may include components to detect expressions (e.g., hand expressions, facial expressions, vocal expressions, body gestures, or eye tracking), measure biosignals (e.g., blood pressure, heart rate, body temperature, perspiration, or brain waves), identify a person (e.g., voice identification, retinal identification, facial identification, fingerprint identification, or electroencephalogram based identification), and the like. The motion components 958 may include acceleration sensor components (e.g., accelerometer), gravitation sensor components, rotation sensor components (e.g., gyroscope), and so forth. The environmental components 960 may include, for example, illumination sensor components (e.g., photometer), temperature sensor components (e.g., one or more thermometers that detect ambient temperature), humidity sensor components, pressure sensor components (e.g., barometer), acoustic sensor components (e.g., one or more microphones that detect background noise), proximity sensor components (e.g., infrared sensors that detect nearby objects), gas sensors (e.g., gas detection sensors to detect concentrations of hazardous gases for safety or to measure pollutants in the atmosphere), or other components that may provide indications, measurements, or signals corresponding to a surrounding physical environment. The position components 962 may include location sensor components (e.g., a Global Positioning System (GPS) receiver component), altitude sensor components (e.g., altimeters or barometers that detect air pressure from which altitude may be derived), orientation sensor components (e.g., magnetometers), and the like.
  • Communication may be implemented using a wide variety of technologies. The I/O components 950 may include communication components 964 operable to couple the machine 900 to a network 980 or devices 970 via coupling 982 and coupling 972 respectively. For example, the communication components 964 may include a network interface component or other suitable device to interface with the network 980. In further examples, communication components 964 may include wired communication components, wireless communication components, cellular communication components, Near Field Communication (NFC) components, Bluetooth® components (e.g., Bluetooth® Low Energy), WiFi® components, and other communication components to provide communication via other modalities. The devices 970 may be another machine or any of a wide variety of peripheral devices (e.g., a peripheral device coupled via a Universal Serial Bus (USB)).
  • Moreover, the communication components 964 may detect identifiers or include components operable to detect identifiers. For example, the communication components 964 may include Radio Frequency Identification (RFID) tag reader components, NFC smart tag detection components, optical reader components (e.g., an optical sensor to detect one-dimensional bar codes such as Universal Product Code (UPC) bar code, multi-dimensional bar codes such as Quick Response (QR) code, Aztec code, Data Matrix, Dataglyph, MaxiCode, PDF417, Ultra Code, UCC RSS-2D bar code, and other optical codes), or acoustic detection components (e.g., microphones to identify tagged audio signals). In addition, a variety of information may be derived via the communication components 964, such as location via Internet Protocol (IP) geo-location, location via Wi-Fi® signal triangulation, location via detecting an NFC beacon signal that may indicate a particular location, and so forth.
  • Transmission Medium
  • In various example embodiments, one or more portions of the network 980 may be an ad hoc network, an intranet, an extranet, a virtual private network (VPN), a local area network (LAN), a wireless LAN (WLAN), a wide area network (WAN), a wireless WAN (WWAN), a metropolitan area network (MAN), the Internet, a portion of the Internet, a portion of the Public Switched Telephone Network (PSTN), a plain old telephone service (POTS) network, a cellular telephone network, a wireless network, a Wi-Fi® network, another type of network, or a combination of two or more such networks. For example, the network 980 or a portion of the network 980 may include a wireless or cellular network and the coupling 982 may be a Code Division Multiple Access (CDMA) connection, a Global System for Mobile communications (GSM) connection, or other type of cellular or wireless coupling. In this example, the coupling 982 may implement any of a variety of types of data transfer technology, such as Single Carrier Radio Transmission Technology (1×RTT), Evolution-Data Optimized (EVDO) technology, General Packet Radio Service (GPRS) technology, Enhanced Data rates for GSM Evolution (EDGE) technology, Third Generation Partnership Project (3GPP) including 3G, fourth generation wireless (4G) networks, Universal Mobile Telecommunications System (UMTS), High Speed Packet Access (HSPA), Worldwide Interoperability for Microwave Access (WiMAX), Long Term Evolution (LTE) standard, others defined by various standard setting organizations, other long range protocols, or other data transfer technology.
  • The instructions 916 may be transmitted or received over the network 980 using a transmission medium via a network interface device (e.g., a network interface component included in the communication components 964) and utilizing any one of a number of well-known transfer protocols (e.g., hypertext transfer protocol (HTTP)). Similarly, the instructions 916 may be transmitted or received using a transmission medium via the coupling 972 (e.g., a peer-to-peer coupling) to the devices 970. The term “transmission medium” shall be taken to include any intangible medium that is capable of storing, encoding, or carrying the instructions 916 for execution by the machine 900, and includes digital or analog communications signals or other intangible media to facilitate communication of such software.
  • Language
  • Throughout this specification, plural instances may implement components, operations, or structures described as a single instance. Although individual operations of one or more methods are illustrated and described as separate operations, one or more of the individual operations may be performed concurrently, and nothing requires that the operations be performed in the order illustrated. Structures and functionality presented as separate components in example configurations may be implemented as a combined structure or component. Similarly, structures and functionality presented as a single component may be implemented as separate components. These and other variations, modifications, additions, and improvements fall within the scope of the subject matter herein.
  • Although an overview of the inventive subject matter has been described with reference to specific example embodiments, various modifications and changes may be made to these embodiments without departing from the broader scope of embodiments of the present disclosure. Such embodiments of the inventive subject matter may be referred to herein, individually or collectively, by the term “invention” merely for convenience and without intending to voluntarily limit the scope of this application to any single disclosure or inventive concept if more than one is, in fact, disclosed.
  • The embodiments illustrated herein are described in sufficient detail to enable those skilled in the art to practice the teachings disclosed. Other embodiments may be used and derived therefrom, such that structural and logical substitutions and changes may be made without departing from the scope of this disclosure. The Detailed Description, therefore, is not to be taken in a limiting sense, and the scope of various embodiments is defined only by the appended claims, along with the full range of equivalents to which such claims are entitled.
  • As used herein, the term “or” may be construed in either an inclusive or exclusive sense. Moreover, plural instances may be provided for resources, operations, or structures described herein as a single instance. Additionally, boundaries between various resources, operations, modules, engines, and data stores are somewhat arbitrary, and particular operations are illustrated in a context of specific illustrative configurations. Other allocations of functionality are envisioned and may fall within a scope of various embodiments of the present disclosure. In general, structures and functionality presented as separate resources in the example configurations may be implemented as a combined structure or resource. Similarly, structures and functionality presented as a single resource may be implemented as separate resources. These and other variations, modifications, additions, and improvements fall within a scope of embodiments of the present disclosure as represented by the appended claims. The specification and drawings are, accordingly, to be regarded in an illustrative rather than a restrictive sense.

Claims (21)

1. (canceled)
2. A method, comprising:
accessing, by one or more processors of a machine, a set of textual corpora, each textual corpus of the set of textual corpora containing one or more text sets;
dynamically partitioning, by one or more processors, the set of textual corpora to identify a textual corpus from the set of textual corpora containing the set of text sets associated with one or more search terms;
identifying, by one or more processors, a set of textual data based on the one or more search terms;
retrieving, by one or more processors, a data structure including textual identifications for the set of textual data and an indication of one or more data elements within one or more text sets of the set of textual data;
processing, by the one or more processors, the data structure to generate a modified data structure, the modified data structure generated by reducing to text sets included in the set of textual data identified based on the one or more search terms;
summing rows, by the one or more processors, within the modified data structure, the rows including values for data elements included in each of the identified set of textual data;
identifying, by the one or more processors, one or more elements of interest within the set of textual data based on the summed rows of the modified data structure; and
causing, by one or more processors, presentation of the elements of interest in a first portion of a graphical user interface and the textual identifications for the set of textual data in a second portion of the graphical user interface.
3. The method of claim 2, wherein the set of textual corpora comprises a set of document corpora, each textual corpus from the set of document corpora comprises a document corpus, and the set of text sets comprises a set of documents.
4. The method of claim 2, wherein each textual corpus is associated with a separate data source.
5. The method of claim 2, wherein each textual corpus is associated with a distinct client device.
6. The method of claim 2, wherein the causing the presentation of the elements of interest and the textual identifications comprises:
causing presentation of at least a portion of a text set of the set of textual data in a third portion of the graphical user interface.
7. The method of claim 2, wherein the causing the presentation of the elements of interest and the textual identifications comprises:
identifying an element type for each element in the elements of interest; and
causing presentation of a visual indicator differentiating the elements of interest based on an element type.
8. The method of claim 2, further comprising:
determining a context of occurrence for each element of interest within the set of textual data;
in response to determining the context of occurrence for each element of interest, generating a set of tokens for each element of interest, the set of tokens representing the context of occurrence;
identifying an overlap of two or more of the elements of interest based on the set of tokens for the two or more elements of interest; and
linking two or more elements of interest.
9. The method of claim 2, further comprising:
receiving a selection of a given textual corpus from the set of textual corpora, the set of textual data identified from the selected given textual corpus.
10. A computer implemented system, comprising:
one or more processors; and
a processor-readable storage device comprising processor-executable instructions that, when executed by the one or more processors, cause the one or more processors to perform operations comprising:
accessing a set of textual corpora, each textual corpus of the set of textual corpora containing one or more text sets;
dynamically partitioning the set of textual corpora to identify a textual corpus from the set of textual corpora containing the set of text sets associated with one or more search terms;
identifying a set of textual data based on the one or more search terms;
retrieving a data structure including textual identifications for the set of textual data and an indication of one or more data elements within one or more text sets of the set of textual data;
processing the data structure to generate a modified data structure, the modified data structure generated by reducing to text sets included in the set of textual data identified based on the one or more search terms;
summing rows within the modified data structure, the rows including values for data elements included in each of the identified set of textual data;
identifying one or more elements of interest within the set of textual data based on the summed rows of the modified data structure; and
causing presentation of the elements of interest in a first portion of a graphical user interface and the textual identifications for the set of textual data in a second portion of the graphical user interface.
11. The system of claim 10, wherein the set of textual corpora comprises a set of document corpora, each textual corpus from the set of document corpora comprises a document corpus, and the set of text sets comprises a set of documents.
12. The system of claim 10, wherein each textual corpus is associated with a separate data source.
13. The system of claim 10, wherein each textual corpus is associated with a distinct client device.
14. The system of claim 10, wherein the causing the presentation of the elements of interest and the textual identifications comprises:
causing presentation of at least a portion of a text set of the set of textual data in a third portion of the graphical user interface.
15. The system of claim 10, wherein the causing the presentation of the elements of interest and the textual identifications comprises:
identifying an element type for each element in the elements of interest; and
causing presentation of a visual indicator differentiating the elements of interest based on an element type.
16. The system of claim 10, wherein the operations further comprise:
determining a context of occurrence for each element of interest within the set of textual data;
in response to determining the context of occurrence for each element of interest, generating a set of tokens for each element of interest, the set of tokens representing the context of occurrence;
identifying an overlap of two or more of the elements of interest based on the set of tokens for the two or more elements of interest; and
linking two or more elements of interest.
17. The system of claim 10, wherein the operations further comprise:
receiving a selection of a given textual corpus from the set of textual corpora, the set of textual data identified from the selected given textual corpus.
18. A processor-readable storage device comprising instructions that, when executed by one or more processors of a machine, cause the machine to perform operations comprising:
accessing a set of textual corpora, each textual corpus of the set of textual corpora containing one or more text sets;
dynamically partitioning the set of textual corpora to identify a textual corpus from the set of textual corpora containing the set of text sets associated with one or more search terms;
identifying a set of textual data based on the one or more search terms;
retrieving a data structure including textual identifications for the set of textual data and an indication of one or more data elements within one or more text sets of the set of textual data;
processing the data structure to generate a modified data structure, the modified data structure generated by reducing to text sets included in the set of textual data identified based on the one or more search terms;
summing rows within the modified data structure, the rows including values for data elements included in each of the identified set of textual data;
identifying one or more elements of interest within the set of textual data based on the summed rows of the modified data structure; and
causing presentation of the elements of interest in a first portion of a graphical user interface and the textual identifications for the set of textual data in a second portion of the graphical user interface.
19. The processor-readable storage device of claim 18, wherein the set of textual corpora comprises a set of document corpora, each textual corpus from the set of document corpora comprises a document corpus, and the set of text sets comprises a set of documents.
20. The processor-readable storage device of claim 18, wherein each textual corpus is associated with a separate data source.
21. The processor-readable storage device of claim 18, wherein each textual corpus is associated with a distinct client device.
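
The operations recited in claims 2 and 8 can be illustrated with a minimal sketch. It is not the claimed implementation: the function names, the sample corpora, the substring-based term matching, the "most matching documents" partitioning rule, and the fixed-size context window are all assumptions made for illustration. The sketch also assumes that each row of the data structure corresponds to one data element and holds per-document occurrence counts, so that summing a row gives that element's total across the identified documents.

```python
import re

# Hypothetical sample data: corpus name -> {document ID: text}.
CORPORA = {
    "emails": {
        "e1": "Alice met Bob in Paris to discuss the merger",
        "e2": "Bob sent the merger terms to Acme",
        "e3": "Friday lunch menu attached",
    },
    "reports": {"r1": "Quarterly sales report for Acme"},
}

# Hypothetical precomputed data structure: element -> {document ID: count}.
ELEMENT_ROWS = {
    "alice": {"e1": 1},
    "bob": {"e1": 1, "e2": 1},
    "acme": {"e2": 1, "r1": 1},
    "paris": {"e1": 1},
}


def select_corpus(corpora, search_terms):
    """Pick the corpus with the most documents containing any search term."""
    terms = [t.lower() for t in search_terms]
    hits = {
        name: [doc_id for doc_id, text in docs.items()
               if any(t in text.lower() for t in terms)]
        for name, docs in corpora.items()
    }
    best = max(hits, key=lambda name: len(hits[name]))
    return best, hits[best]


def elements_of_interest(element_rows, doc_ids, top_n=3):
    """Reduce rows to the identified documents, sum each row, rank by total."""
    keep = set(doc_ids)
    reduced = {el: {d: n for d, n in row.items() if d in keep}
               for el, row in element_rows.items()}
    totals = {el: sum(row.values()) for el, row in reduced.items()}
    ranked = sorted((el for el in totals if totals[el] > 0),
                    key=lambda el: (-totals[el], el))
    return [(el, totals[el]) for el in ranked[:top_n]]


def context_tokens(docs, doc_ids, element, window=2):
    """Collect words within `window` positions of each occurrence of element."""
    tokens = set()
    for doc_id in doc_ids:
        words = re.findall(r"\w+", docs[doc_id].lower())
        for i, word in enumerate(words):
            if word == element.lower():
                tokens.update(words[max(0, i - window):i])
                tokens.update(words[i + 1:i + 1 + window])
    return tokens


def link_elements(docs, doc_ids, elements, window=2):
    """Link pairs of elements whose context token sets overlap."""
    ctx = {el: context_tokens(docs, doc_ids, el, window) for el in elements}
    return [(a, b)
            for i, a in enumerate(elements)
            for b in elements[i + 1:]
            if ctx[a] & ctx[b]]


corpus_name, matched_ids = select_corpus(CORPORA, ["merger"])
ranked = elements_of_interest(ELEMENT_ROWS, matched_ids)
links = link_elements(CORPORA[corpus_name], matched_ids,
                      [el for el, _ in ranked])
print(corpus_name, matched_ids, ranked, links)
```

In this sketch, the ranked elements would populate the first portion of the graphical user interface and the matched document IDs the second portion. A production system would likely replace the nested dictionaries with a sparse matrix and the substring test with an inverted index.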
US16/354,688 2016-11-21 2019-03-15 Analysis of large bodies of textual data Abandoned US20190243897A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US16/354,688 US20190243897A1 (en) 2016-11-21 2019-03-15 Analysis of large bodies of textual data

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US201662424844P 2016-11-21 2016-11-21
US15/678,874 US10318630B1 (en) 2016-11-21 2017-08-16 Analysis of large bodies of textual data
US16/354,688 US20190243897A1 (en) 2016-11-21 2019-03-15 Analysis of large bodies of textual data

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
US15/678,874 Continuation US10318630B1 (en) 2016-11-21 2017-08-16 Analysis of large bodies of textual data

Publications (1)

Publication Number Publication Date
US20190243897A1 (en) 2019-08-08

Family

ID=66767655

Family Applications (2)

Application Number Title Priority Date Filing Date
US15/678,874 Active US10318630B1 (en) 2016-11-21 2017-08-16 Analysis of large bodies of textual data
US16/354,688 Abandoned US20190243897A1 (en) 2016-11-21 2019-03-15 Analysis of large bodies of textual data

Family Applications Before (1)

Application Number Title Priority Date Filing Date
US15/678,874 Active US10318630B1 (en) 2016-11-21 2017-08-16 Analysis of large bodies of textual data

Country Status (1)

Country Link
US (2) US10318630B1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11113286B2 (en) * 2019-12-26 2021-09-07 Snowflake Inc. Generation of pruning index for pattern matching queries

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10318630B1 (en) * 2016-11-21 2019-06-11 Palantir Technologies Inc. Analysis of large bodies of textual data
US10788951B2 (en) * 2017-02-23 2020-09-29 Bank Of America Corporation Data processing system with machine learning engine to provide dynamic interface functions
US10872236B1 (en) 2018-09-28 2020-12-22 Amazon Technologies, Inc. Layout-agnostic clustering-based classification of document keys and values
US11257006B1 (en) 2018-11-20 2022-02-22 Amazon Technologies, Inc. Auto-annotation techniques for text localization
US10949661B2 (en) * 2018-11-21 2021-03-16 Amazon Technologies, Inc. Layout-agnostic complex document processing system
US20210089403A1 (en) * 2019-09-20 2021-03-25 Samsung Electronics Co., Ltd. Metadata table management scheme for database consistency
US20220318284A1 (en) * 2020-12-31 2022-10-06 Proofpoint, Inc. Systems and methods for query term analytics
CN116719465B (en) * 2023-08-08 2023-11-10 深圳市智慧城市科技发展集团有限公司 Opinion feedback method, terminal device and storage medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080072180A1 (en) * 2006-09-15 2008-03-20 Emc Corporation User readability improvement for dynamic updating of search results
US20110016118A1 (en) * 2009-07-20 2011-01-20 Lexisnexis Method and apparatus for determining relevant search results using a matrix framework
US10318630B1 (en) * 2016-11-21 2019-06-11 Palantir Technologies Inc. Analysis of large bodies of textual data

US7139800B2 (en) 2002-01-16 2006-11-21 Xerox Corporation User interface for a message-based system having embedded information management capabilities
US7640173B2 (en) 2002-01-17 2009-12-29 Applied Medical Software, Inc. Method and system for evaluating a physician's economic performance and gainsharing of physician services
US7546245B2 (en) 2002-01-17 2009-06-09 Amsapplied Medical Software, Inc. Method and system for gainsharing of physician services
US7305444B2 (en) 2002-01-23 2007-12-04 International Business Machines Corporation Method and system for controlling delivery of information in a forum
CA3077873A1 (en) 2002-03-20 2003-10-02 Catalina Marketing Corporation Targeted incentives based upon predicted behavior
US7533026B2 (en) 2002-04-12 2009-05-12 International Business Machines Corporation Facilitating management of service elements usable in providing information technology service offerings
US7162475B2 (en) 2002-04-17 2007-01-09 Ackerman David M Method for user verification and authentication and multimedia processing for interactive database management and method for viewing the multimedia
US20040126840A1 (en) 2002-12-23 2004-07-01 Affymetrix, Inc. Method, system and computer software for providing genomic ontological data
US7171427B2 (en) 2002-04-26 2007-01-30 Oracle International Corporation Methods of navigating a cube that is implemented as a relational object
US20040012633A1 (en) 2002-04-26 2004-01-22 Affymetrix, Inc., A Corporation Organized Under The Laws Of Delaware System, method, and computer program product for dynamic display, and analysis of biological sequence data
US7426559B2 (en) 2002-05-09 2008-09-16 International Business Machines Corporation Method for sequential coordination of external database application events with asynchronous internal database events
US7539680B2 (en) 2002-05-10 2009-05-26 Lsi Corporation Revision control for database of evolved design
US7703021B1 (en) 2002-05-24 2010-04-20 Sparta Systems, Inc. Defining user access in highly-configurable systems
JP2003345810A (en) 2002-05-28 2003-12-05 Hitachi Ltd Method and system for document retrieval and document retrieval result display system
US20030229848A1 (en) 2002-06-05 2003-12-11 Udo Arend Table filtering in a computer user interface
US7103854B2 (en) 2002-06-27 2006-09-05 Tele Atlas North America, Inc. System and method for associating text and graphical views of map information
US7461158B2 (en) 2002-08-07 2008-12-02 Intelliden, Inc. System and method for controlling access rights to network resources
US7076508B2 (en) 2002-08-12 2006-07-11 International Business Machines Corporation Method, system, and program for merging log entries from multiple recovery log files
CA2398103A1 (en) 2002-08-14 2004-02-14 March Networks Corporation Multi-dimensional table filtering system
US7127352B2 (en) 2002-09-30 2006-10-24 Lucent Technologies Inc. System and method for providing accurate local maps for a central service
WO2004036461A2 (en) 2002-10-14 2004-04-29 Battelle Memorial Institute Information reservoir
US20040078251A1 (en) 2002-10-16 2004-04-22 Demarcken Carl G. Dividing a travel query into sub-queries
US20040143602A1 (en) 2002-10-18 2004-07-22 Antonio Ruiz Apparatus, system and method for automated and adaptive digital image/video surveillance for events and configurations using a rich multimedia relational database
GB0224589D0 (en) 2002-10-22 2002-12-04 British Telecomm Method and system for processing or searching user records
US20040085318A1 (en) 2002-10-31 2004-05-06 Philipp Hassler Graphics generation and integration
US7870078B2 (en) 2002-11-01 2011-01-11 Id Insight Incorporated System, method and computer program product for assessing risk of identity theft
US20040111480A1 (en) 2002-12-09 2004-06-10 Yue Jonathan Zhanjun Message screening system and method
US8589273B2 (en) 2002-12-23 2013-11-19 Ge Corporate Financial Services, Inc. Methods and systems for managing risk management information
US20040148301A1 (en) 2003-01-24 2004-07-29 Mckay Christopher W.T. Compressed data structure for a database
US7752117B2 (en) 2003-01-31 2010-07-06 Trading Technologies International, Inc. System and method for money management in electronic trading environment
US7091827B2 (en) 2003-02-03 2006-08-15 Ingrid, Inc. Communications control in a security system
WO2013126281A1 (en) 2012-02-24 2013-08-29 Lexisnexis Risk Solutions Fl Inc. Systems and methods for putative cluster analysis
US20040153418A1 (en) 2003-02-05 2004-08-05 Hanweck Gerald Alfred System and method for providing access to data from proprietary tools
US7627552B2 (en) 2003-03-27 2009-12-01 Microsoft Corporation System and method for filtering and organizing items based on common elements
US7280038B2 (en) 2003-04-09 2007-10-09 John Robinson Emergency response data transmission system
KR100996029B1 (en) 2003-04-29 2010-11-22 삼성전자주식회사 Apparatus and method for coding of low density parity check code
US8386377B1 (en) 2003-05-12 2013-02-26 Id Analytics, Inc. System and method for credit scoring using an identity network connectivity
US20050027705A1 (en) 2003-05-20 2005-02-03 Pasha Sadri Mapping method and system
US9607092B2 (en) 2003-05-20 2017-03-28 Excalibur Ip, Llc Mapping method and system
US7620648B2 (en) 2003-06-20 2009-11-17 International Business Machines Corporation Universal annotation configuration and deployment
US20060136402A1 (en) 2004-12-22 2006-06-22 Tsu-Chang Lee Object-based information storage, search and mining system method
US20040267746A1 (en) 2003-06-26 2004-12-30 Cezary Marcjan User interface for controlling access to computer objects
US7392249B1 (en) 2003-07-01 2008-06-24 Microsoft Corporation Methods, systems, and computer-readable mediums for providing persisting and continuously updating search folders
US8412566B2 (en) 2003-07-08 2013-04-02 Yt Acquisition Corporation High-precision customer-based targeting by individual usage statistics
US7055110B2 (en) 2003-07-28 2006-05-30 Sig G Kupka Common on-screen zone for menu activation and stroke input
US7139772B2 (en) 2003-08-01 2006-11-21 Oracle International Corporation Ownership reassignment in a shared-nothing database system
US7363581B2 (en) 2003-08-12 2008-04-22 Accenture Global Services Gmbh Presentation generator
US7373669B2 (en) 2003-08-13 2008-05-13 The 41St Parameter, Inc. Method and system for determining presence of probable error or fraud in a data set by linking common data values or elements
WO2005036319A2 (en) 2003-09-22 2005-04-21 Catalina Marketing International, Inc. Assumed demographics, predicted behaviour, and targeted incentives
US7516086B2 (en) 2003-09-24 2009-04-07 Idearc Media Corp. Business rating placement heuristic
US7454045B2 (en) 2003-10-10 2008-11-18 The United States Of America As Represented By The Department Of Health And Human Services Determination of feature boundaries in a digital representation of an anatomical structure
US7334195B2 (en) 2003-10-14 2008-02-19 Microsoft Corporation System and process for presenting search results in a histogram/cluster format
US7584172B2 (en) 2003-10-16 2009-09-01 Sap Ag Control for selecting data query and visual configuration
US8627489B2 (en) 2003-10-31 2014-01-07 Adobe Systems Incorporated Distributed document version control
US20050108063A1 (en) 2003-11-05 2005-05-19 Madill Robert P.Jr. Systems and methods for assessing the potential for fraud in business transactions
US7324995B2 (en) 2003-11-17 2008-01-29 Rackable Systems Inc. Method for retrieving and modifying data elements on a shared medium
US20050125715A1 (en) 2003-12-04 2005-06-09 Fabrizio Di Franco Method of saving data in a graphical user interface
US7818658B2 (en) 2003-12-09 2010-10-19 Yi-Chih Chen Multimedia presentation system
US7917376B2 (en) 2003-12-29 2011-03-29 Montefiore Medical Center System and method for monitoring patient care
US20050154769A1 (en) 2004-01-13 2005-07-14 Llumen, Inc. Systems and methods for benchmarking business performance data against aggregated business performance data
US20050154628A1 (en) 2004-01-13 2005-07-14 Illumen, Inc. Automated management of business performance information
US20050166144A1 (en) 2004-01-22 2005-07-28 Mathcom Inventions Ltd. Method and system for assigning a background to a document and document having a background made according to the method and system
US7872669B2 (en) 2004-01-22 2011-01-18 Massachusetts Institute Of Technology Photo-based mobile deixis system and related techniques
US7343552B2 (en) 2004-02-12 2008-03-11 Fuji Xerox Co., Ltd. Systems and methods for freeform annotations
US20050180330A1 (en) 2004-02-17 2005-08-18 Touchgraph Llc Method of animating transitions and stabilizing node motion during dynamic graph navigation
US20050182793A1 (en) 2004-02-18 2005-08-18 Keenan Viktor M. Map structure and method for producing
US7596285B2 (en) 2004-02-26 2009-09-29 International Business Machines Corporation Providing a portion of an electronic mail message at a reduced resolution
JP4226491B2 (en) 2004-02-26 2009-02-18 株式会社ザナヴィ・インフォマティクス Search data update system and navigation device
US20050210409A1 (en) 2004-03-19 2005-09-22 Kenny Jou Systems and methods for class designation in a computerized social network application
CA2820249C (en) 2004-03-23 2016-07-19 Google Inc. A digital mapping system
US7599790B2 (en) 2004-03-23 2009-10-06 Google Inc. Generating and serving tiles in a digital mapping system
US7865301B2 (en) 2004-03-23 2011-01-04 Google Inc. Secondary map in digital mapping system
US20060026120A1 (en) 2004-03-24 2006-02-02 Update Publications Lp Method and system for collecting, processing, and distributing residential property data
US7269801B2 (en) 2004-03-30 2007-09-11 Autodesk, Inc. System for managing the navigational usability of an interactive map
US9106694B2 (en) 2004-04-01 2015-08-11 Fireeye, Inc. Electronic message analysis for malware detection
US20050222928A1 (en) 2004-04-06 2005-10-06 Pricewaterhousecoopers Llp Systems and methods for investigation of financial reporting information
JP2007535764A (en) 2004-04-26 2007-12-06 ライト90,インコーポレイテッド Real-time data prediction
US20050246327A1 (en) 2004-04-30 2005-11-03 Yeung Simon D User interfaces and methods of using the same
EP2487601A1 (en) 2004-05-04 2012-08-15 Boston Consulting Group, Inc. Method and apparatus for selecting, analyzing and visualizing related database records as a network
US20050251786A1 (en) 2004-05-07 2005-11-10 International Business Machines Corporation System and method for dynamic software installation instructions
EP1761863A4 (en) 2004-05-25 2009-11-18 Postini Inc Electronic message source information reputation system
US8885894B2 (en) 2004-06-14 2014-11-11 Michael John Rowen Reduction of transaction fraud through the use of automatic centralized signature/sign verification combined with credit and fraud scoring during real-time payment card authorization processes
GB2415317B (en) 2004-06-15 2007-08-15 Orange Personal Comm Serv Ltd Provision of group services in a telecommunications network
EP1782371A4 (en) 2004-06-22 2009-12-02 Coras Inc Systems and methods for software based on business concepts
FR2872653B1 (en) 2004-06-30 2006-12-29 Skyrecon Systems Sa SYSTEM AND METHODS FOR SECURING COMPUTER STATIONS AND / OR COMMUNICATIONS NETWORKS
US8289390B2 (en) 2004-07-28 2012-10-16 Sri International Method and apparatus for total situational awareness and monitoring
WO2006018843A2 (en) 2004-08-16 2006-02-23 Beinsync Ltd. A system and method for the synchronization of data across multiple computing devices
US7290698B2 (en) 2004-08-25 2007-11-06 Sony Corporation Progress bar with multiple portions
US7617232B2 (en) 2004-09-02 2009-11-10 Microsoft Corporation Centralized terminology and glossary development
US7933862B2 (en) 2004-09-27 2011-04-26 Microsoft Corporation One click conditional formatting method and system for software programs
US7712049B2 (en) 2004-09-30 2010-05-04 Microsoft Corporation Two-dimensional radial user interface for computer software applications
US7788589B2 (en) 2004-09-30 2010-08-31 Microsoft Corporation Method and system for improved electronic task flagging and management
US20060074881A1 (en) 2004-10-02 2006-04-06 Adventnet, Inc. Structure independent searching in disparate databases
US7284198B2 (en) 2004-10-07 2007-10-16 International Business Machines Corporation Method and system for document draft reminder based on inactivity
US20060080316A1 (en) 2004-10-08 2006-04-13 Meridio Ltd Multiple indexing of an electronic document to selectively permit access to the content and metadata thereof
US20060080616A1 (en) 2004-10-13 2006-04-13 Xerox Corporation Systems, methods and user interfaces for document workflow construction
GB0422750D0 (en) 2004-10-13 2004-11-17 Ciphergrid Ltd Remote database technique
US7574409B2 (en) 2004-11-04 2009-08-11 Vericept Corporation Method, apparatus, and system for clustering and classification
US7797197B2 (en) 2004-11-12 2010-09-14 Amazon Technologies, Inc. Method and system for analyzing the performance of affiliate sites
US7529734B2 (en) 2004-11-12 2009-05-05 Oracle International Corporation Method and apparatus for facilitating a database query using a query criteria template
US7899796B1 (en) 2004-11-23 2011-03-01 Andrew Borthwick Batch automated blocking and record matching
US7620628B2 (en) 2004-12-06 2009-11-17 Yahoo! Inc. Search processing with automatic categorization of queries
US20060129746A1 (en) 2004-12-14 2006-06-15 Ithink, Inc. Method and graphic interface for storing, moving, sending or printing electronic data to two or more locations, in two or more formats with a single save function
US7849395B2 (en) 2004-12-15 2010-12-07 Microsoft Corporation Filter and sort by color
US7451397B2 (en) 2004-12-15 2008-11-11 Microsoft Corporation System and method for automatically completing spreadsheet formulas
US8700414B2 (en) 2004-12-29 2014-04-15 Sap Ag System supported optimization of event resolution
US20060143079A1 (en) 2004-12-29 2006-06-29 Jayanta Basak Cross-channel customer matching
US7660823B2 (en) 2004-12-30 2010-02-09 Sas Institute Inc. Computer-implemented system and method for visualizing OLAP and multidimensional data in a calendar format
US7418461B2 (en) 2005-01-14 2008-08-26 Microsoft Corporation Schema conformance for database servers
US9436945B2 (en) 2005-02-01 2016-09-06 Redfin Corporation Interactive map-based search and advertising
US8200700B2 (en) 2005-02-01 2012-06-12 Newsilike Media Group, Inc Systems and methods for use of structured and unstructured distributed data
US8271436B2 (en) 2005-02-07 2012-09-18 Mimosa Systems, Inc. Retro-fitting synthetic full copies of data
US7614006B2 (en) 2005-02-11 2009-11-03 International Business Machines Corporation Methods and apparatus for implementing inline controls for transposing rows and columns of computer-based tables
US7979457B1 (en) 2005-03-02 2011-07-12 Kayak Software Corporation Efficient search of supplier servers based on stored search results
US8646080B2 (en) 2005-09-16 2014-02-04 Avg Technologies Cy Limited Method and apparatus for removing harmful software
US20060242630A1 (en) 2005-03-09 2006-10-26 Maxis Co., Ltd. Process for preparing design procedure document and apparatus for the same
US8091784B1 (en) 2005-03-09 2012-01-10 Diebold, Incorporated Banking system controlled responsive to data bearing records
US7483028B2 (en) 2005-03-15 2009-01-27 Microsoft Corporation Providing 1D and 2D connectors in a connected diagram
US7725728B2 (en) 2005-03-23 2010-05-25 Business Objects Data Integration, Inc. Apparatus and method for dynamically auditing data migration to produce metadata
US7676845B2 (en) 2005-03-24 2010-03-09 Microsoft Corporation System and method of selectively scanning a file on a computing device for malware
US20060218491A1 (en) 2005-03-25 2006-09-28 International Business Machines Corporation System, method and program product for community review of documents
US7596528B1 (en) 2005-03-31 2009-09-29 Trading Technologies International, Inc. System and method for dynamically regulating order entry in an electronic trading environment
US7525422B2 (en) 2005-04-14 2009-04-28 Verizon Business Global Llc Method and system for providing alarm reporting in a managed network services environment
US7426654B2 (en) 2005-04-14 2008-09-16 Verizon Business Global Llc Method and system for providing customer controlled notifications in a managed network services system
US20060242040A1 (en) 2005-04-20 2006-10-26 Aim Holdings Llc Method and system for conducting sentiment analysis for securities research
US8639757B1 (en) 2011-08-12 2014-01-28 Sprint Communications Company L.P. User localization using friend location information
US8082172B2 (en) 2005-04-26 2011-12-20 The Advisory Board Company System and method for peer-profiling individual performance
US8145686B2 (en) 2005-05-06 2012-03-27 Microsoft Corporation Maintenance of link level consistency between database and file system
US7958120B2 (en) 2005-05-10 2011-06-07 Netseer, Inc. Method and apparatus for distributed community finding
US7672968B2 (en) 2005-05-12 2010-03-02 Apple Inc. Displaying a tooltip associated with a concurrently displayed database object
US8024778B2 (en) 2005-05-24 2011-09-20 CRIF Corporation System and method for defining attributes, decision rules, or both, for remote execution, claim set I
US8825370B2 (en) 2005-05-27 2014-09-02 Yahoo! Inc. Interactive map-based travel guide
US8161122B2 (en) 2005-06-03 2012-04-17 Messagemind, Inc. System and method of dynamically prioritized electronic mail graphical user interface, and measuring email productivity and collaboration trends
US8341259B2 (en) 2005-06-06 2012-12-25 Adobe Systems Incorporated ASP for web analytics including a real-time segmentation workbench
EP1732034A1 (en) 2005-06-06 2006-12-13 First Data Corporation System and method for authorizing electronic payment transactions
US8200676B2 (en) 2005-06-28 2012-06-12 Nokia Corporation User interface for geographic search
US8560413B1 (en) 2005-07-14 2013-10-15 John S. Quarterman Method and system for detecting distributed internet crime
US20070016363A1 (en) 2005-07-15 2007-01-18 Oracle International Corporation Interactive map-based user interface for transportation planning
US7991764B2 (en) 2005-07-22 2011-08-02 Yogesh Chunilal Rathod Method and system for communication, publishing, searching, sharing and dynamically providing a journal feed
US7421429B2 (en) 2005-08-04 2008-09-02 Microsoft Corporation Generate blog context ranking using track-back weight, context weight and, cumulative comment weight
JP3989527B2 (en) 2005-08-04 2007-10-10 松下電器産業株式会社 Search article estimation apparatus and method, and search article estimation apparatus server
US20070130206A1 (en) 2005-08-05 2007-06-07 Siemens Corporate Research Inc System and Method For Integrating Heterogeneous Biomedical Information
EP1917544A2 (en) 2005-08-23 2008-05-07 R.A. Smith & Associates, Inc. High accuracy survey-grade gis system
US20070050429A1 (en) 2005-08-26 2007-03-01 Centric Software, Inc. Time-range locking for temporal database and branched-and-temporal databases
US7917841B2 (en) 2005-08-29 2011-03-29 Edgar Online, Inc. System and method for rendering data
US8095866B2 (en) 2005-09-09 2012-01-10 Microsoft Corporation Filtering user interface for a data summary table
JP2007079641A (en) 2005-09-09 2007-03-29 Canon Inc Information processor and processing method, program, and storage medium
US7716226B2 (en) 2005-09-27 2010-05-11 Patentratings, Llc Method and system for probabilistically quantifying and visualizing relevance between two or more citationally or contextually related data objects
US20070078832A1 (en) 2005-09-30 2007-04-05 Yahoo! Inc. Method and system for using smart tags and a recommendation engine using smart tags
US8306986B2 (en) 2005-09-30 2012-11-06 American Express Travel Related Services Company, Inc. Method, system, and computer program product for linking customer information
US7870493B2 (en) 2005-10-03 2011-01-11 Microsoft Corporation Distributed clipboard
US7668769B2 (en) 2005-10-04 2010-02-23 Basepoint Analytics, LLC System and method of detecting fraud
US7574428B2 (en) 2005-10-11 2009-08-11 Telmap Ltd Geometry-based search engine for navigation systems
US8032885B2 (en) 2005-10-11 2011-10-04 Oracle International Corporation Method and medium for combining operation commands into database submission groups
US7487139B2 (en) 2005-10-12 2009-02-03 International Business Machines Corporation Method and system for filtering a table
US7933897B2 (en) 2005-10-12 2011-04-26 Google Inc. Entity display priority in a distributed geographic information system
US20070094389A1 (en) 2005-10-23 2007-04-26 Bill Nussey Provision of rss feeds based on classification of content
US7627812B2 (en) 2005-10-27 2009-12-01 Microsoft Corporation Variable formatting of cells
US20090168163A1 (en) 2005-11-01 2009-07-02 Global Bionic Optics Pty Ltd. Optical lens systems
US20100198858A1 (en) 2005-11-21 2010-08-05 Anti-Gang Enforcement Networking Technology, Inc. System and Methods for Linking Multiple Events Involving Firearms and Gang Related Activities
US7814102B2 (en) 2005-12-07 2010-10-12 Lexisnexis, A Division Of Reed Elsevier Inc. Method and system for linking documents with multiple topics to related documents
US7730109B2 (en) 2005-12-12 2010-06-01 Google, Inc. Message catalogs for remote modules
US7725530B2 (en) 2005-12-12 2010-05-25 Google Inc. Proxy server collection of data for module incorporation into a container document
US8185819B2 (en) 2005-12-12 2012-05-22 Google Inc. Module specification for a module to be incorporated into a container document
US7730082B2 (en) 2005-12-12 2010-06-01 Google Inc. Remote module incorporation into a container document
US9141913B2 (en) 2005-12-16 2015-09-22 Nextbio Categorization and filtering of scientific data
US7606844B2 (en) 2005-12-19 2009-10-20 Commvault Systems, Inc. System and method for performing replication copy storage operations
US7792809B2 (en) 2005-12-19 2010-09-07 Teradata US, Inc. Database system
US20090082997A1 (en) 2005-12-21 2009-03-26 Tokman Michael G Method of identifying clusters and connectivity between clusters
US8726144B2 (en) 2005-12-23 2014-05-13 Xerox Corporation Interactive learning-based document annotation
US20070150369A1 (en) 2005-12-28 2007-06-28 Zivin Michael A Method and system for determining the optimal travel route by which customers can purchase local goods at the lowest total cost
US7788296B2 (en) 2005-12-29 2010-08-31 Guidewire Software, Inc. Method and apparatus for managing a computer-based address book for incident-related work
US8712828B2 (en) 2005-12-30 2014-04-29 Accenture Global Services Limited Churn prediction and management system
CN100481077C (en) 2006-01-12 2009-04-22 国际商业机器公司 Visual method and device for strengthening search result guide
US7634717B2 (en) 2006-01-23 2009-12-15 Microsoft Corporation Multiple conditional formatting
US7818291B2 (en) 2006-02-03 2010-10-19 The General Electric Company Data object access system and method using dedicated task object
US20070185867A1 (en) 2006-02-03 2007-08-09 Matteo Maga Statistical modeling methods for determining customer distribution by churn probability within a customer population
US7770100B2 (en) 2006-02-27 2010-08-03 Microsoft Corporation Dynamic thresholds for conditional formats
US8698809B2 (en) 2006-03-03 2014-04-15 Donya Labs Ab Creation and rendering of hierarchical digital multimedia data
US7579965B2 (en) 2006-03-03 2009-08-25 Andrew Bucholz Vehicle data collection and processing system
US7899611B2 (en) 2006-03-03 2011-03-01 Inrix, Inc. Detecting anomalous road traffic conditions
US20070208498A1 (en) 2006-03-03 2007-09-06 Inrix, Inc. Displaying road traffic condition information and user controls
US20080052142A1 (en) 2006-03-13 2008-02-28 Bailey Maurice G T System and method for real-time display of emergencies, resources and personnel
US7912773B1 (en) 2006-03-24 2011-03-22 Sas Institute Inc. Computer-implemented data storage systems and methods for use with predictive model systems
US7512578B2 (en) 2006-03-30 2009-03-31 Emc Corporation Smart containers
DE602006002873D1 (en) 2006-03-31 2008-11-06 Research In Motion Ltd A user interface method and apparatus for controlling the visual display of maps with selectable map elements in mobile communication devices
US7743056B2 (en) 2006-03-31 2010-06-22 Aol Inc. Identifying a result responsive to a current location of a client device
US8176179B2 (en) 2006-04-03 2012-05-08 Secure64 Software Corporation Method and system for data-structure management
US20070240062A1 (en) 2006-04-07 2007-10-11 Christena Jennifer Y Method and System for Restricting User Operations in a Graphical User Interface Window
US20080040275A1 (en) 2006-04-25 2008-02-14 Uc Group Limited Systems and methods for identifying potentially fraudulent financial transactions and compulsive spending behavior
US8739278B2 (en) 2006-04-28 2014-05-27 Oracle International Corporation Techniques for fraud monitoring and detection using application fingerprinting
US7752123B2 (en) 2006-04-28 2010-07-06 Townsend Analytics Ltd. Order management system and method for electronic securities trading
US20070260597A1 (en) 2006-05-02 2007-11-08 Mark Cramer Dynamic search engine results employing user behavior
US7756843B1 (en) 2006-05-25 2010-07-13 Juniper Networks, Inc. Identifying and processing confidential information on network endpoints
US9195985B2 (en) 2006-06-08 2015-11-24 Iii Holdings 1, Llc Method, system, and computer program product for customer-level data verification
US7657626B1 (en) 2006-09-19 2010-02-02 Enquisite, Inc. Click fraud detection
US7468662B2 (en) 2006-06-16 2008-12-23 International Business Machines Corporation Method for spatio-temporal event detection using composite definitions for camera systems
US8290943B2 (en) 2006-07-14 2012-10-16 Raytheon Company Geographical information display system and method
WO2008011728A1 (en) 2006-07-28 2008-01-31 Pattern Intelligence Inc. System and method for detecting and analyzing pattern relationships
WO2008022051A2 (en) 2006-08-10 2008-02-21 Loma Linda University Medical Center Advanced emergency geographical information system
US20130150004A1 (en) 2006-08-11 2013-06-13 Michael Rosen Method and apparatus for reducing mobile phone usage while driving
US20080040684A1 (en) 2006-08-14 2008-02-14 Richard Crump Intelligent Pop-Up Window Method and Apparatus
US20080077597A1 (en) 2006-08-24 2008-03-27 Lance Butler Systems and methods for photograph mapping
US20080051989A1 (en) 2006-08-25 2008-02-28 Microsoft Corporation Filtering of data layered on mapping applications
US8230332B2 (en) 2006-08-30 2012-07-24 Compsci Resources, Llc Interactive user interface for converting unstructured documents
JP4778865B2 (en) 2006-08-30 2011-09-21 株式会社ソニー・コンピュータエンタテインメント Image viewer, image display method and program
US7725547B2 (en) 2006-09-06 2010-05-25 International Business Machines Corporation Informing a user of gestures made by others out of the user's line of sight
US7899822B2 (en) 2006-09-08 2011-03-01 International Business Machines Corporation Automatically linking documents with relevant structured information
US8271429B2 (en) 2006-09-11 2012-09-18 Wiredset Llc System and method for collecting and processing data
CN101145152B (en) 2006-09-14 2010-08-11 国际商业机器公司 System and method for automatically refining reality in specific context
US8054756B2 (en) 2006-09-18 2011-11-08 Yahoo! Inc. Path discovery and analytics for network data
US7945582B2 (en) 2006-09-23 2011-05-17 Gis Planning, Inc. Web-based interactive geographic information systems mapping analysis and methods of using thereof
US20080082486A1 (en) 2006-09-29 2008-04-03 Yahoo! Inc. Platform for user discovery experience
WO2008043082A2 (en) 2006-10-05 2008-04-10 Splunk Inc. Time series search engine
US7761407B1 (en) 2006-10-10 2010-07-20 Medallia, Inc. Use of primary and secondary indexes to facilitate aggregation of records of an OLAP data cube
US7840525B2 (en) 2006-10-11 2010-11-23 Ruby Jonathan P Extended transactions
US7698336B2 (en) 2006-10-26 2010-04-13 Microsoft Corporation Associating geographic-related information with objects
US7912875B2 (en) 2006-10-31 2011-03-22 Business Objects Software Ltd. Apparatus and method for filtering data using nested panels
US20080148398A1 (en) 2006-10-31 2008-06-19 Derek John Mezack System and Method for Definition and Automated Analysis of Computer Security Threat Models
US8229902B2 (en) 2006-11-01 2012-07-24 Ab Initio Technology Llc Managing storage of individually accessible data units
US7792868B2 (en) 2006-11-10 2010-09-07 Microsoft Corporation Data object linking and browsing tool
US20140006109A1 (en) 2006-11-13 2014-01-02 Vendavo, Inc. System and Methods for Generating Price Sensitivity
US7962495B2 (en) 2006-11-20 2011-06-14 Palantir Technologies, Inc. Creating data in a data store using a dynamic ontology
US7599945B2 (en) 2006-11-30 2009-10-06 Yahoo! Inc. Dynamic cluster visualization
JP4923990B2 (en) 2006-12-04 2012-04-25 株式会社日立製作所 Failover method and its computer system.
US8069202B1 (en) 2007-02-02 2011-11-29 Resource Consortium Limited Creating a projection of a situational network
US8126848B2 (en) 2006-12-07 2012-02-28 Robert Edward Wagner Automated method for identifying and repairing logical data discrepancies between database replicas in a database cluster
US7680939B2 (en) 2006-12-20 2010-03-16 Yahoo! Inc. Graphical user interface to manipulate syndication data feeds
US7809703B2 (en) 2006-12-22 2010-10-05 International Business Machines Corporation Usage of development context in search operations
US8290838B1 (en) 2006-12-29 2012-10-16 Amazon Technologies, Inc. Indicating irregularities in online financial transactions
US20080162616A1 (en) 2006-12-29 2008-07-03 Sap Ag Skip relation pattern for graph structures
US8368695B2 (en) 2007-02-08 2013-02-05 Microsoft Corporation Transforming offline maps into interactive online maps
US8196184B2 (en) 2007-02-16 2012-06-05 Microsoft Corporation Efficient data structures for multi-dimensional security
US8930331B2 (en) 2007-02-21 2015-01-06 Palantir Technologies Providing unique views of data based on changes or rules
US20080208735A1 (en) 2007-02-22 2008-08-28 American Express Travel Related Services Company, Inc., A New York Corporation Method, System, and Computer Program Product for Managing Business Customer Contacts
US7920963B2 (en) 2007-02-22 2011-04-05 Iac Search & Media, Inc. Map interface with a movable marker
US7873557B2 (en) 2007-02-28 2011-01-18 Aaron Guidotti Information, document, and compliance management for financial professionals, clients, and supervisors
US8352881B2 (en) 2007-03-08 2013-01-08 International Business Machines Corporation Method, apparatus and program storage device for providing customizable, immediate and radiating menus for accessing applications and actions
US8959568B2 (en) 2007-03-14 2015-02-17 Microsoft Corporation Enterprise security assessment sharing
WO2008115519A1 (en) 2007-03-20 2008-09-25 President And Fellows Of Harvard College A system for estimating a distribution of message content categories in source data
US7814084B2 (en) 2007-03-21 2010-10-12 Schmap Inc. Contact information capture and link redirection
US8036971B2 (en) 2007-03-30 2011-10-11 Palantir Technologies, Inc. Generating dynamic date sets that represent market conditions
US20090018940A1 (en) 2007-03-30 2009-01-15 Liang Wang Enhanced Fraud Detection With Terminal Transaction-Sequence Processing
JP5268274B2 (en) 2007-03-30 2013-08-21 キヤノン株式会社 Search device, method, and program
US20080249937A1 (en) 2007-04-06 2008-10-09 Walls Robert K Payment card based remittance system with delivery of anti-money laundering information to receiving financial institution
US8229458B2 (en) 2007-04-08 2012-07-24 Enhanced Geographic Llc Systems and methods to determine the name of a location visited by a user of a wireless device
US20080255973A1 (en) 2007-04-10 2008-10-16 Robert El Wade Sales transaction analysis tool and associated method of use
US20090164387A1 (en) 2007-04-17 2009-06-25 Semandex Networks Inc. Systems and methods for providing semantically enhanced financial information
AU2008242910A1 (en) 2007-04-17 2008-10-30 Emd Millipore Corporation Graphical user interface for analysis and comparison of location-specific multiparameter data sets
US8312546B2 (en) 2007-04-23 2012-11-13 Mcafee, Inc. Systems, apparatus, and methods for detecting malware
US20080267107A1 (en) 2007-04-27 2008-10-30 Outland Research, Llc Attraction wait-time inquiry apparatus, system and method
DE102008010419A1 (en) 2007-05-03 2008-11-13 Navigon Ag Apparatus and method for creating a text object
US8090603B2 (en) 2007-05-11 2012-01-03 Fansnap, Inc. System and method for selecting event tickets
US10769290B2 (en) 2007-05-11 2020-09-08 Fair Isaac Corporation Systems and methods for fraud detection via interactive link analysis
US20080294663A1 (en) 2007-05-14 2008-11-27 Heinley Brandon J Creation and management of visual timelines
US20080288425A1 (en) 2007-05-17 2008-11-20 Christian Posse Methods and Apparatus for Reasoning About Information Fusion Approaches
WO2009038822A2 (en) 2007-05-25 2009-03-26 The Research Foundation Of State University Of New York Spectral clustering for multi-type relational data
US8515207B2 (en) 2007-05-25 2013-08-20 Google Inc. Annotations in panoramic images, and applications thereof
US7809785B2 (en) 2007-05-28 2010-10-05 Google Inc. System using router in a web browser for inter-domain communication
US8739123B2 (en) 2007-05-28 2014-05-27 Google Inc. Incorporating gadget functionality on webpages
US7644238B2 (en) 2007-06-01 2010-01-05 Microsoft Corporation Timestamp based transactional memory
US9009829B2 (en) 2007-06-12 2015-04-14 The Trustees Of Columbia University In The City Of New York Methods, systems, and media for baiting inside attackers
US20120084866A1 (en) 2007-06-12 2012-04-05 Stolfo Salvatore J Methods, systems, and media for measuring computer security
US7930547B2 (en) 2007-06-15 2011-04-19 Alcatel-Lucent Usa Inc. High accuracy bloom filter using partitioned hashing
US7783658B1 (en) 2007-06-18 2010-08-24 Seisint, Inc. Multi-entity ontology weighting systems and methods
WO2009006448A1 (en) 2007-06-28 2009-01-08 Cashedge, Inc. Global risk administration method and system
US8037046B2 (en) 2007-06-29 2011-10-11 Microsoft Corporation Collecting and presenting temporal-based action information
WO2009009623A1 (en) 2007-07-09 2009-01-15 Tailwalker Technologies, Inc. Integrating a methodology management system with project tasks in a project management system
US8271477B2 (en) 2007-07-20 2012-09-18 Informatica Corporation Methods and systems for accessing data
US20090027418A1 (en) 2007-07-24 2009-01-29 Maru Nimit H Map-based interfaces for storing and locating information about geographical areas
US8234298B2 (en) 2007-07-25 2012-07-31 International Business Machines Corporation System and method for determining driving factor in a data cube
US9489216B2 (en) 2007-07-26 2016-11-08 Sap Se Active tiled user interface
US8600872B1 (en) 2007-07-27 2013-12-03 Wells Fargo Bank, N.A. System and method for detecting account compromises
US7644106B2 (en) 2007-07-30 2010-01-05 Oracle International Corporation Avoiding lock contention by using a wait for completion mechanism
US10762080B2 (en) 2007-08-14 2020-09-01 John Nicholas and Kristin Gross Trust Temporal document sorter and method
US20090055251A1 (en) 2007-08-20 2009-02-26 Weblistic, Inc., A California Corporation Directed online advertising system and method
US7983902B2 (en) 2007-08-23 2011-07-19 Google Inc. Domain dictionary creation by detection of new topic words using divergence value comparison
US8631015B2 (en) 2007-09-06 2014-01-14 Linkedin Corporation Detecting associates
US9060012B2 (en) 2007-09-26 2015-06-16 The 41St Parameter, Inc. Methods and apparatus for detecting fraud with time based computer tags
US20090088964A1 (en) 2007-09-28 2009-04-02 Dave Schaaf Map scrolling method and apparatus for navigation system for selectively displaying icons
US8849728B2 (en) 2007-10-01 2014-09-30 Purdue Research Foundation Visual analytics law enforcement tools
US8484115B2 (en) 2007-10-03 2013-07-09 Palantir Technologies, Inc. Object-oriented time series generator
US20090094270A1 (en) 2007-10-08 2009-04-09 Alirez Baldomero J Method of building a validation database
US20090106308A1 (en) 2007-10-18 2009-04-23 Christopher Killian Complexity estimation of data objects
US8214308B2 (en) 2007-10-23 2012-07-03 Sas Institute Inc. Computer-implemented systems and methods for updating predictive models
US8397168B2 (en) 2008-04-05 2013-03-12 Social Communications Company Interfacing with a spatial virtual communication environment
US20090125369A1 (en) 2007-10-26 2009-05-14 Crowe Horwath Llp System and method for analyzing and dispositioning money laundering suspicious activity alerts
US7650310B2 (en) 2007-10-30 2010-01-19 Intuit Inc. Technique for reducing phishing
US8510743B2 (en) 2007-10-31 2013-08-13 Google Inc. Terminating computer applications
US8200618B2 (en) 2007-11-02 2012-06-12 International Business Machines Corporation System and method for analyzing data in a report
US20090126020A1 (en) 2007-11-09 2009-05-14 Norton Richard Elliott Engine for rule based content filtering
WO2009061501A1 (en) 2007-11-09 2009-05-14 Telecommunication Systems, Inc. Points-of-interest panning on a displayed map with a persistent search on a wireless phone
US8688607B2 (en) 2007-11-12 2014-04-01 Debra Pacha System and method for detecting healthcare insurance fraud
US9898767B2 (en) 2007-11-14 2018-02-20 Panjiva, Inc. Transaction facilitating marketplace platform
WO2011085360A1 (en) 2010-01-11 2011-07-14 Panjiva, Inc. Evaluating public records of supply transactions for financial investment decisions
US8626618B2 (en) 2007-11-14 2014-01-07 Panjiva, Inc. Using non-public shipper records to facilitate rating an entity based on public records of supply transactions
KR20090050577A (en) 2007-11-16 2009-05-20 삼성전자주식회사 User interface for displaying and playing multimedia contents and apparatus comprising the same and control method thereof
US20090132953A1 (en) 2007-11-16 2009-05-21 Iac Search & Media, Inc. User interface and method in local search system with vertical search results and an interactive map
US8145703B2 (en) 2007-11-16 2012-03-27 Iac Search & Media, Inc. User interface and method in a local search system with related search results
WO2009073637A2 (en) 2007-11-29 2009-06-11 Iqzone Systems and methods for personal information management and contact picture synchronization and distribution
US20090144262A1 (en) 2007-12-04 2009-06-04 Microsoft Corporation Search query transformation using direct manipulation
US8869098B2 (en) 2007-12-05 2014-10-21 International Business Machines Corporation Computer method and apparatus for providing model to model transformation using an MDA approach
US8270577B2 (en) 2007-12-13 2012-09-18 Verizon Patent And Licensing Inc. Multiple visual voicemail mailboxes
US8001482B2 (en) 2007-12-21 2011-08-16 International Business Machines Corporation Method of displaying tab titles
US8230333B2 (en) 2007-12-26 2012-07-24 Vistracks, Inc. Analysis of time-based geospatial mashups using AD HOC visual queries
US7865308B2 (en) 2007-12-28 2011-01-04 Yahoo! Inc. User-generated activity maps
US20090172669A1 (en) 2007-12-28 2009-07-02 International Business Machines Corporation Use of redundancy groups in runtime computer management of business applications
US8010886B2 (en) 2008-01-04 2011-08-30 Microsoft Corporation Intelligently representing files in a view
US8055633B2 (en) 2008-01-21 2011-11-08 International Business Machines Corporation Method, system and computer program product for duplicate detection
KR100915295B1 (en) 2008-01-22 2009-09-03 성균관대학교산학협력단 System and method for search service having a function of automatic classification of search results
US8239245B2 (en) 2008-01-22 2012-08-07 International Business Machines Corporation Method and apparatus for end-to-end retail store site optimization
US7805457B1 (en) 2008-02-14 2010-09-28 Securus Technologies, Inc. System and method for identifying members of a gang or security threat group
WO2009115921A2 (en) 2008-02-22 2009-09-24 Ipath Technologies Private Limited Techniques for enterprise resource mobilization
US8606807B2 (en) 2008-02-28 2013-12-10 Red Hat, Inc. Integration of triple tags into a tagging tool and text browsing
US20090222760A1 (en) 2008-02-29 2009-09-03 Halverson Steven G Method, System and Computer Program Product for Automating the Selection and Ordering of Column Data in a Table for a User
US20090234720A1 (en) 2008-03-15 2009-09-17 Gridbyte Method and System for Tracking and Coaching Service Professionals
US8229945B2 (en) 2008-03-20 2012-07-24 Schooner Information Technology, Inc. Scalable database management software on a cluster of nodes using a shared-distributed flash memory
US9830366B2 (en) 2008-03-22 2017-11-28 Thomson Reuters Global Resources Online analytic processing cube with time stamping
CN102084386A (en) 2008-03-24 2011-06-01 姜旻秀 Keyword-advertisement method using meta-information related to digital contents and system thereof
KR101806432B1 (en) 2008-03-26 2017-12-07 테라노스, 인코포레이티드 Methods and systems for assessing clinical outcomes
US20090254970A1 (en) 2008-04-04 2009-10-08 Avaya Inc. Multi-tier security event correlation and mitigation
WO2009132106A2 (en) 2008-04-22 2009-10-29 Oxford J Craig System and method for interactive map, database, and social networking engine
US8266168B2 (en) 2008-04-24 2012-09-11 Lexisnexis Risk & Information Analytics Group Inc. Database systems and methods for linking records and entity representations with sufficiently high confidence
JP4585579B2 (en) 2008-04-24 2010-11-24 株式会社日立製作所 Data management method, data management program, and data management apparatus
US8121962B2 (en) 2008-04-25 2012-02-21 Fair Isaac Corporation Automated entity identification for efficient profiling in an event probability prediction system
US8620641B2 (en) 2008-05-16 2013-12-31 Blackberry Limited Intelligent elision
US8844033B2 (en) 2008-05-27 2014-09-23 The Trustees Of Columbia University In The City Of New York Systems, methods, and media for detecting network anomalies using a trained probabilistic model
US8688622B2 (en) 2008-06-02 2014-04-01 The Boeing Company Methods and systems for loading data into a temporal data warehouse
US8813050B2 (en) 2008-06-03 2014-08-19 Isight Partners, Inc. Electronic crime detection and tracking
US20090307049A1 (en) 2008-06-05 2009-12-10 Fair Isaac Corporation Soft Co-Clustering of Data
US7962458B2 (en) 2008-06-12 2011-06-14 Gravic, Inc. Method for replicating explicit locks in a data replication engine
US8301593B2 (en) 2008-06-12 2012-10-30 Gravic, Inc. Mixed mode synchronous and asynchronous replication system
US8452790B1 (en) 2008-06-13 2013-05-28 Ustringer LLC Method and apparatus for distributing content
FI127113B (en) 2008-06-17 2017-11-15 Tekla Corp Information search
US8860754B2 (en) 2008-06-22 2014-10-14 Tableau Software, Inc. Methods and systems of automatically generating marks in a graphical view
US8301904B1 (en) 2008-06-24 2012-10-30 Mcafee, Inc. System, method, and computer program product for automatically identifying potentially unwanted data as unwanted
US9720971B2 (en) 2008-06-30 2017-08-01 International Business Machines Corporation Discovering transformations applied to a source table to generate a target table
AU2009266403A1 (en) 2008-07-02 2010-01-07 Pacific Knowledge Systems Pty. Ltd. Method and system for generating text
GB2461771A (en) 2008-07-11 2010-01-20 Icyte Pty Ltd Annotation of electronic documents with preservation of document as originally annotated
WO2010006334A1 (en) 2008-07-11 2010-01-14 Videosurf, Inc. Apparatus and software system for and method of performing a visual-relevance-rank subsequent search
US8301464B1 (en) 2008-07-18 2012-10-30 Cave Consulting Group, Inc. Method and system for producing statistical analysis of medical care information
WO2010017229A1 (en) 2008-08-04 2010-02-11 Younoodle, Inc. Entity performance analysis engines
US8037040B2 (en) 2008-08-08 2011-10-11 Oracle International Corporation Generating continuous query notifications
US8010545B2 (en) 2008-08-28 2011-08-30 Palo Alto Research Center Incorporated System and method for providing a topic-directed search
US20110078055A1 (en) 2008-09-05 2011-03-31 Claude Faribault Methods and systems for facilitating selecting and/or purchasing of items
JP4612715B2 (en) 2008-09-05 2011-01-12 株式会社日立製作所 Information processing system, data update method, and data update program
US9348499B2 (en) 2008-09-15 2016-05-24 Palantir Technologies, Inc. Sharing objects that rely on local resources with outside servers
US8041714B2 (en) 2008-09-15 2011-10-18 Palantir Technologies, Inc. Filter chains with associated views for exploring large data sets
US20100070845A1 (en) 2008-09-17 2010-03-18 International Business Machines Corporation Shared web 2.0 annotations linked to content segments of web documents
US8667583B2 (en) 2008-09-22 2014-03-04 Microsoft Corporation Collecting and analyzing malware data
US8214361B1 (en) 2008-09-30 2012-07-03 Google Inc. Organizing search results in a topic hierarchy
US8554579B2 (en) 2008-10-13 2013-10-08 Fht, Inc. Management, reporting and benchmarking of medication preparation
US20100114887A1 (en) 2008-10-17 2010-05-06 Google Inc. Textual Disambiguation Using Social Connections
US8391584B2 (en) 2008-10-20 2013-03-05 Jpmorgan Chase Bank, N.A. Method and system for duplicate check detection
US8108933B2 (en) 2008-10-21 2012-01-31 Lookout, Inc. System and method for attack and malware prevention
US8411046B2 (en) 2008-10-23 2013-04-02 Microsoft Corporation Column organization of content
US20100106611A1 (en) 2008-10-24 2010-04-29 Uc Group Ltd. Financial transactions systems and methods
US8306947B2 (en) 2008-10-30 2012-11-06 Hewlett-Packard Development Company, L.P. Replication of operations on objects distributed in a storage system
US7974943B2 (en) 2008-10-30 2011-07-05 Hewlett-Packard Development Company, L.P. Building a synchronized target database
US8717364B2 (en) 2008-11-15 2014-05-06 New BIS Safe Luxco S.a.r.l Data visualization methods
US9818118B2 (en) 2008-11-19 2017-11-14 Visa International Service Association Transaction aggregator
US20100131502A1 (en) 2008-11-25 2010-05-27 Fordham Bradley S Cohort group generation and automatic updating
US20100131457A1 (en) 2008-11-26 2010-05-27 Microsoft Corporation Flattening multi-dimensional data sets into de-normalized form
US10002161B2 (en) 2008-12-03 2018-06-19 Sap Se Multithreading and concurrency control for a rule-based transaction engine
US8204859B2 (en) 2008-12-10 2012-06-19 Commvault Systems, Inc. Systems and methods for managing replicated database data
CN101414375A (en) 2008-12-15 2009-04-22 阿里巴巴集团控股有限公司 System and method for networking trading using intermediate platform
KR101207510B1 (en) 2008-12-18 2012-12-03 한국전자통신연구원 Cluster Data Management System And Method for Data Restoring Using Shared Read-Only Log in Cluster Data Management System
US8346820B2 (en) 2008-12-22 2013-01-01 Google Inc. Asynchronous distributed garbage collection for replicated storage clusters
US8762869B2 (en) 2008-12-23 2014-06-24 Intel Corporation Reduced complexity user interface
US8719350B2 (en) 2008-12-23 2014-05-06 International Business Machines Corporation Email addressee verification
US8352347B2 (en) 2008-12-29 2013-01-08 Athenainvest, Inc. Investment classification and tracking system using diamond ratings
US8073877B2 (en) 2009-01-20 2011-12-06 Yahoo! Inc. Scalable semi-structured named entity detection
US20100262688A1 (en) 2009-01-21 2010-10-14 Daniar Hussain Systems, methods, and devices for detecting security vulnerabilities in ip networks
US20100191563A1 (en) 2009-01-23 2010-07-29 Doctors' Administrative Solutions, Llc Physician Practice Optimization Tracking
WO2010085773A1 (en) 2009-01-24 2010-07-29 Kontera Technologies, Inc. Hybrid contextual advertising and related content analysis and display techniques
US8601401B2 (en) 2009-01-30 2013-12-03 Navico Holding As Method, apparatus and computer program product for synchronizing cursor events
US20100211535A1 (en) 2009-02-17 2010-08-19 Rosenberger Mark Elliot Methods and systems for management of data
US8886689B2 (en) 2009-02-17 2014-11-11 Trane U.S. Inc. Efficient storage of data allowing for multiple level granularity retrieval
EP2221733A1 (en) 2009-02-17 2010-08-25 AMADEUS sas Method allowing validation in a production database of new entered data prior to their release
US20100228752A1 (en) 2009-02-25 2010-09-09 Microsoft Corporation Multi-condition filtering of an interactive summary table
US9177264B2 (en) 2009-03-06 2015-11-03 Chiaramail, Corp. Managing message categories in a network
US8473454B2 (en) 2009-03-10 2013-06-25 Xerox Corporation System and method of on-demand document processing
US8078825B2 (en) 2009-03-11 2011-12-13 Oracle America, Inc. Composite hash and list partitioning of database tables
US20100235915A1 (en) 2009-03-12 2010-09-16 Nasir Memon Using host symptoms, host roles, and/or host reputation for detection of host infection
US8447722B1 (en) 2009-03-25 2013-05-21 Mcafee, Inc. System and method for data mining and security policy management
IL197961A0 (en) 2009-04-05 2009-12-24 Guy Shaked Methods for effective processing of time series
US9767427B2 (en) 2009-04-30 2017-09-19 Hewlett Packard Enterprise Development Lp Modeling multi-dimensional sequence data over streams
US8719249B2 (en) 2009-05-12 2014-05-06 Microsoft Corporation Query classification
US8856691B2 (en) 2009-05-29 2014-10-07 Microsoft Corporation Gesture tool
US20100306029A1 (en) 2009-06-01 2010-12-02 Ryan Jolley Cardholder Clusters
US8495151B2 (en) 2009-06-05 2013-07-23 Chandra Bodapati Methods and systems for determining email addresses
US9268761B2 (en) 2009-06-05 2016-02-23 Microsoft Technology Licensing, Llc In-line dynamic text with variable formatting
US20100321399A1 (en) 2009-06-18 2010-12-23 Patrik Ellren Maps from Sparse Geospatial Data Tiles
KR101076887B1 (en) 2009-06-26 2011-10-25 주식회사 하이닉스반도체 Method of fabricating landing plug in semiconductor device
US20110004498A1 (en) 2009-07-01 2011-01-06 International Business Machines Corporation Method and System for Identification By A Cardholder of Credit Card Fraud
US8209699B2 (en) 2009-07-10 2012-06-26 Teradata Us, Inc. System and method for subunit operations in a database
US9104695B1 (en) 2009-07-27 2015-08-11 Palantir Technologies, Inc. Geotagging structured data
US8635223B2 (en) 2009-07-28 2014-01-21 Fti Consulting, Inc. System and method for providing a classification suggestion for electronically stored information
US8321943B1 (en) 2009-07-30 2012-11-27 Symantec Corporation Programmatic communication in the event of host malware infection
WO2011020101A2 (en) 2009-08-14 2011-02-17 Telogis, Inc. Real time map rendering with data clustering and expansion and overlay
US8560548B2 (en) 2009-08-19 2013-10-15 International Business Machines Corporation System, method, and apparatus for multidimensional exploration of content items in a content store
US20110047540A1 (en) 2009-08-24 2011-02-24 Embarcadero Technologies Inc. System and Methodology for Automating Delivery, Licensing, and Availability of Software Products
US8334773B2 (en) 2009-08-28 2012-12-18 Deal Magic, Inc. Asset monitoring and tracking system
JP5431235B2 (en) 2009-08-28 2014-03-05 株式会社日立製作所 Equipment condition monitoring method and apparatus
US8229876B2 (en) 2009-09-01 2012-07-24 Oracle International Corporation Expediting K-means cluster analysis data mining using subsample elimination preprocessing
US20110066933A1 (en) 2009-09-02 2011-03-17 Ludwig Lester F Value-driven visualization primitives for spreadsheets, tabular data, and advanced spreadsheet visualization
US9280777B2 (en) 2009-09-08 2016-03-08 Target Brands, Inc. Operations dashboard
US20110184813A1 (en) 2009-09-14 2011-07-28 Cbs Interactive, Inc. Targeting offers to users of a web site
US8214490B1 (en) 2009-09-15 2012-07-03 Symantec Corporation Compact input compensating reputation data tracking mechanism
US8756489B2 (en) 2009-09-17 2014-06-17 Adobe Systems Incorporated Method and system for dynamic assembly of form fragments
US8347398B1 (en) 2009-09-23 2013-01-01 Savvystuff Property Trust Selected text obfuscation and encryption in a local, network and cloud computing environment
US20110074811A1 (en) 2009-09-25 2011-03-31 Apple Inc. Map Layout for Print Production
US20110078173A1 (en) 2009-09-30 2011-03-31 Avaya Inc. Social Network User Interface
US20110087519A1 (en) 2009-10-09 2011-04-14 Visa U.S.A. Inc. Systems and Methods for Panel Enhancement with Transaction Data
US8595058B2 (en) 2009-10-15 2013-11-26 Visa U.S.A. Systems and methods to match identifiers
US8554699B2 (en) 2009-10-20 2013-10-08 Google Inc. Method and system for detecting anomalies in time series data
US9165304B2 (en) 2009-10-23 2015-10-20 Service Management Group, Inc. Analyzing consumer behavior using electronically-captured consumer location data
US20110099133A1 (en) 2009-10-28 2011-04-28 Industrial Technology Research Institute Systems and methods for capturing and managing collective social intelligence information
CN102054015B (en) 2009-10-28 2014-05-07 财团法人工业技术研究院 System and method of organizing community intelligent information by using organic matter data model
US8312367B2 (en) 2009-10-30 2012-11-13 Synopsys, Inc. Technique for dynamically sizing columns in a table
US8806355B2 (en) 2009-11-06 2014-08-12 Cisco Technology, Inc. Method and apparatus for visualizing and navigating within an immersive collaboration environment
EP2499748A4 (en) 2009-11-13 2017-03-01 Zoll Medical Corporation Community-based response system
US20110131547A1 (en) 2009-12-01 2011-06-02 International Business Machines Corporation Method and system defining and interchanging diagrams of graphical modeling languages
US20110131130A1 (en) 2009-12-01 2011-06-02 Bank Of America Corporation Integrated risk assessment and management system
US11122009B2 (en) 2009-12-01 2021-09-14 Apple Inc. Systems and methods for identifying geographic locations of social media content collected over social networks
US8645478B2 (en) 2009-12-10 2014-02-04 Mcafee, Inc. System and method for monitoring social engineering in a computer network environment
US20110153384A1 (en) 2009-12-17 2011-06-23 Matthew Donald Horne Visual comps builder
US8676597B2 (en) 2009-12-28 2014-03-18 General Electric Company Methods and systems for mapping healthcare services analytics for volume and trends
US20110161132A1 (en) 2009-12-29 2011-06-30 Sukriti Goel Method and system for extracting process sequences
US8626768B2 (en) 2010-01-06 2014-01-07 Microsoft Corporation Automated discovery aggregation and organization of subject area discussions
US8564596B2 (en) 2010-01-12 2013-10-22 Palantir Technologies, Inc. Techniques for density mapping
US9026552B2 (en) 2010-01-18 2015-05-05 Salesforce.Com, Inc. System and method for linking contact records to company locations
US8271461B2 (en) 2010-01-18 2012-09-18 Battelle Memorial Institute Storing and managing information artifacts collected by information analysts using a computing device
US8571919B2 (en) 2010-01-20 2013-10-29 American Express Travel Related Services Company, Inc. System and method for identifying attributes of a population using spend level data
US8290926B2 (en) 2010-01-21 2012-10-16 Microsoft Corporation Scalable topical aggregation of data feeds
US8843855B2 (en) 2010-01-25 2014-09-23 Linx Systems, Inc. Displaying maps of measured events
US8683363B2 (en) 2010-01-26 2014-03-25 Apple Inc. Device, method, and graphical user interface for managing user interface content and user interface elements
US8260664B2 (en) 2010-02-05 2012-09-04 Microsoft Corporation Semantic advertising selection from lateral concepts and topics
US20110208565A1 (en) 2010-02-23 2011-08-25 Michael Ross complex process management
US20110219321A1 (en) 2010-03-02 2011-09-08 Microsoft Corporation Web-based control using integrated control interface having dynamic hit zones
US20110218934A1 (en) 2010-03-03 2011-09-08 Jeremy Elser System and methods for comparing real properties for purchase and for generating heat maps to aid in identifying price anomalies of such real properties
US8478709B2 (en) 2010-03-08 2013-07-02 Hewlett-Packard Development Company, L.P. Evaluation of client status for likelihood of churn
US8863279B2 (en) 2010-03-08 2014-10-14 Raytheon Company System and method for malware detection
US8868728B2 (en) 2010-03-11 2014-10-21 Accenture Global Services Limited Systems and methods for detecting and investigating insider fraud
US20110231296A1 (en) 2010-03-16 2011-09-22 UberMedia, Inc. Systems and methods for interacting with messages, authors, and followers
US8738418B2 (en) 2010-03-19 2014-05-27 Visa U.S.A. Inc. Systems and methods to enhance search data with transaction based data
US8577911B1 (en) 2010-03-23 2013-11-05 Google Inc. Presenting search term refinements
US20110238553A1 (en) 2010-03-26 2011-09-29 Ashwin Raj Electronic account-to-account funds transfer
US8306846B2 (en) 2010-04-12 2012-11-06 First Data Corporation Transaction location analytics systems and methods
US20110251951A1 (en) 2010-04-13 2011-10-13 Dan Kolkowitz Anti-fraud event correlation
US8572023B2 (en) 2010-04-14 2013-10-29 Bank Of America Corporation Data services framework workflow processing
US10198463B2 (en) 2010-04-16 2019-02-05 Salesforce.Com, Inc. Methods and systems for appending data to large data volumes in a multi-tenant store
US8719267B2 (en) 2010-04-19 2014-05-06 Alcatel Lucent Spectral neighborhood blocking for entity resolution
US8255399B2 (en) 2010-04-28 2012-08-28 Microsoft Corporation Data classifier
US8874432B2 (en) 2010-04-28 2014-10-28 Nec Laboratories America, Inc. Systems and methods for semi-supervised relationship extraction
US8799812B2 (en) 2010-04-29 2014-08-05 Cheryl Parker System and method for geographic based data visualization and extraction
US8489331B2 (en) 2010-04-29 2013-07-16 Microsoft Corporation Destination maps user interface
US8473415B2 (en) 2010-05-04 2013-06-25 Kevin Paul Siegel System and method for identifying a point of compromise in a payment transaction processing system
US8392394B1 (en) 2010-05-04 2013-03-05 Google Inc. Merging search results
US20120116828A1 (en) 2010-05-10 2012-05-10 Shannon Jeffrey L Promotions and advertising system
US8595234B2 (en) 2010-05-17 2013-11-26 Wal-Mart Stores, Inc. Processing data feeds
US20110289407A1 (en) 2010-05-18 2011-11-24 Naik Devang K Font recommendation engine
US20110289397A1 (en) 2010-05-19 2011-11-24 Mauricio Eastmond Displaying Table Data in a Limited Display Area
JP5161267B2 (en) 2010-05-19 2013-03-13 株式会社日立製作所 Screen customization support system, screen customization support method, and screen customization support program
US8723679B2 (en) 2010-05-25 2014-05-13 Public Engines, Inc. Systems and methods for transmitting alert messages relating to events that occur within a pre-defined area
US20110295649A1 (en) 2010-05-31 2011-12-01 International Business Machines Corporation Automatic churn prediction
US8756224B2 (en) 2010-06-16 2014-06-17 Rallyverse, Inc. Methods, systems, and media for content ranking using real-time data
US20110310005A1 (en) 2010-06-17 2011-12-22 Qualcomm Incorporated Methods and apparatus for contactless gesture recognition
US8380719B2 (en) 2010-06-18 2013-02-19 Microsoft Corporation Semantic content searching
US8706854B2 (en) 2010-06-30 2014-04-22 Raytheon Company System and method for organizing, managing and running enterprise-wide scans
KR101196935B1 (en) 2010-07-05 2012-11-05 엔에이치엔(주) Method and system for providing representation words of real-time popular keyword
US8489641B1 (en) 2010-07-08 2013-07-16 Google Inc. Displaying layers of search results on a map
WO2012004933A1 (en) 2010-07-09 2012-01-12 パナソニック株式会社 Object mapping device, method of mapping object, program and recording medium
US8407341B2 (en) 2010-07-09 2013-03-26 Bank Of America Corporation Monitoring communications
US20120019559A1 (en) 2010-07-20 2012-01-26 Siler Lucas C Methods and Apparatus for Interactive Display of Images and Measurements
WO2012025915A1 (en) 2010-07-21 2012-03-01 Sqream Technologies Ltd A system and method for the parallel execution of database queries over cpus and multi core processors
US8554653B2 (en) 2010-07-22 2013-10-08 Visa International Service Association Systems and methods to identify payment accounts having business spending activities
DE102010036906A1 (en) 2010-08-06 2012-02-09 Tavendo Gmbh Configurable pie menu
US20120036013A1 (en) 2010-08-09 2012-02-09 Brent Lee Neuhaus System and method for determining a consumer's location code from payment transaction data
US8775530B2 (en) 2010-08-25 2014-07-08 International Business Machines Corporation Communication management method and system
US20120050293A1 (en) 2010-08-25 2012-03-01 Apple, Inc. Dynamically smoothing a curve
US20120066166A1 (en) 2010-09-10 2012-03-15 International Business Machines Corporation Predictive Analytics for Semi-Structured Case Oriented Processes
US8661335B2 (en) 2010-09-20 2014-02-25 Blackberry Limited Methods and systems for identifying content elements
US9747270B2 (en) 2011-01-07 2017-08-29 Microsoft Technology Licensing, Llc Natural input for spreadsheet actions
US9069842B2 (en) 2010-09-28 2015-06-30 The Mitre Corporation Accessing documents using predictive word sequences
WO2012043650A1 (en) 2010-09-29 2012-04-05 楽天株式会社 Display program, display device, information processing method, recording medium, and information processing device
US8463036B1 (en) 2010-09-30 2013-06-11 A9.Com, Inc. Shape-based search of a collection of content
US20120084118A1 (en) 2010-09-30 2012-04-05 International Business Machines Corporation Sales predication for a new store based on on-site market survey data and high resolution geographical information
US8549004B2 (en) 2010-09-30 2013-10-01 Hewlett-Packard Development Company, L.P. Estimation of unique database values
US20120084135A1 (en) 2010-10-01 2012-04-05 Smartslips Inc. System and method for tracking transaction records in a network
EP2444134A1 (en) 2010-10-19 2012-04-25 Travian Games GmbH Methods, server system and browser clients for providing a game map of a browser-based online multi-player game
US8781169B2 (en) 2010-11-03 2014-07-15 Endeavoring, Llc Vehicle tracking and locating system
US8700643B1 (en) 2010-11-03 2014-04-15 Google Inc. Managing electronic media collections
US8316030B2 (en) 2010-11-05 2012-11-20 Nextgen Datacom, Inc. Method and system for document classification or search using discrete words
WO2012065186A2 (en) 2010-11-12 2012-05-18 Realnetworks, Inc. Traffic management in adaptive streaming protocols
CN102467596B (en) 2010-11-15 2016-09-21 商业对象软件有限公司 Instrument board evaluator
US20120131107A1 (en) 2010-11-18 2012-05-24 Microsoft Corporation Email Filtering Using Relationship and Reputation Data
JP5706137B2 (en) 2010-11-22 2015-04-22 International Business Machines Corporation Method and computer program for displaying a plurality of posts (groups of data) on a computer screen in real time along a plurality of axes
WO2012071533A1 (en) 2010-11-24 2012-05-31 LogRhythm Inc. Advanced intelligence engine
WO2012071571A2 (en) 2010-11-26 2012-05-31 Agency For Science, Technology And Research Method for creating a report from radiological images using electronic report templates
US20120137235A1 (en) 2010-11-29 2012-05-31 Sabarish T S Dynamic user interface generation
US20120136804A1 (en) 2010-11-30 2012-05-31 Raymond J. Lucia, SR. Wealth Management System and Method
US8839133B2 (en) 2010-12-02 2014-09-16 Microsoft Corporation Data visualizations including interactive time line representations
CN102546446A (en) 2010-12-13 2012-07-04 太仓市浏河镇亿网行网络技术服务部 Email device
US8521667B2 (en) 2010-12-15 2013-08-27 Microsoft Corporation Detection and categorization of malicious URLs
US9141405B2 (en) 2010-12-15 2015-09-22 International Business Machines Corporation User interface construction
US9378294B2 (en) 2010-12-17 2016-06-28 Microsoft Technology Licensing, Llc Presenting source regions of rendered source web pages in target regions of target web pages
US20120159399A1 (en) 2010-12-17 2012-06-21 International Business Machines Corporation System for organizing and navigating data within a table
US9336184B2 (en) 2010-12-17 2016-05-10 Microsoft Technology Licensing, Llc Representation of an interactive document as a graph of entities
US20120158527A1 (en) 2010-12-21 2012-06-21 Class6Ix, Llc Systems, Methods and/or Computer Readable Storage Media Facilitating Aggregation and/or Personalized Sequencing of News Video Content
EP2654864B1 (en) 2010-12-22 2020-10-28 Syqe Medical Ltd. System for drug delivery
US8682812B1 (en) 2010-12-23 2014-03-25 Narus, Inc. Machine learning based botnet detection using real-time extracted traffic features
US9881257B2 (en) 2010-12-29 2018-01-30 Tickr, Inc. Multi-dimensional visualization of temporal information
US20120173381A1 (en) 2011-01-03 2012-07-05 Stanley Benjamin Smith Process and system for pricing and processing weighted data in a federated or subscription based data source
EP2668608A4 (en) 2011-01-27 2017-07-05 L-3 Communications Corporation Internet isolation for avoiding internet security threats
US8510154B2 (en) 2011-01-27 2013-08-13 Leroy Robinson Method and system for searching for, and monitoring assessment of, original content creators and the original content thereof
US8447263B2 (en) 2011-01-28 2013-05-21 Don Reich Emergency call analysis system
US8437731B2 (en) 2011-01-28 2013-05-07 Don Reich Emergency call analysis system
US20120203584A1 (en) 2011-02-07 2012-08-09 Amnon Mishor System and method for identifying potential customers
IL211163A0 (en) 2011-02-10 2011-04-28 Univ Ben Gurion A method for generating a randomized data structure for representing sets, based on bloom filters
CN103380426B (en) 2011-02-16 2017-09-22 英派尔科技开发有限公司 Inquiry is performed using semantic restriction relation
US20120215898A1 (en) 2011-02-17 2012-08-23 Nitin Jayant Shah Applications of a Network-Centric Information Distribution Platform on the Internet
KR101950529B1 (en) 2011-02-24 2019-02-20 렉시스넥시스, 어 디비젼 오브 리드 엘서비어 인크. Methods for electronic document searching and graphically representing electronic document searches
US20120246148A1 (en) 2011-03-22 2012-09-27 Intergraph Technologies Company Contextual Display and Scrolling of Search Results in Graphical Environment
US9449010B2 (en) 2011-04-02 2016-09-20 Open Invention Network, Llc System and method for managing sensitive data using intelligent mobile agents on a network
US8381120B2 (en) 2011-04-11 2013-02-19 Credibility Corp. Visualization tools for reviewing credibility and stateful hierarchical access to credibility
US8839434B2 (en) 2011-04-15 2014-09-16 Raytheon Company Multi-nodal malware analysis
US10185932B2 (en) 2011-05-06 2019-01-22 Microsoft Technology Licensing, Llc Setting permissions for links forwarded in electronic messages
US9047441B2 (en) 2011-05-24 2015-06-02 Palo Alto Networks, Inc. Malware analysis system
US10380585B2 (en) 2011-06-02 2019-08-13 Visa International Service Association Local usage of electronic tokens in a transaction processing system
US10395256B2 (en) 2011-06-02 2019-08-27 Visa International Service Association Reputation management in a transaction processing system
US20120310702A1 (en) 2011-06-03 2012-12-06 Uc Group Limited Systems and methods for monitoring compulsive behavior and for identifying early warning indicators across multiple websites
US8799190B2 (en) 2011-06-17 2014-08-05 Microsoft Corporation Graph-based malware classification based on file relationships
US9104765B2 (en) 2011-06-17 2015-08-11 Robert Osann, Jr. Automatic webpage characterization and search results annotation
US8799240B2 (en) 2011-06-23 2014-08-05 Palantir Technologies, Inc. System and method for investigating large amounts of data
US9092482B2 (en) 2013-03-14 2015-07-28 Palantir Technologies, Inc. Fair scheduling for mixed-query loads
US9547693B1 (en) 2011-06-23 2017-01-17 Palantir Technologies Inc. Periodic database search manager for multiple data sources
US8640246B2 (en) 2011-06-27 2014-01-28 Raytheon Company Distributed malware detection
US8725307B2 (en) 2011-06-28 2014-05-13 Schneider Electric It Corporation System and method for measurement aided prediction of temperature and airflow values in a data center
US20130006725A1 (en) 2011-06-30 2013-01-03 Accenture Global Services Limited Tolling integration technology
US20130006657A1 (en) 2011-06-30 2013-01-03 Verizon Patent And Licensing Inc. Reporting and analytics for healthcare fraud detection information
CA2837454A1 (en) 2011-07-01 2013-01-10 Truecar, Inc. Method and system for selection, filtering or presentation of available sales outlets
US9240011B2 (en) 2011-07-13 2016-01-19 Visa International Service Association Systems and methods to communicate with transaction terminals
US9026944B2 (en) 2011-07-14 2015-05-05 Microsoft Technology Licensing, Llc Managing content through actions on context based menus
US8751399B2 (en) 2011-07-15 2014-06-10 Wal-Mart Stores, Inc. Multi-channel data driven, real-time anti-money laundering system for electronic payment cards
US8982130B2 (en) 2011-07-15 2015-03-17 Green Charge Networks Cluster mapping to highlight areas of electrical congestion
US8726379B1 (en) 2011-07-15 2014-05-13 Norse Corporation Systems and methods for dynamic protection from electronic attacks
US8447674B2 (en) 2011-07-21 2013-05-21 Bank Of America Corporation Multi-stage filtering for fraud detection with customer history filters
US20130024268A1 (en) 2011-07-22 2013-01-24 Ebay Inc. Incentivizing the linking of internet content to products for sale
US8666919B2 (en) 2011-07-29 2014-03-04 Accenture Global Services Limited Data quality management for profiling, linking, cleansing and migrating data
US20130036038A1 (en) 2011-08-02 2013-02-07 Tata Consultancy Services Limited Financial activity monitoring system
US9280532B2 (en) 2011-08-02 2016-03-08 Palantir Technologies, Inc. System and method for accessing rich objects via spreadsheets
EP2560134A1 (en) 2011-08-19 2013-02-20 Agor Services BVBA A platform and method enabling collaboration between value chain partners
US20130046635A1 (en) 2011-08-19 2013-02-21 Bank Of America Corporation Triggering offers based on detected location of a mobile point of sale device
US8966392B2 (en) 2011-08-29 2015-02-24 Novell, Inc. Event management apparatus, systems, and methods
US8630892B2 (en) 2011-08-31 2014-01-14 Accenture Global Services Limited Churn analysis system
US8854371B2 (en) 2011-08-31 2014-10-07 Sap Ag Method and system for generating a columnar tree map
US8504542B2 (en) 2011-09-02 2013-08-06 Palantir Technologies, Inc. Multi-row transactions
US8533204B2 (en) 2011-09-02 2013-09-10 Xerox Corporation Text-based searching of image data
US10031646B2 (en) 2011-09-07 2018-07-24 Mcafee, Llc Computer system security dashboard
US8949164B1 (en) 2011-09-08 2015-02-03 George O. Mohler Event forecasting system
WO2013036785A2 (en) 2011-09-08 2013-03-14 Hewlett-Packard Development Company, L.P. Visual component and drill down mapping
US10140620B2 (en) 2011-09-15 2018-11-27 Stephan HEATH Mobile device system and method providing combined delivery system using 3D geo-target location-based mobile commerce searching/purchases, discounts/coupons products, goods, and services, or service providers-geomapping-company/local and socially-conscious information/social networking (“PS-GM-C/LandSC/I-SN”)
WO2013044141A2 (en) 2011-09-22 2013-03-28 Capgemini U.S. Llc Process transformation and transitioning apparatuses, methods and systems
US8903355B2 (en) 2011-09-26 2014-12-02 Solacom Technologies Inc. Answering or releasing emergency calls from a map display for an emergency services platform
US20130086482A1 (en) 2011-09-30 2013-04-04 Cbs Interactive, Inc. Displaying plurality of content items in window
BR112014008351A2 (en) 2011-10-05 2017-04-18 Mastercard International Inc naming mechanism
US20130097482A1 (en) 2011-10-13 2013-04-18 Microsoft Corporation Search result entry truncation using pixel-based approximation
US8849776B2 (en) 2011-10-17 2014-09-30 Yahoo! Inc. Method and system for resolving data inconsistency
US20130101159A1 (en) 2011-10-21 2013-04-25 Qualcomm Incorporated Image and video based pedestrian traffic estimation
US9460218B2 (en) 2011-10-26 2016-10-04 Google Inc. Indicating location status
US8918424B2 (en) 2011-10-31 2014-12-23 Advanced Community Services Managing homeowner association messages
US9411797B2 (en) 2011-10-31 2016-08-09 Microsoft Technology Licensing, Llc Slicer elements for filtering tabular data
US8843421B2 (en) 2011-11-01 2014-09-23 Accenture Global Services Limited Identification of entities likely to engage in a behavior
US9009183B2 (en) 2011-11-03 2015-04-14 Microsoft Technology Licensing, Llc Transformation of a system change set from machine-consumable form to a form that is readily consumable by a human
US9053083B2 (en) 2011-11-04 2015-06-09 Microsoft Technology Licensing, Llc Interaction between web gadgets and spreadsheets
US9298825B2 (en) 2011-11-17 2016-03-29 Microsoft Technology Licensing, Llc Tagging entities with descriptive phrases
US8498984B1 (en) 2011-11-21 2013-07-30 Google Inc. Categorization of search results
US9159024B2 (en) 2011-12-07 2015-10-13 Wal-Mart Stores, Inc. Real-time predictive intelligence platform
CN103167093A (en) 2011-12-08 2013-06-19 青岛海信移动通信技术股份有限公司 Filling method of mobile phone email address
US20130151388A1 (en) 2011-12-12 2013-06-13 Visa International Service Association Systems and methods to identify affluence levels of accounts
US9026364B2 (en) 2011-12-12 2015-05-05 Toyota Jidosha Kabushiki Kaisha Place affinity estimation
US20130157234A1 (en) 2011-12-14 2013-06-20 Microsoft Corporation Storyline visualization
US8868558B2 (en) 2011-12-19 2014-10-21 Yahoo! Inc. Quote-based search
US20130160120A1 (en) 2011-12-20 2013-06-20 Yahoo! Inc. Protecting end users from malware using advertising virtual machine
US9026480B2 (en) 2011-12-21 2015-05-05 Telenav, Inc. Navigation system with point of interest classification mechanism and method of operation thereof
US20130166550A1 (en) 2011-12-21 2013-06-27 Sap Ag Integration of Tags and Object Data
US8880420B2 (en) 2011-12-27 2014-11-04 Grubhub, Inc. Utility for creating heatmaps for the study of competitive advantage in the restaurant marketplace
US9189556B2 (en) 2012-01-06 2015-11-17 Google Inc. System and method for displaying information local to a selected area
WO2013102892A1 (en) 2012-01-06 2013-07-11 Technologies Of Voice Interface Ltd A system and method for generating personalized sensor-based activation of software
US9116994B2 (en) 2012-01-09 2015-08-25 Brightedge Technologies, Inc. Search engine optimization for category specific search results
US8843431B2 (en) 2012-01-16 2014-09-23 International Business Machines Corporation Social network analysis for churn prediction
US8909648B2 (en) 2012-01-18 2014-12-09 Technion Research & Development Foundation Limited Methods and systems of supervised learning of semantic relatedness
US20130197925A1 (en) 2012-01-31 2013-08-01 OptumInsight, Inc. Behavioral clustering for removing outlying healthcare providers
US8965422B2 (en) 2012-02-23 2015-02-24 Blackberry Limited Tagging instant message content for retrieval using mobile communication devices
AU2013222093A1 (en) 2012-02-24 2014-09-11 Mccormick & Company, Incorporated System and method for providing flavor advisement and enhancement
WO2013130633A1 (en) 2012-02-29 2013-09-06 Google Inc. Interactive query completion templates
US20130232045A1 (en) 2012-03-04 2013-09-05 Oracle International Corporation Automatic Detection Of Fraud And Error Using A Vector-Cluster Model
JP2013191187A (en) 2012-03-15 2013-09-26 Fujitsu Ltd Processing device, program and processing system
US8787939B2 (en) 2012-03-27 2014-07-22 Facebook, Inc. Dynamic geographic beacons for geographic-positioning-capable devices
US20130263019A1 (en) 2012-03-30 2013-10-03 Maria G. Castellanos Analyzing social media
US8738665B2 (en) 2012-04-02 2014-05-27 Apple Inc. Smart progress indicator
US8983936B2 (en) 2012-04-04 2015-03-17 Microsoft Corporation Incremental visualization for structured data in an enterprise-level data store
US9071653B2 (en) 2012-04-05 2015-06-30 Verizon Patent And Licensing Inc. Reducing cellular network traffic
US8792677B2 (en) 2012-04-19 2014-07-29 Intelligence Based Integrated Security Systems, Inc. Large venue security method
US9298856B2 (en) 2012-04-23 2016-03-29 Sap Se Interactive data exploration and visualization tool
US9043710B2 (en) 2012-04-26 2015-05-26 Sap Se Switch control in report generation
US8742934B1 (en) 2012-04-29 2014-06-03 Intel-Based Solutions, LLC System and method for facilitating the execution of law enforcement duties and enhancing anti-terrorism and counter-terrorism capabilities
US10304036B2 (en) 2012-05-07 2019-05-28 Nasdaq, Inc. Social media profiling for one or more authors using one or more social media platforms
EP2662782A1 (en) 2012-05-10 2013-11-13 Siemens Aktiengesellschaft Method and system for storing data in a database
US20140032506A1 (en) 2012-06-12 2014-01-30 Quality Attributes Software, Inc. System and methods for real-time detection, correction, and transformation of time series data
US9003023B2 (en) 2012-06-13 2015-04-07 Zscaler, Inc. Systems and methods for interactive analytics of internet traffic
CN102760172B (en) 2012-06-28 2015-05-20 北京奇虎科技有限公司 Network searching method and network searching system
US8966441B2 (en) 2012-07-12 2015-02-24 Oracle International Corporation Dynamic scripts to extend static applications
WO2014010082A1 (en) 2012-07-13 2014-01-16 株式会社日立ソリューションズ Retrieval device, method for controlling retrieval device, and recording medium
US20140058763A1 (en) 2012-07-24 2014-02-27 Deloitte Development Llc Fraud detection methods and systems
US8836788B2 (en) 2012-08-06 2014-09-16 Cloudparc, Inc. Controlling use of parking spaces and restricted locations using multiple cameras
US20140047319A1 (en) 2012-08-13 2014-02-13 Sap Ag Context injection and extraction in xml documents based on common sparse templates
US8554875B1 (en) 2012-08-13 2013-10-08 Ribbon Labs, Inc. Communicating future locations in a social network
US20140149272A1 (en) 2012-08-17 2014-05-29 Trueex Group Llc Interoffice bank offered rate financial product and implementation
US10311062B2 (en) 2012-08-21 2019-06-04 Microsoft Technology Licensing, Llc Filtering structured data using inexact, culture-dependent terms
US8676857B1 (en) 2012-08-23 2014-03-18 International Business Machines Corporation Context-based search for a data store related to a graph node
US10163158B2 (en) 2012-08-27 2018-12-25 Yuh-Shen Song Transactional monitoring system
JP5904909B2 (en) 2012-08-31 2016-04-20 株式会社日立製作所 Supplier search device and supplier search program
US20140068487A1 (en) 2012-09-05 2014-03-06 Roche Diagnostics Operations, Inc. Computer Implemented Methods For Visualizing Correlations Between Blood Glucose Data And Events And Apparatuses Thereof
US20140074855A1 (en) 2012-09-13 2014-03-13 Verance Corporation Multimedia content tags
US20140081652A1 (en) 2012-09-14 2014-03-20 Risk Management Solutions Llc Automated Healthcare Risk Management System Utilizing Real-time Predictive Models, Risk Adjusted Provider Cost Index, Edit Analytics, Strategy Management, Managed Learning Environment, Contact Management, Forensic GUI, Case Management And Reporting System For Preventing And Detecting Healthcare Fraud, Abuse, Waste And Errors
US20140095273A1 (en) 2012-09-28 2014-04-03 Catalina Marketing Corporation Basket aggregator and locator
US20140095509A1 (en) 2012-10-02 2014-04-03 Banjo, Inc. Method of tagging content lacking geotags with a location
GB2578839B (en) 2012-10-08 2020-08-19 Fisher Rosemount Systems Inc Dynamically reusable classes
CN103731447B (en) 2012-10-11 2019-03-26 腾讯科技(深圳)有限公司 Data query method and system
US9104786B2 (en) 2012-10-12 2015-08-11 International Business Machines Corporation Iterative refinement of cohorts using visual exploration and data analytics
US8688573B1 (en) 2012-10-16 2014-04-01 Intuit Inc. Method and system for identifying a merchant payee associated with a cash transaction
US20140108068A1 (en) 2012-10-17 2014-04-17 Jonathan A. Williams System and Method for Scheduling Tee Time
US8914886B2 (en) 2012-10-29 2014-12-16 Mcafee, Inc. Dynamic quarantining for malware detection
US9501799B2 (en) 2012-11-08 2016-11-22 Hartford Fire Insurance Company System and method for determination of insurance classification of entities
US9378030B2 (en) 2013-10-01 2016-06-28 Aetherpal, Inc. Method and apparatus for interactive mobile device guidance
US10504127B2 (en) 2012-11-15 2019-12-10 Home Depot Product Authority, Llc System and method for classifying relevant competitors
US20140143009A1 (en) 2012-11-16 2014-05-22 International Business Machines Corporation Risk reward estimation for company-country pairs
US9146969B2 (en) 2012-11-26 2015-09-29 The Boeing Company System and method of reduction of irrelevant information during search
US20140149130A1 (en) 2012-11-29 2014-05-29 Verizon Patent And Licensing Inc. Healthcare fraud detection based on statistics, learning, and parameters
US20140157172A1 (en) 2012-11-30 2014-06-05 Drillmap Geographic layout of petroleum drilling data and methods for processing data
US20140156527A1 (en) 2012-11-30 2014-06-05 Bank Of America Corporation Pre-payment authorization categorization
US10672008B2 (en) 2012-12-06 2020-06-02 Jpmorgan Chase Bank, N.A. System and method for data analytics
US9497289B2 (en) 2012-12-07 2016-11-15 Genesys Telecommunications Laboratories, Inc. System and method for social message classification based on influence
US9195506B2 (en) 2012-12-21 2015-11-24 International Business Machines Corporation Processor provisioning by a middleware processing system for a plurality of logical processor partitions
US9294576B2 (en) 2013-01-02 2016-03-22 Microsoft Technology Licensing, Llc Social media impact assessment
US20140195515A1 (en) 2013-01-10 2014-07-10 I3 Analytics Methods and systems for querying and displaying data using interactive three-dimensional representations
US9805407B2 (en) 2013-01-25 2017-10-31 Illumina, Inc. Methods and systems for using a cloud computing environment to configure and sell a biological sample preparation cartridge and share related data
US20140222793A1 (en) 2013-02-07 2014-08-07 Parlance Corporation System and Method for Automatically Importing, Refreshing, Maintaining, and Merging Contact Sets
US20140222521A1 (en) 2013-02-07 2014-08-07 Ibms, Llc Intelligent management and compliance verification in distributed work flow environments
US9264393B2 (en) 2013-02-13 2016-02-16 International Business Machines Corporation Mail server-based dynamic workflow management
US8744890B1 (en) 2013-02-14 2014-06-03 Aktana, Inc. System and method for managing system-level workflow strategy and individual workflow activity
US20140244388A1 (en) 2013-02-28 2014-08-28 MetroStar Systems, Inc. Social Content Synchronization
US9286618B2 (en) 2013-03-08 2016-03-15 Mastercard International Incorporated Recognizing and combining redundant merchant designations in a transaction database
US10140664B2 (en) 2013-03-14 2018-11-27 Palantir Technologies Inc. Resolving similar entities from a transaction database
GB2513247A (en) 2013-03-15 2014-10-22 Palantir Technologies Inc Data clustering
US8788405B1 (en) 2013-03-15 2014-07-22 Palantir Technologies, Inc. Generating data clusters with customizable analysis strategies
US8924388B2 (en) 2013-03-15 2014-12-30 Palantir Technologies Inc. Computer-implemented systems and methods for comparing and associating objects
US9501202B2 (en) 2013-03-15 2016-11-22 Palantir Technologies, Inc. Computer graphical user interface with genomic workflow
US9225737B2 (en) 2013-03-15 2015-12-29 Shape Security, Inc. Detecting the introduction of alien content
US9230280B1 (en) 2013-03-15 2016-01-05 Palantir Technologies Inc. Clustering data based on indications of financial malfeasance
US8917274B2 (en) 2013-03-15 2014-12-23 Palantir Technologies Inc. Event matrix based on integrated data
US8937619B2 (en) 2013-03-15 2015-01-20 Palantir Technologies Inc. Generating an object time series from data objects
GB2513721A (en) 2013-03-15 2014-11-05 Palantir Technologies Inc Computer-implemented systems and methods for comparing and associating objects
US9740369B2 (en) 2013-03-15 2017-08-22 Palantir Technologies Inc. Systems and methods for providing a tagging interface for external content
GB2513720A (en) 2013-03-15 2014-11-05 Palantir Technologies Inc Computer-implemented systems and methods for comparing and associating objects
US8868486B2 (en) 2013-03-15 2014-10-21 Palantir Technologies Inc. Time-sensitive cube
US9372929B2 (en) 2013-03-20 2016-06-21 Securboration, Inc. Methods and systems for node and link identification
IN2013CH01237A (en) 2013-03-21 2015-08-14 Infosys Ltd
US20140310266A1 (en) 2013-04-10 2014-10-16 Google Inc. Systems and Methods for Suggesting Places for Persons to Meet
US9390162B2 (en) 2013-04-25 2016-07-12 International Business Machines Corporation Management of a database system
US9767127B2 (en) 2013-05-02 2017-09-19 Outseeker Corp. Method for record linkage from multiple sources
US20140331119A1 (en) 2013-05-06 2014-11-06 Mcafee, Inc. Indicating website reputations during user interactions
US8799799B1 (en) 2013-05-07 2014-08-05 Palantir Technologies Inc. Interactive geospatial map
GB2542517B (en) 2013-05-07 2018-01-24 Palantir Technologies Inc Interactive Geospatial map
US20140351069A1 (en) 2013-05-22 2014-11-27 Cube, Co. System and method for dynamically configuring merchant accounts for multiple payment processors
US9576248B2 (en) 2013-06-01 2017-02-21 Adam M. Hurwitz Record linkage sharing using labeled comparison vectors and a machine learning domain classification trainer
CN104243425B (en) 2013-06-19 2018-09-04 深圳市腾讯计算机系统有限公司 Method, apparatus and system for content management in a content distribution network
US10164923B2 (en) 2013-06-21 2018-12-25 International Business Machines Corporation Methodology that uses culture information as a means to detect spam
US8620790B2 (en) 2013-07-11 2013-12-31 Scvngr Systems and methods for dynamic transaction-payment routing
US20150019394A1 (en) 2013-07-11 2015-01-15 Mastercard International Incorporated Merchant information correction through transaction history or detail
US8752178B2 (en) 2013-07-31 2014-06-10 Splunk Inc. Blacklisting and whitelisting of security-related events
US9047480B2 (en) 2013-08-01 2015-06-02 Bitglass, Inc. Secure application access system
US9565152B2 (en) 2013-08-08 2017-02-07 Palantir Technologies Inc. Cable reader labeling
US9335897B2 (en) 2013-08-08 2016-05-10 Palantir Technologies Inc. Long click display of a context menu
US9223773B2 (en) 2013-08-08 2015-12-29 Palantir Technologies Inc. Template system for custom document generation
GB2518745A (en) 2013-08-08 2015-04-01 Palantir Technologies Inc Template system for custom document generation
US9477372B2 (en) 2013-08-08 2016-10-25 Palantir Technologies Inc. Cable reader snippets and postboard
US8713467B1 (en) 2013-08-09 2014-04-29 Palantir Technologies, Inc. Context-sensitive views
US10673802B2 (en) 2013-08-29 2020-06-02 Pecan Technologies Inc. Social profiling of electronic messages
US9787760B2 (en) 2013-09-24 2017-10-10 Chad Folkening Platform for building virtual entities using equity systems
US9785317B2 (en) 2013-09-24 2017-10-10 Palantir Technologies Inc. Presentation and analysis of user interaction data
US8689108B1 (en) 2013-09-24 2014-04-01 Palantir Technologies, Inc. Presentation and analysis of user interaction data
US8938686B1 (en) 2013-10-03 2015-01-20 Palantir Technologies Inc. Systems and methods for analyzing performance of an entity
US8812960B1 (en) 2013-10-07 2014-08-19 Palantir Technologies Inc. Cohort-based presentation of user interaction data
US20150106170A1 (en) 2013-10-11 2015-04-16 Adam BONICA Interface and methods for tracking and analyzing political ideology and interests
US8924872B1 (en) 2013-10-18 2014-12-30 Palantir Technologies Inc. Overview user interface of emergency call data of a law enforcement agency
US9116975B2 (en) 2013-10-18 2015-08-25 Palantir Technologies Inc. Systems and user interfaces for dynamic and interactive simultaneous querying of multiple data stores
US8786605B1 (en) 2013-10-24 2014-07-22 Palantir Technologies Inc. Systems and methods for distance and congestion-aware resource deployment
US8832594B1 (en) 2013-11-04 2014-09-09 Palantir Technologies Inc. Space-optimized display of multi-column tables with selective text truncation based on a combined text width
US9021384B1 (en) 2013-11-04 2015-04-28 Palantir Technologies Inc. Interactive vehicle information map
US9396246B2 (en) 2013-11-08 2016-07-19 International Business Machines Corporation Reporting and summarizing metrics in sparse relationships on an OLTP database
US8868537B1 (en) 2013-11-11 2014-10-21 Palantir Technologies, Inc. Simple web search
US9235638B2 (en) 2013-11-12 2016-01-12 International Business Machines Corporation Document retrieval using internal dictionary-hierarchies to adjust per-subject match results
US9356937B2 (en) 2013-11-13 2016-05-31 International Business Machines Corporation Disambiguating conflicting content filter rules
US9727622B2 (en) 2013-12-16 2017-08-08 Palantir Technologies, Inc. Methods and systems for analyzing entity performance
EP2884440A1 (en) 2013-12-16 2015-06-17 Palantir Technologies, Inc. Methods and systems for analyzing entity performance
US9552615B2 (en) 2013-12-20 2017-01-24 Palantir Technologies Inc. Automated database analysis to detect malfeasance
US20150178825A1 (en) 2013-12-23 2015-06-25 Citibank, N.A. Methods and Apparatus for Quantitative Assessment of Behavior in Financial Entities and Transactions
US10356032B2 (en) 2013-12-26 2019-07-16 Palantir Technologies Inc. System and method for detecting confidential information emails
US20150186821A1 (en) 2014-01-02 2015-07-02 Palantir Technologies Inc. Computer-implemented methods and systems for analyzing healthcare data
US8832832B1 (en) 2014-01-03 2014-09-09 Palantir Technologies Inc. IP reputation
US9043696B1 (en) 2014-01-03 2015-05-26 Palantir Technologies Inc. Systems and methods for visual definition of data associations
US9836502B2 (en) 2014-01-30 2017-12-05 Splunk Inc. Panel templates for visualization of data within an interactive dashboard
US20150235334A1 (en) 2014-02-20 2015-08-20 Palantir Technologies Inc. Healthcare fraud sharing system
US9483162B2 (en) 2014-02-20 2016-11-01 Palantir Technologies Inc. Relationship visualizations
US9009827B1 (en) 2014-02-20 2015-04-14 Palantir Technologies Inc. Security sharing system
US9857958B2 (en) 2014-04-28 2018-01-02 Palantir Technologies Inc. Systems and user interfaces for dynamic and interactive access of, investigation of, and analysis of data objects stored in one or more databases
US9009171B1 (en) 2014-05-02 2015-04-14 Palantir Technologies Inc. Systems and methods for active column filtering
US20150324868A1 (en) 2014-05-12 2015-11-12 Quixey, Inc. Query Categorizer
KR102147246B1 (en) 2014-05-26 2020-08-24 삼성전자 주식회사 Method and apparatus to improve performance in communication network
US9536329B2 (en) 2014-05-30 2017-01-03 Adobe Systems Incorporated Method and apparatus for performing sentiment analysis based on user reactions to displayable content
US9619557B2 (en) 2014-06-30 2017-04-11 Palantir Technologies, Inc. Systems and methods for key phrase characterization of documents
US9535974B1 (en) 2014-06-30 2017-01-03 Palantir Technologies Inc. Systems and methods for identifying key phrase clusters within documents
US9129219B1 (en) 2014-06-30 2015-09-08 Palantir Technologies, Inc. Crime risk forecasting
US9256664B2 (en) 2014-07-03 2016-02-09 Palantir Technologies Inc. System and method for news events detection and visualization
US9021260B1 (en) 2014-07-03 2015-04-28 Palantir Technologies Inc. Malware data item analysis
US9202249B1 (en) 2014-07-03 2015-12-01 Palantir Technologies Inc. Data item clustering and analysis
EP3731166A1 (en) 2014-07-03 2020-10-28 Palantir Technologies Inc. Data clustering
US10133806B2 (en) 2014-07-31 2018-11-20 Splunk Inc. Search result replication in a search head cluster
US9454281B2 (en) 2014-09-03 2016-09-27 Palantir Technologies Inc. System for providing dynamic linked panels in user interface
US9785328B2 (en) 2014-10-06 2017-10-10 Palantir Technologies Inc. Presentation of multivariate data on a graphical user interface of a computing system
US9146954B1 (en) 2014-10-09 2015-09-29 Splunk, Inc. Creating entity definition from a search result set
US9229952B1 (en) 2014-11-05 2016-01-05 Palantir Technologies, Inc. History preserving data pipeline system and method
US9043894B1 (en) 2014-11-06 2015-05-26 Palantir Technologies Inc. Malicious software detection in a computing system
US9348920B1 (en) 2014-12-22 2016-05-24 Palantir Technologies Inc. Concept indexing among database of documents using machine learning techniques
US9367872B1 (en) 2014-12-22 2016-06-14 Palantir Technologies Inc. Systems and user interfaces for dynamic and interactive investigation of bad actor behavior based on automatic clustering of related data in various data structures
US10552994B2 (en) 2014-12-22 2020-02-04 Palantir Technologies Inc. Systems and interactive user interfaces for dynamic retrieval, analysis, and triage of data items
US20160253672A1 (en) 2014-12-23 2016-09-01 Palantir Technologies, Inc. System and methods for detecting fraudulent transactions
US9952934B2 (en) 2015-01-20 2018-04-24 Commvault Systems, Inc. Synchronizing selected portions of data in a storage management system

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080072180A1 (en) * 2006-09-15 2008-03-20 Emc Corporation User readability improvement for dynamic updating of search results
US20110016118A1 (en) * 2009-07-20 2011-01-20 Lexisnexis Method and apparatus for determining relevant search results using a matrix framework
US10318630B1 (en) * 2016-11-21 2019-06-11 Palantir Technologies Inc. Analysis of large bodies of textual data

Cited By (1)

Publication number Priority date Publication date Assignee Title
US11113286B2 (en) * 2019-12-26 2021-09-07 Snowflake Inc. Generation of pruning index for pattern matching queries

Also Published As

Publication number Publication date
US10318630B1 (en) 2019-06-11

Similar Documents

Publication Publication Date Title
US10318630B1 (en) Analysis of large bodies of textual data
US11657044B2 (en) Semantic parsing engine
US11954300B2 (en) User interface based variable machine modeling
US9946703B2 (en) Title extraction using natural language processing
US11640436B2 (en) Methods and systems for query segmentation
US20230177579A1 (en) System and method for computing features that apply to infrequent queries
US20200020000A1 (en) Generating product descriptions from user reviews
US10528871B1 (en) Structuring data in a knowledge graph
US20210056265A1 (en) Snippet generation and item description summarizer
US20190130023A1 (en) Expanding search queries
US10552465B2 (en) Generating text snippets using universal concept graph
US20180276302A1 (en) Search provider selection using statistical characterizations
US11334564B2 (en) Expanding search queries
US20200005181A1 (en) Vector generation for distributed data sets
CN110168591B (en) Determining industry similarity to enhance job searching
AU2017280238B2 (en) Search system employing result feedback
EP3430528A1 (en) Catalogue management
WO2015171952A1 (en) Methods and systems to identify query recommendations
US20200387517A1 (en) Search result page ranking optimization
US10769695B2 (en) Generating titles for a structured browse page
US20170161814A1 (en) Discovering products in item inventory
WO2017015792A1 (en) Sql performance recommendations and scoring

Legal Events

Date Code Title Description
AS Assignment

Owner name: PALANTIR TECHNOLOGIES INC., CALIFORNIA

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:KESIN, MAXIM;GRIBELYUK, PAUL;SIGNING DATES FROM 20180110 TO 20180302;REEL/FRAME:049976/0825

AS Assignment

Owner name: ROYAL BANK OF CANADA, AS ADMINISTRATIVE AGENT, CANADA

Free format text: SECURITY INTEREST;ASSIGNOR:PALANTIR TECHNOLOGIES INC.;REEL/FRAME:051709/0471

Effective date: 20200127

Owner name: MORGAN STANLEY SENIOR FUNDING, INC., AS ADMINISTRATIVE AGENT, NEW YORK

Free format text: SECURITY INTEREST;ASSIGNOR:PALANTIR TECHNOLOGIES INC.;REEL/FRAME:051713/0149

Effective date: 20200127

STPP Information on status: patent application and granting procedure in general

Free format text: PRE-INTERVIEW COMMUNICATION MAILED

STPP Information on status: patent application and granting procedure in general

Free format text: NON FINAL ACTION MAILED

AS Assignment

Owner name: PALANTIR TECHNOLOGIES INC., CALIFORNIA

Free format text: RELEASE BY SECURED PARTY;ASSIGNOR:ROYAL BANK OF CANADA;REEL/FRAME:052856/0382

Effective date: 20200604

Owner name: MORGAN STANLEY SENIOR FUNDING, INC., NEW YORK

Free format text: SECURITY INTEREST;ASSIGNOR:PALANTIR TECHNOLOGIES INC.;REEL/FRAME:052856/0817

Effective date: 20200604

STPP Information on status: patent application and granting procedure in general

Free format text: RESPONSE TO NON-FINAL OFFICE ACTION ENTERED AND FORWARDED TO EXAMINER

STPP Information on status: patent application and granting procedure in general

Free format text: FINAL REJECTION MAILED

STPP Information on status: patent application and granting procedure in general

Free format text: RESPONSE AFTER FINAL ACTION FORWARDED TO EXAMINER

STCB Information on status: application discontinuation

Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION

AS Assignment

Owner name: PALANTIR TECHNOLOGIES INC., CALIFORNIA

Free format text: CORRECTIVE ASSIGNMENT TO CORRECT THE ERRONEOUSLY LISTED PATENT BY REMOVING APPLICATION NO. 16/832267 FROM THE RELEASE OF SECURITY INTEREST PREVIOUSLY RECORDED ON REEL 052856 FRAME 0382. ASSIGNOR(S) HEREBY CONFIRMS THE RELEASE OF SECURITY INTEREST;ASSIGNOR:ROYAL BANK OF CANADA;REEL/FRAME:057335/0753

Effective date: 20200604

AS Assignment

Owner name: WELLS FARGO BANK, N.A., NORTH CAROLINA

Free format text: ASSIGNMENT OF INTELLECTUAL PROPERTY SECURITY AGREEMENTS;ASSIGNOR:MORGAN STANLEY SENIOR FUNDING, INC.;REEL/FRAME:060572/0640

Effective date: 20220701

Owner name: WELLS FARGO BANK, N.A., NORTH CAROLINA

Free format text: SECURITY INTEREST;ASSIGNOR:PALANTIR TECHNOLOGIES INC.;REEL/FRAME:060572/0506

Effective date: 20220701