US20040163034A1 - Systems and methods for labeling clusters of documents

Publication number
US20040163034A1
Authority
US
United States
Prior art keywords
documents
topics
ones
associated
clusters
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
US10/685,479
Inventor
Sean Colbath
Francis Kubala
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Raytheon BBN Technologies Corp
Original Assignee
Raytheon BBN Technologies Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority to US41921402P
Application filed by Raytheon BBN Technologies Corp
Priority to US10/685,479
Assigned to BBNT SOLUTIONS LLC: ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: KUBALA, FRANCIS G., COLBATH, SEAN
Assigned to FLEET NATIONAL BANK, AS AGENT: PATENT & TRADEMARK SECURITY AGREEMENT. Assignors: BBNT SOLUTIONS LLC
Publication of US20040163034A1
Assigned to BBNT SOLUTIONS LLC: CORRECTIVE ASSIGNMENT TO CORRECT THE ASSIGNEE'S ADDRESS, PREVIOUSLY RECORDED AT REEL 014608 FRAME 0961. Assignors: KUBALA, FRANCIS G., COLBATH, SEAN
Assigned to BBN TECHNOLOGIES CORP.: MERGER (SEE DOCUMENT FOR DETAILS). Assignors: BBNT SOLUTIONS LLC
Assigned to BBN TECHNOLOGIES CORP. (AS SUCCESSOR BY MERGER TO BBNT SOLUTIONS LLC): RELEASE OF SECURITY INTEREST. Assignors: BANK OF AMERICA, N.A. (SUCCESSOR BY MERGER TO FLEET NATIONAL BANK)
Application status is Abandoned

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/28 Constructional details of speech recognition systems
    • G10L15/32 Multiple recognisers used in sequence or in parallel; Score combination systems therefor, e.g. voting systems

Abstract

A system (520) generates labels for clusters of documents. The system (520) identifies topics associated with the documents in the clusters and determines whether the topics are associated with approximately half or more of the documents in the clusters. The system (520) then generates labels for the clusters using the topics that are associated with approximately half or more of the documents in the clusters.

Description

    RELATED APPLICATION
  • This application is related to U.S. application Ser. No. 10/ ______ (Docket No. 02-4034), entitled “SYSTEMS AND METHODS FOR INTERACTIVE CLUSTERING OF DOCUMENTS,” filed concurrently herewith and incorporated herein by reference. [0001]
  • This application claims priority under 35 U.S.C. § 119 based on U.S. Provisional Application No. 60/419,214, filed Oct. 17, 2002, the contents of which are incorporated herein by reference.[0002]
  • GOVERNMENT CONTRACT
  • [0003] The U.S. Government may have a paid-up license in this invention and the right in limited circumstances to require the patent owner to license others on reasonable terms as provided for by the terms of Contract No. N66001-00-C-8008 awarded by the Defense Advanced Research Projects Agency.
  • BACKGROUND OF THE INVENTION
  • 1. Field of the Invention [0004]
  • The present invention relates generally to multimedia environments and, more particularly, to systems and methods for labeling clusters of similar documents. [0005]
  • 2. Description of Related Art [0006]
  • When trying to organize large collections of documents, it is sometimes useful to organize these documents into similar groupings, where similarity is determined by some metric, such as the topics of the documents or their relevance to a particular event. Conventional systems typically receive streams of documents and group the documents into clusters that ideally concern a single event, or more typically, a single topic. [0007]
  • One particular conventional system includes an event or topic detection system that uses natural language techniques to make a decision about each of the documents it receives. The decision involves the determination of whether a particular document relates to a new event (or topic) that the system has not seen before or an existing event (or topic) that the system has seen before. If the document relates to a new event, then the system creates a new cluster and assigns the document to this new cluster. If the document, instead, relates to an existing event, then the system assigns the document to an existing cluster relating to the event. [0008]
  • The system usually operates based on a set of rules. One rule is that a document can only be assigned to one cluster. Another rule is that the clusters can only grow and may never be broken. To this effect, the system may never revisit documents that have already been assigned to clusters to determine whether the documents should have been assigned to different clusters. [0009]
  • The conventional system usually presents the clusters to an end user with no labeling other than, possibly, the number of documents in the clusters. This is of limited usefulness to a user looking for a document in one of the clusters. [0010]
  • As a result, there is a need for a labeling scheme that creates cluster labels that are indicative of the documents in the clusters and are meaningful to an end user. [0011]
  • SUMMARY OF THE INVENTION
  • Systems and methods consistent with the present invention address this and other needs by creating labels for clusters based on document topics that are associated with at least half of the documents in the clusters. The topics may be ranked based on the number of documents relating to the corresponding topics. The topics may then be presented in rank order as labels for the clusters. [0012]
  • In one aspect consistent with the principles of the invention, a system that generates labels for clusters of documents is provided. The system identifies topics associated with the documents in the clusters and determines whether the topics are associated with approximately half or more of the documents in the clusters. The system then generates labels for the clusters using the topics that are associated with approximately half or more of the documents in the clusters. [0013]
  • In another aspect consistent with the present invention, a method of creating labels for clusters of documents is provided. The method includes identifying topics associated with the documents in the clusters; determining whether the topics are associated with at least half of the documents in the clusters; adding ones of the topics that are associated with at least half of the documents in the clusters to cluster lists; and forming labels for the clusters from the cluster lists. [0014]
  • In yet another aspect consistent with the present invention, a system for creating a label for a cluster of documents is provided. The system is configured to identify topics associated with the documents in the cluster and determine whether the topics are associated with approximately half or more of the documents in the cluster. The system is further configured to rank the topics that are associated with approximately half or more of the documents in the cluster and generate a label for the cluster using the ranked topics. [0015]
  • In a further aspect consistent with the present invention, a topic detection system is provided. The topic detection system includes a decision engine and a label engine. The decision engine is configured to receive documents and group the documents into clusters. The label engine is configured to identify topics associated with the documents in the clusters, determine whether the topics are associated with at least half of the documents in the clusters, and form labels for the clusters using the topics that are associated with at least half of the documents in the clusters.[0016]
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate the invention and, together with the description, explain the invention. In the drawings, [0017]
  • FIG. 1 is a diagram of a system in which systems and methods consistent with the present invention may be implemented; [0018]
  • FIG. 2 is an exemplary diagram of the server system of FIG. 1 according to an implementation consistent with the principles of the invention; [0019]
  • FIG. 3 is an exemplary diagram of the server of FIG. 2 according to an implementation consistent with the principles of the invention; [0020]
  • FIG. 4 is an exemplary diagram of a portion of the indexing system of FIG. 2 according to an implementation consistent with the principles of the invention; [0021]
  • FIG. 5 is an exemplary diagram of the event detection system of FIG. 2 according to an implementation consistent with the present invention; [0022]
  • FIG. 6 is a flowchart of exemplary processing for grouping documents into clusters according to an implementation consistent with the principles of the invention; [0023]
  • FIG. 7 is a flowchart of exemplary processing for creating a label for a cluster according to an implementation consistent with the principles of the invention; and [0024]
  • FIGS. 8A and 8B are exemplary diagrams of a graphical user interface that may be presented to a user according to an implementation consistent with the principles of the invention.[0025]
  • DETAILED DESCRIPTION
  • The following detailed description of the invention refers to the accompanying drawings. The same reference numbers in different drawings may identify the same or similar elements. Also, the following detailed description does not limit the invention. Instead, the scope of the invention is defined by the appended claims and equivalents. [0026]
  • Systems and methods consistent with the present invention create cluster labels that are indicative of the documents in the clusters and are meaningful to an end user. The labels may be based on document topics that are associated with at least half of the documents in the clusters. The topics may be ranked based on their occurrence in the documents of the cluster. The topics may then be presented in rank order as a label for the cluster. [0027]
  • In the discussion that follows, a document corresponds to a body of media that is contiguous in time (from beginning to end or from time A to time B). Documents might include audio documents (e.g., radio broadcasts), video documents (e.g., television broadcasts), and/or text documents (e.g., word processing documents) in any language. [0028]
  • Exemplary System
  • FIG. 1 is a diagram of an exemplary system 100 in which systems and methods consistent with the present invention may be implemented. System 100 may include clients 110 connected to server system 120 via a network 130. Network 130 may include any type of network, such as a local area network (LAN), a wide area network (WAN), a public telephone network (e.g., the Public Switched Telephone Network (PSTN)), a virtual private network (VPN), or a combination of networks. Clients 110 and server system 120 may connect to network 130 via wired, wireless, and/or optical connections. [0029]
  • Generally, clients 110 may interact with server system 120 to obtain documents of interest. A user of one of clients 110 may then cause the documents to be automatically grouped into clusters on demand. A client 110 may include a personal computer, a laptop, a personal digital assistant, or another type of device that is capable of interacting with server system 120 to obtain documents of interest. A client 110 may present the documents to a user via a graphical user interface (GUI), possibly within a web browser window. [0030]
  • Generally, server system 120 may process and maintain documents. Server system 120 may receive documents in a wide variety of formats (e.g., audio, video, and text) and process the documents to extract features and other relevant information from the documents. Server system 120 may also group documents into clusters and, when requested, provide documents to clients 110. [0031]
  • FIG. 2 is an exemplary diagram of server system 120 according to an implementation consistent with the principles of the invention. Server system 120 may include a server 210, an indexing system 220, an event detection system 230, and a database 240 connected via a network 250. Network 250 may include a LAN, WAN, the Internet, network 130, or other types of direct or indirect connections. [0032]
  • Server 210 may include a computer or another type of device capable of interacting with clients 110. In one implementation consistent with the principles of the invention, server 210 includes indexing system 220 and/or event detection system 230. [0033]
  • FIG. 3 is an exemplary diagram of server 210 according to an implementation consistent with the principles of the invention. Server 210 may include bus 310, processor 320, main memory 330, read only memory (ROM) 340, storage device 350, input device 360, output device 370, and communication interface 380. Bus 310 permits communication among the components of server 210. [0034]
  • Processor 320 may include any type of conventional processor or microprocessor that interprets and executes instructions. Main memory 330 may include a random access memory (RAM) or another type of dynamic storage device that stores information and instructions for execution by processor 320. ROM 340 may include a conventional ROM device or another type of static storage device that stores static information and instructions for use by processor 320. Storage device 350 may include a magnetic and/or optical recording medium and its corresponding drive. [0035]
  • Input device 360 may include one or more conventional mechanisms that permit an operator to input information to server 210, such as a keyboard, a mouse, a pen, voice recognition and/or biometric mechanisms, etc. Output device 370 may include one or more conventional mechanisms that output information to the operator, including a display, a printer, a speaker, etc. Communication interface 380 may include any transceiver-like mechanism that enables server 210 to communicate with other devices and/or systems. For example, communication interface 380 may include mechanisms for communicating with another device or system via a network, such as network 250 or network 130. [0036]
  • As will be described in detail below, server 210, consistent with the present invention, may interact with clients 110, event detection system 230, and/or database 240 to provide documents of interest. Server 210 may perform these tasks in response to processor 320 executing sequences of instructions contained in, for example, memory 330. Alternatively, hardwired circuitry may be used in place of or in combination with software instructions to implement processes consistent with the present invention. Thus, processes performed by server 210 are not limited to any specific combination of hardware circuitry and software. [0037]
  • Returning to FIG. 2, indexing system 220 may receive document data, including real time data, in a variety of formats (e.g., audio, video, and text), process the data to extract features and other relevant information from the documents, and record the date and time at which the documents were created. In one implementation consistent with the principles of the invention, indexing system 220 may include mechanisms, such as the ones described in John Makhoul et al., “Speech and Language Technologies for Audio Indexing and Retrieval,” Proceedings of the IEEE, Vol. 88, No. 8, August 2000, pp. 1338-1353, which is incorporated herein by reference. [0038]
  • FIG. 4 is an exemplary diagram of a portion of indexing system 220 according to an implementation consistent with the principles of the invention. The portion of indexing system 220 shown in FIG. 4 operates upon audio documents. Indexing system 220 may include similar or dissimilar mechanisms for operating upon other types of media, such as video and text. [0039]
  • As shown in FIG. 4, indexing system 220 includes audio classification logic 410, speech recognition logic 420, speaker clustering logic 430, speaker identification logic 440, name spotting logic 450, topic classification logic 460, and story segmentation logic 470. Audio classification logic 410 may distinguish speech from silence, noise, and other audio signals in input audio data. For example, audio classification logic 410 may analyze each thirty second window of the input data to determine whether it contains speech. Audio classification logic 410 may also identify boundaries between speakers in the input stream. Audio classification logic 410 may group speech segments from the same speaker and send the segments to speech recognition logic 420. [0040]
  • Speech recognition logic 420 may perform continuous speech recognition to recognize the words spoken in the segments that it receives from audio classification logic 410. Speech recognition logic 420 may generate a transcription of the speech using a statistical language model. Speaker clustering logic 430 may identify all of the segments from the same speaker in a single document and group them into speaker clusters. Speaker clustering logic 430 may then assign each of the speaker clusters a unique label. Speaker identification logic 440 may identify the speaker in each speaker cluster by name or gender. [0041]
  • Name spotting logic 450 may locate the names of people, places, and organizations in the transcription. Name spotting logic 450 may extract the names and store them in a database. Topic classification logic 460 may use a probabilistic Hidden Markov Model (HMM) to assign topics to the transcription. In one implementation consistent with the present invention, topic classification logic 460 uses a technique similar to the one described in John Makhoul et al., “Speech and Language Technologies for Audio Indexing and Retrieval,” Proceedings of the IEEE, Vol. 88, No. 8, August 2000, pp. 1338-1353, which was previously incorporated by reference. Topic classification logic 460 may generate a rank-ordered list of all possible topics and corresponding scores for the transcription. [0042]
  • Story segmentation logic 470 may change the continuous stream of words in the transcription into document-like units with coherent sets of topic labels and other document features generated or identified by the components of indexing system 220. This information may constitute metadata corresponding to the input audio data. Story segmentation logic 470 may store the metadata in database 240. [0043]
  • Returning to FIG. 2, event detection system 230 may group documents into clusters based on events or topics to which the documents relate. FIG. 5 is an exemplary diagram of event detection system 230 according to an implementation consistent with the principles of the invention. Event detection system 230 may include a decision engine 510 and a label engine 520. The decision engine 510 may include a conventional event or topic detection system, such as the Topic Detection and Tracking system developed by the University of Massachusetts, Amherst, as described in J. Allan et al., “UMass at TDT2000,” November 2000, pages 109-115. [0044]
  • Decision engine 510 may include logic that receives a stream of documents over time from, for example, indexing system 220 and/or server 210, and determines, for each of the documents, whether the document is related to an event or topic that decision engine 510 has seen before. If the document is related to a new event or topic (i.e., one that has not yet been seen by decision engine 510), then decision engine 510 may create a new cluster relating to the event or topic and assign the document to the new cluster. If the document is, instead, related to an existing event or topic, then decision engine 510 may assign the document to an existing cluster that is also related to the event or topic. [0045]
  • Decision engine 510 may follow the same rules as conventional systems. In other words, decision engine 510 may assign a document to only one cluster. Decision engine 510 may also get only one chance to make a decision about a document and, thereafter, may not change its decision regarding the cluster to which the document is assigned. Decision engine 510 may store its document assignment decisions in an internal memory or, alternatively, in database 240. [0046]
  • Label engine 520 may include logic that creates labels for the clusters generated by decision engine 510. In another implementation, the functions of label engine 520 are performed by server 210. For each of the clusters, label engine 520 may examine the topics assigned to the cluster documents by indexing system 220. Label engine 520 may then label the cluster with the topics that appear on at least half of the documents in the cluster. The theory is that if a topic does not appear on at least half of the documents in the cluster, then the topic is not representative of the cluster. [0047]
  • Label engine 520 may rank the topics assigned to a cluster. For example, a topic that is associated with more of the documents in the cluster may be ranked higher than a topic associated with fewer of the documents in the cluster. This ranked list of topics may form a label for the cluster. The clusters with attached labels may be presented to a user via client 110. [0048]
  • Returning to FIG. 2, database 240 may include a relational database that stores documents from indexing system 220 and, possibly, cluster information from event detection system 230. The contents of database 240 may be accessible to users via clients 110. [0049]
  • Exemplary Processing
  • FIG. 6 is a flowchart of exemplary processing for grouping documents into clusters according to an implementation consistent with the principles of the invention. Processing may begin with decision engine 510 receiving a stream of documents over time (act 610). Decision engine 510 may receive the documents from indexing system 220 and/or server 210. [0050]
  • Decision engine 510 may operate upon the documents to group the documents into clusters (act 620). For example, decision engine 510 may determine, for each of the documents, whether the document relates to a new event (or topic) that decision engine 510 has not seen before or an existing event (or topic) that decision engine 510 has seen before. If the document relates to a new event (or topic), then decision engine 510 creates a new cluster and assigns the document to this new cluster. If the document, instead, relates to an existing event (or topic), then decision engine 510 assigns the document to an existing cluster relating to the event (or topic). [0051]
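  • The single-pass grouping of act 620 might be sketched as follows. This is an illustration only: the description does not say how decision engine 510 decides that a document relates to an existing event, so the word-overlap (Jaccard) similarity, the `threshold` value of 0.3, and the names `jaccard` and `assign_stream` are all assumptions made for the example.

```python
def jaccard(a, b):
    """Word-set overlap; a stand-in similarity metric, not taken from the source."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def assign_stream(documents, threshold=0.3):
    """Single-pass clustering: each document is assigned once and never revisited.

    documents: iterable of (doc_id, text) pairs in arrival order.
    threshold: hypothetical cut-off for "relates to an existing event".
    """
    clusters = []  # each cluster: {"words": set of words, "docs": list of doc ids}
    for doc_id, text in documents:
        words = set(text.lower().split())
        best, best_score = None, 0.0
        for cluster in clusters:
            score = jaccard(words, cluster["words"])
            if score > best_score:
                best, best_score = cluster, score
        if best is not None and best_score >= threshold:
            # Existing event: assign to the closest existing cluster.
            best["docs"].append(doc_id)
            best["words"] |= words
        else:
            # New event: create a new cluster holding only this document.
            clusters.append({"words": words, "docs": [doc_id]})
    return clusters


stream = [
    ("d1", "earthquake strikes coastal city"),
    ("d2", "rescue teams reach coastal city after earthquake"),
    ("d3", "central bank raises interest rates"),
]
for cluster in assign_stream(stream):
    print(cluster["docs"])
```

  • In this sketch a cluster can only grow, and a document belongs to exactly one cluster, which mirrors the rules stated for decision engine 510.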
  • Label engine 520 may create labels for the clusters generated by decision engine 510 (act 630). Label engine 520 may create a label or reassess a previous label assignment for a cluster on a periodic basis, when a new document is assigned to the cluster, or when cluster information is requested by a user (via client 110). [0052]
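  • One way the reassessment triggers of act 630 could be supported is to keep running per-topic document counts for each cluster, so that a label can be refreshed without rescanning every document. The class below is a sketch under that assumption; `ClusterLabeler`, `add_document`, and `refresh` are hypothetical names, and the alphabetical tie-break is a choice made for the example rather than something the description specifies.

```python
from collections import Counter


class ClusterLabeler:
    """Keeps per-topic document counts so a cluster label can be refreshed cheaply."""

    def __init__(self):
        self.n_docs = 0
        self.topic_counts = Counter()
        self.label = []

    def add_document(self, topics):
        """Trigger: a new document is assigned to the cluster."""
        self.n_docs += 1
        self.topic_counts.update(set(topics))  # a topic counts once per document
        return self.refresh()

    def refresh(self):
        """Recompute the label; also usable for periodic or on-request triggers."""
        kept = [t for t, c in self.topic_counts.items() if 2 * c >= self.n_docs]
        # Rank by document count, highest first; ties alphabetical (assumption).
        self.label = sorted(kept, key=lambda t: (-self.topic_counts[t], t))
        return self.label


labeler = ClusterLabeler()
print(labeler.add_document({"flood", "weather"}))
print(labeler.add_document({"flood"}))
print(labeler.add_document({"flood", "insurance"}))
```

  • The label may shrink as the cluster grows: a topic that covered half of a two-document cluster may fall below half once a third document arrives.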
  • FIG. 7 is a flowchart of exemplary processing for creating a label for a cluster according to an implementation consistent with the principles of the invention. Processing may begin with label engine 520 identifying the topics assigned to the documents in the cluster (act 710). In one implementation, label engine 520 obtains the topic information from indexing system 220. In another implementation, label engine 520 generates the topic information, possibly using a technique similar to the one described in John Makhoul et al., “Speech and Language Technologies for Audio Indexing and Retrieval,” Proceedings of the IEEE, Vol. 88, No. 8, August 2000, pp. 1338-1353, which was previously incorporated by reference. In yet another implementation, label engine 520 obtains the topic information in some other manner. [0053]
  • Label engine 520 may then examine each of the topics in the cluster. For example, label engine 520 may determine whether a topic M (where M≧1) is associated with at least half of the documents in the cluster (act 720). If so, label engine 520 may add topic M to a cluster list (act 730). If topic M is not associated with at least half of the documents in the cluster, label engine 520 may determine whether all of the topics in the cluster have been considered (act 740). If one or more topics have not yet been considered, then label engine 520 may examine the next topic (M+1), returning to act 720. [0054]
  • If all of the topics have been considered, then label engine 520 may rank the topics in the cluster list to form a label for the cluster (act 750). For example, label engine 520 may rank the topic that is associated with the largest number of the documents in the cluster higher than all other topics. Label engine 520 may rank the topic associated with the next largest number of the documents in the cluster higher than all other remaining topics, and so on down to one or more topics that are associated with half of the documents in the cluster. Label engine 520 may use this ranked list of topics to form a label for the cluster. [0055]
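  • Acts 710 through 750 might be sketched as below. The sketch assumes each document's topics are already available as a collection of strings (for example, from indexing system 220); the function name `label_cluster`, the input layout, and the alphabetical tie-break between equally frequent topics are assumptions introduced for the example.

```python
from collections import Counter


def label_cluster(cluster_docs):
    """Return the ranked topic list that serves as the cluster label.

    cluster_docs: list of topic collections, one per document (hypothetical layout).
    """
    n_docs = len(cluster_docs)

    # Act 710: identify the topics and count the documents each is assigned to.
    doc_counts = Counter()
    for topics in cluster_docs:
        doc_counts.update(set(topics))  # set() so a topic counts once per document

    # Acts 720-740: keep each topic associated with at least half of the documents.
    cluster_list = [t for t, c in doc_counts.items() if 2 * c >= n_docs]

    # Act 750: rank by document count, highest first; ties alphabetical (assumption).
    cluster_list.sort(key=lambda t: (-doc_counts[t], t))
    return cluster_list


docs = [
    {"elections", "economy"},
    {"elections", "europe"},
    {"elections", "economy", "sports"},
    {"elections"},
]
print(label_cluster(docs))  # ['elections', 'economy']
```

  • Here "elections" appears on all four documents and "economy" on exactly two of four, so both meet the at-least-half test, while "europe" and "sports" (one document each) are left off the label.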
  • Label engine 520, or event detection system 230, may store cluster information in database 240. The cluster information, in this case, may include information regarding the clusters to which the documents are assigned and the labels associated with those clusters. [0056]
  • Returning to FIG. 6, server 210 may present the cluster information to a user upon request (act 640). For example, server 210 may send the cluster information to client 110 for display via, for example, a graphical user interface, such as a browser interface. The cluster information may be presented to the user as a list of clusters that may be sorted based on the number of documents contained in the clusters. In other words, clusters containing larger numbers of documents may be presented higher on the list than clusters containing smaller numbers of documents. The clusters may include assigned labels to make the cluster list meaningful to the user. [0057]
  • FIGS. 8A and 8B are exemplary diagrams of a graphical user interface that may be presented to a user according to an implementation consistent with the principles of the invention. If the user requests to view the clusters generated by event detection system 230, the user may be presented with a graphical user interface, possibly in the form of a browser interface, such as graphical user interface 800 in FIG. 8A. Graphical user interface 800 may include cluster data 810, barchart view option 820, and timeline view option 830. [0058]
  • Cluster data 810 may include data that identifies the current document count and the current cluster count. The current document count may specify the total number of documents that have been received and processed by event detection system 230. The current cluster count may specify the total number of clusters in which the documents have been grouped. [0059]
  • Barchart view option 820 and timeline view option 830 are two manners by which the clusters may be presented to the user. In other implementations consistent with the present invention, there are more or fewer ways of presenting the clusters to the user. Barchart view option 820 may display the clusters in the form of a barchart. Timeline view option 830 may display the clusters in the form of a timeline. [0060]
  • FIG. 8B is an exemplary diagram of graphical user interface 800 that may be presented when providing clusters in barchart form according to an implementation consistent with the principles of the invention. Graphical user interface 800 may present the clusters as a series of bars, the lengths of which relate to the number of documents in the clusters. The bars may be sorted by cluster size, with larger clusters being presented first. Each of the bars may have an associated label that corresponds to the label generated for the cluster by label engine 520. [0061]
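  • The barchart presentation might be approximated in plain text as follows. This is a rough sketch, not the graphical user interface 800 itself: the function name `render_barchart`, the `size` and `label` keys, the `#` bar character, and the `max_width` scaling are all assumptions, and every cluster is assumed to hold at least one document.

```python
def render_barchart(clusters, max_width=40):
    """Return text lines, one bar per cluster, largest cluster first.

    clusters: list of {"size": int >= 1, "label": list of topic strings}.
    """
    ordered = sorted(clusters, key=lambda c: c["size"], reverse=True)
    largest = ordered[0]["size"] if ordered else 0
    lines = []
    for c in ordered:
        # Bar length is proportional to the number of documents in the cluster.
        bar_len = max(1, round(max_width * c["size"] / largest))
        lines.append(f'{"#" * bar_len} {c["size"]:>3}  {", ".join(c["label"])}')
    return lines


clusters = [
    {"size": 12, "label": ["economy", "trade"]},
    {"size": 30, "label": ["elections"]},
    {"size": 6, "label": ["weather"]},
]
for line in render_barchart(clusters, max_width=20):
    print(line)
```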
  • The user may select one of the bars to view the documents included in the cluster. The documents may then be presented to the user in chronological order (i.e., sorted based on the date and time at which the document was created), with the more recent documents being presented first. In other implementations, the documents are presented in other ways. [0062]
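  • The chronological presentation of a selected cluster's documents could be sketched as a sort on the creation date and time recorded by indexing system 220. The record layout (`id` and `created` keys) and the function name `newest_first` are hypothetical.

```python
from datetime import datetime


def newest_first(documents):
    """Order documents by creation date and time, most recent first."""
    return sorted(documents, key=lambda d: d["created"], reverse=True)


cluster_documents = [
    {"id": "d1", "created": datetime(2002, 10, 15, 9, 30)},
    {"id": "d2", "created": datetime(2002, 10, 17, 18, 0)},
    {"id": "d3", "created": datetime(2002, 10, 16, 7, 45)},
]
for doc in newest_first(cluster_documents):
    print(doc["created"].isoformat(), doc["id"])
```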
  • CONCLUSION
  • Systems and methods consistent with the present invention create labels for clusters of documents, such that the labels are indicative of the documents in the cluster and are valuable to an end user seeking a document in one of the clusters. The labels may be based on document topics that are associated with at least half of the documents in the clusters. The topics may be ranked based on the number of documents with which the topics are associated. The topics may then be presented in rank order as a label for the cluster. [0063]
  • The foregoing description of preferred embodiments of the present invention provides illustration and description, but is not intended to be exhaustive or to limit the invention to the precise form disclosed. Modifications and variations are possible in light of the above teachings or may be acquired from practice of the invention. [0064]
  • For example, it has been described that only topics that are associated with at least half of the documents in the cluster are used for the cluster label. In other implementations, the criterion may be changed to include topics associated with more or fewer than half of the documents. [0065]
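  • A variation with an adjustable criterion might look like the sketch below, where the one-half test becomes a parameter. The name `topics_meeting_fraction` and the parameter `min_fraction` are assumptions for illustration; the description states only that the criterion may differ from one half.

```python
from collections import Counter


def topics_meeting_fraction(cluster_docs, min_fraction=0.5):
    """Rank the topics found on at least min_fraction of the cluster's documents."""
    counts = Counter(t for topics in cluster_docs for t in set(topics))
    n_docs = len(cluster_docs)
    kept = [t for t, c in counts.items() if c >= min_fraction * n_docs]
    # Highest document count first; ties alphabetical (assumption).
    return sorted(kept, key=lambda t: (-counts[t], t))


docs = [
    {"trade", "tariffs", "steel"},
    {"trade", "tariffs"},
    {"trade", "steel"},
    {"trade", "tariffs"},
]
for fraction in (0.5, 0.75, 1.0):
    print(fraction, topics_meeting_fraction(docs, fraction))
```

  • Raising the fraction shortens the label toward the most dominant topics, and lowering it admits topics that cover only a minority of the documents.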
  • While series of acts have been described with regard to FIGS. 6 and 7, the order of the acts may be varied in other implementations consistent with the principles of the invention. Also, non-dependent acts may be performed in parallel. [0066]
  • Further, certain portions of the invention have been described as “logic” that performs one or more functions. This logic may include hardware, such as an application specific integrated circuit or a field programmable gate array, software, or a combination of hardware and software. [0067]
  • No element, act, or instruction used in the description of the present application should be construed as critical or essential to the invention unless explicitly described as such. Also, as used herein, the article “a” is intended to include one or more items. Where only one item is intended, the term “one” or similar language is used. The scope of the invention is defined by the claims and their equivalents. [0068]

Claims (19)

What is claimed is:
1. A method of creating labels for clusters of documents, comprising:
identifying topics associated with the documents in the clusters;
determining whether the topics are associated with at least half of the documents in the clusters;
adding ones of the topics that are associated with at least half of the documents in the clusters to cluster lists; and
forming labels for the clusters from the cluster lists.
2. The method of claim 1, wherein the identifying topics includes:
using a probabilistic Hidden Markov Model to determine the topics.
3. The method of claim 1, wherein the forming labels includes:
ranking the ones of the topics, and
placing the ones of the topics in the labels in ranked order.
4. The method of claim 3, wherein the ranking the ones of the topics includes:
assigning ranks to the ones of the topics based on a number of the documents with which the ones of the topics are associated.
5. The method of claim 1, further comprising:
ranking the ones of the topics based on a number of the documents with which the ones of the topics are associated.
6. The method of claim 5, wherein when a first one of the ones of the topics, as a first topic, is associated with a majority of the documents in one of the clusters and a second one of the ones of the topics, as a second topic, is associated with less than the majority of the documents in the one of the clusters, the first topic is ranked higher than the second topic.
7. The method of claim 5, wherein the ranking the ones of the topics includes:
assigning higher ranks to first ones of the ones of the topics that are associated with larger numbers of the documents than second ones of the ones of the topics that are associated with smaller numbers of the documents.
8. The method of claim 5, wherein the forming labels includes:
sorting the cluster lists based on the rankings of the ones of the topics.
9. A system for generating a label for a cluster of documents, comprising:
means for identifying topics associated with the documents in the cluster;
means for determining whether the topics are associated with at least half of the documents in the cluster; and
means for generating a label for the cluster based on one or more of the topics that are associated with at least half of the documents in the cluster.
10. The system of claim 9, further comprising:
means for ranking the one or more of the topics based on a number of the documents with which the one or more of the topics are associated.
11. The system of claim 10, wherein the means for generating a label includes:
means for sorting the one or more of the topics based on the ranking to form the label for the cluster.
12. A system for creating a label for a cluster of documents, comprising:
logic configured to identify topics associated with the documents in the cluster;
logic configured to determine whether the topics are associated with approximately half or more of the documents in the cluster;
logic configured to rank ones of the topics that are associated with approximately half or more of the documents in the cluster; and
logic configured to generate a label for the cluster using the ones of the topics in ranked order.
13. The system of claim 12, wherein when a first one of the ones of the topics, as a first topic, is associated with a majority of the documents in the cluster and a second one of the ones of the topics, as a second topic, is associated with less than the majority of the documents in the cluster, the first topic is ranked higher than the second topic.
14. The system of claim 12, wherein the logic configured to rank ones of the topics includes:
logic configured to assign higher ranks to first ones of the ones of the topics that are associated with larger numbers of the documents than second ones of the ones of the topics that are associated with smaller numbers of the documents.
15. The system of claim 12, wherein the logic configured to generate a label includes:
logic configured to sort the ones of the topics based on the rankings of the ones of the topics.
16. A topic detection system, comprising:
a decision engine configured to:
receive a plurality of documents, and
group the documents into a plurality of clusters; and
a label engine configured to:
identify topics associated with the documents in the clusters,
determine whether the topics are associated with at least half of the documents in the clusters, and
form labels for the clusters using ones of the topics that are associated with at least half of the documents in the clusters.
17. The system of claim 16, wherein the label engine is further configured to:
rank the ones of the topics based on a number of the documents with which the ones of the topics are associated.
18. A method for creating labels for clusters of documents, comprising:
identifying topics associated with the documents in the clusters;
determining whether the topics are associated with a predetermined portion of the documents in the clusters; and
generating labels for the clusters using ones of the topics that are associated with approximately half or more of the documents in the clusters.
19. The method of claim 18, wherein the predetermined portion of the documents is equal to approximately half of the documents.
US10/685,479 2002-10-17 2003-10-16 Systems and methods for labeling clusters of documents Abandoned US20040163034A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
US41921402P 2002-10-17 2002-10-17
US10/685,479 US20040163034A1 (en) 2002-10-17 2003-10-16 Systems and methods for labeling clusters of documents

Publications (1)

Publication Number Publication Date
US20040163034A1 (en) 2004-08-19

Family

ID=32110223

Family Applications (9)

Application Number Status (Expiration) Publication Number Priority Date Filing Date Title
US10/685,585 Active 2026-01-10 US7424427B2 (en) 2002-10-17 2003-10-16 Systems and methods for classifying audio into broad phoneme classes
US10/685,403 Abandoned US20040083090A1 (en) 2002-10-17 2003-10-16 Manager for integrating language technology components
US10/685,566 Abandoned US20040176946A1 (en) 2002-10-17 2003-10-16 Pronunciation symbols based on the orthographic lexicon of a language
US10/685,410 Active 2026-01-29 US7389229B2 (en) 2002-10-17 2003-10-16 Unified clustering tree
US10/685,478 Abandoned US20040083104A1 (en) 2002-10-17 2003-10-16 Systems and methods for providing interactive speaker identification training
US10/685,445 Abandoned US20040138894A1 (en) 2002-10-17 2003-10-16 Speech transcription tool for efficient speech transcription
US10/685,586 Abandoned US20040204939A1 (en) 2002-10-17 2003-10-16 Systems and methods for speaker change detection
US10/685,479 Abandoned US20040163034A1 (en) 2002-10-17 2003-10-16 Systems and methods for labeling clusters of documents
US10/685,565 Active - Reinstated 2026-04-05 US7292977B2 (en) 2002-10-17 2003-10-16 Systems and methods for providing online fast speaker adaptation in speech recognition

Country Status (1)

Country Link
US (9) US7424427B2 (en)

Cited By (25)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060058998A1 (en) * 2004-09-16 2006-03-16 Kabushiki Kaisha Toshiba Indexing apparatus and indexing method
US20070078846A1 (en) * 2005-09-30 2007-04-05 Antonino Gulli Similarity detection and clustering of images
WO2009018223A1 (en) * 2007-07-27 2009-02-05 Sparkip, Inc. System and methods for clustering large database of documents
US20110153589A1 (en) * 2009-12-21 2011-06-23 Ganesh Vaitheeswaran Document indexing based on categorization and prioritization
US20110202530A1 (en) * 2010-02-15 2011-08-18 Sony Corporation Information processing device, method and program
WO2012033873A1 (en) * 2010-09-10 2012-03-15 Icosystem Corporation Methods and systems for online advertising with interactive text clouds
US20120174007A1 (en) * 2010-12-31 2012-07-05 Seungwon Lee Mobile terminal and method of grouping applications thereof
US20140067812A1 (en) * 2011-06-22 2014-03-06 Rogers Communications Inc. Systems and methods for ranking document clusters
US20140207783A1 (en) * 2013-01-22 2014-07-24 Equivio Ltd. System and method for computerized identification and effective presentation of semantic themes occurring in a set of electronic documents
US9002848B1 (en) * 2011-12-27 2015-04-07 Google Inc. Automatic incremental labeling of document clusters
US20150100583A1 (en) * 2013-10-08 2015-04-09 Cisco Technology, Inc. Method and apparatus for organizing multimedia content
US20150100582A1 (en) * 2013-10-08 2015-04-09 Cisco Technology, Inc. Association of topic labels with digital content
US20150205791A1 (en) * 2006-06-26 2015-07-23 Scenera Technologies, Llc Methods, Systems, And Computer Program Products For Identifying A Container Associated With A Plurality Of Files
US10083396B2 (en) 2009-07-28 2018-09-25 Fti Consulting, Inc. Computer-implemented system and method for assigning concept classification suggestions
US10111601B2 (en) 2013-09-25 2018-10-30 Bardy Diagnostics, Inc. Extended wear electrocardiography monitor optimized for capturing low amplitude cardiac action potential propagation
US10123703B2 (en) 2015-10-05 2018-11-13 Bardy Diagnostics, Inc. Health monitoring apparatus with wireless capabilities for initiating a patient treatment with the aid of a digital computer
US10154793B2 (en) 2013-09-25 2018-12-18 Bardy Diagnostics, Inc. Extended wear electrocardiography patch with wire contact surfaces
US10172534B2 (en) 2013-09-25 2019-01-08 Bardy Diagnostics, Inc. Remote interfacing electrocardiography patch
US10251576B2 (en) 2013-09-25 2019-04-09 Bardy Diagnostics, Inc. System and method for ECG data classification for use in facilitating diagnosis of cardiac rhythm disorders with the aid of a digital computer
US10251575B2 (en) 2013-09-25 2019-04-09 Bardy Diagnostics, Inc. Wearable electrocardiography and physiology monitoring ensemble
US10265015B2 (en) 2013-09-25 2019-04-23 Bardy Diagnostics, Inc. Monitor recorder optimized for electrocardiography and respiratory data acquisition and processing
US10264992B2 (en) 2013-09-25 2019-04-23 Bardy Diagnostics, Inc. Extended wear sewn electrode electrocardiography monitor
US10271755B2 (en) 2013-09-25 2019-04-30 Bardy Diagnostics, Inc. Method for constructing physiological electrode assembly with sewn wire interconnects
US10271756B2 (en) 2013-09-25 2019-04-30 Bardy Diagnostics, Inc. Monitor recorder optimized for electrocardiographic signal processing
US10278603B2 (en) 2013-09-25 2019-05-07 Bardy Diagnostics, Inc. System and method for secure physiological data acquisition and storage

Families Citing this family (77)

Publication number Priority date Publication date Assignee Title
US7763189B2 (en) * 2001-05-16 2010-07-27 E. I. Du Pont De Nemours And Company Dielectric composition with reduced resistance
US7346509B2 (en) * 2002-09-27 2008-03-18 Callminer, Inc. Software for statistical analysis of speech
WO2004090870A1 (en) * 2003-04-04 2004-10-21 Kabushiki Kaisha Toshiba Method and apparatus for encoding or decoding wide-band audio
US7567908B2 (en) * 2004-01-13 2009-07-28 International Business Machines Corporation Differential dynamic content delivery with text display in dependence upon simultaneous speech
JP2005202014A (en) * 2004-01-14 2005-07-28 Sony Corp Audio signal processor, audio signal processing method, and audio signal processing program
US8923838B1 (en) 2004-08-19 2014-12-30 Nuance Communications, Inc. System, method and computer program product for activating a cellular phone account
US7956905B2 (en) * 2005-02-28 2011-06-07 Fujifilm Corporation Titling apparatus, a titling method, and a machine readable medium storing thereon a computer program for titling
GB0511307D0 (en) * 2005-06-03 2005-07-13 South Manchester University Ho A method for generating output data
US7382933B2 (en) * 2005-08-24 2008-06-03 International Business Machines Corporation System and method for semantic video segmentation based on joint audiovisual and text analysis
WO2007023436A1 (en) 2005-08-26 2007-03-01 Koninklijke Philips Electronics N.V. System and method for synchronizing sound and manually transcribed text
US20070094023A1 (en) * 2005-10-21 2007-04-26 Callminer, Inc. Method and apparatus for processing heterogeneous units of work
US20070094270A1 (en) * 2005-10-21 2007-04-26 Callminer, Inc. Method and apparatus for the processing of heterogeneous units of work
US8756057B2 (en) * 2005-11-02 2014-06-17 Nuance Communications, Inc. System and method using feedback speech analysis for improving speaking ability
KR100755677B1 (en) * 2005-11-02 2007-09-05 삼성전자주식회사 Apparatus and method for dialogue speech recognition using topic detection
WO2007061947A2 (en) * 2005-11-18 2007-05-31 Blacklidge Emulsions, Inc. Method for bonding prepared substrates for roadways using a low-tracking asphalt emulsion coating
US20070129943A1 (en) * 2005-12-06 2007-06-07 Microsoft Corporation Speech recognition using adaptation and prior knowledge
CA2536976A1 (en) * 2006-02-20 2007-08-20 Diaphonics, Inc. Method and apparatus for detecting speaker change in a voice transaction
US20080004876A1 (en) * 2006-06-30 2008-01-03 Chuang He Non-enrolled continuous dictation
US20080051916A1 (en) * 2006-08-28 2008-02-28 Arcadyan Technology Corporation Method and apparatus for recording streamed audio
KR100826875B1 (en) * 2006-09-08 2008-05-06 한국전자통신연구원 On-line speaker recognition method and apparatus for thereof
US8073681B2 (en) * 2006-10-16 2011-12-06 Voicebox Technologies, Inc. System and method for a cooperative conversational voice user interface
US20080104066A1 (en) * 2006-10-27 2008-05-01 Yahoo! Inc. Validating segmentation criteria
US7272558B1 (en) 2006-12-01 2007-09-18 Coveo Solutions Inc. Speech recognition training method for audio and video file indexing on a search engine
US20080154579A1 (en) * 2006-12-21 2008-06-26 Krishna Kummamuru Method of analyzing conversational transcripts
US7818176B2 (en) 2007-02-06 2010-10-19 Voicebox Technologies, Inc. System and method for selecting and presenting advertisements based on natural language processing of voice-based input
US8386254B2 (en) * 2007-05-04 2013-02-26 Nuance Communications, Inc. Multi-class constrained maximum likelihood linear regression
AT457511T (en) * 2007-10-10 2010-02-15 Harman Becker Automotive Sys speaker recognition
JP4405542B2 (en) * 2007-10-24 2010-01-27 株式会社東芝 Apparatus for clustering phonemic model, method and program
US9386154B2 (en) 2007-12-21 2016-07-05 Nuance Communications, Inc. System, method and software program for enabling communications between customer service agents and users of communication devices
JPWO2009122779A1 (en) * 2008-04-03 2011-07-28 日本電気株式会社 Text data processing apparatus, method, program
WO2010019831A1 (en) 2008-08-14 2010-02-18 21Ct, Inc. Hidden markov model for speech processing with training method
CA2680304C (en) * 2008-09-25 2017-08-22 Multimodal Technologies, Inc. Decoding-time prediction of non-verbalized tokens
US8458105B2 (en) 2009-02-12 2013-06-04 Decisive Analytics Corporation Method and apparatus for analyzing and interrelating data
US8301446B2 (en) * 2009-03-30 2012-10-30 Adacel Systems, Inc. System and method for training an acoustic model with reduced feature space variation
US8412525B2 (en) * 2009-04-30 2013-04-02 Microsoft Corporation Noise robust speech classifier ensemble
US8554562B2 (en) * 2009-11-15 2013-10-08 Nuance Communications, Inc. Method and system for speaker diarization
EP2539511A4 (en) 2010-02-24 2014-07-30 Blacklidge Emulsions Inc Hot applied tack coat
US9305553B2 (en) * 2010-04-28 2016-04-05 William S. Meisel Speech recognition accuracy improvement through speaker categories
US9009040B2 (en) * 2010-05-05 2015-04-14 Cisco Technology, Inc. Training a transcription system
US8391464B1 (en) 2010-06-24 2013-03-05 Nuance Communications, Inc. Customer service system, method, and software program product for responding to queries using natural language understanding
JP2012038131A (en) * 2010-08-09 2012-02-23 Sony Corp Information processing unit, information processing method, and program
US8630854B2 (en) * 2010-08-31 2014-01-14 Fujitsu Limited System and method for generating videoconference transcriptions
US8791977B2 (en) 2010-10-05 2014-07-29 Fujitsu Limited Method and system for presenting metadata during a videoconference
CN102455997A (en) * 2010-10-27 2012-05-16 鸿富锦精密工业(深圳)有限公司 Component name extraction system and method
US20120197643A1 (en) * 2011-01-27 2012-08-02 General Motors Llc Mapping obstruent speech energy to lower frequencies
GB2489489B (en) * 2011-03-30 2013-08-21 Toshiba Res Europ Ltd A speech processing system and method
US9774747B2 (en) * 2011-04-29 2017-09-26 Nexidia Inc. Transcription system
US9313336B2 (en) 2011-07-21 2016-04-12 Nuance Communications, Inc. Systems and methods for processing audio signals captured using microphones of multiple devices
JP2013025299A (en) * 2011-07-26 2013-02-04 Toshiba Corp Transcription support system and transcription support method
JP5638479B2 (en) * 2011-07-26 2014-12-10 株式会社東芝 Transcription support system and transcription support method
JP5404726B2 (en) * 2011-09-26 2014-02-05 株式会社東芝 The information processing apparatus, information processing method and program
US8433577B2 (en) 2011-09-27 2013-04-30 Google Inc. Detection of creative works on broadcast media
US20130144414A1 (en) * 2011-12-06 2013-06-06 Cisco Technology, Inc. Method and apparatus for discovering and labeling speakers in a large and growing collection of videos with minimal user effort
JP2013161205A (en) * 2012-02-03 2013-08-19 Sony Corp Information processing device, information processing method and program
US20130266127A1 (en) 2012-04-10 2013-10-10 Raytheon Bbn Technologies Corp System and method for removing sensitive data from a recording
US20140365221A1 (en) * 2012-07-31 2014-12-11 Novospeech Ltd. Method and apparatus for speech recognition
US8676590B1 (en) 2012-09-26 2014-03-18 Google Inc. Web-based audio transcription tool
US20140136204A1 (en) * 2012-11-13 2014-05-15 GM Global Technology Operations LLC Methods and systems for speech systems
US9865266B2 (en) * 2013-02-25 2018-01-09 Nuance Communications, Inc. Method and apparatus for automated speaker parameters adaptation in a deployed speaker verification system
US9942396B2 (en) * 2013-11-01 2018-04-10 Adobe Systems Incorporated Document distribution and interaction
CN104143326B (en) * 2013-12-03 2016-11-02 腾讯科技(深圳)有限公司 A speech recognition method and apparatus command
US9544149B2 (en) 2013-12-16 2017-01-10 Adobe Systems Incorporated Automatic E-signatures in response to conditions and/or events
US9413891B2 (en) 2014-01-08 2016-08-09 Callminer, Inc. Real-time conversational analytics facility
JP6392012B2 (en) * 2014-07-14 2018-09-19 株式会社東芝 Speech synthesis dictionary creating apparatus, a speech synthesizer, speech synthesis dictionary creating method and speech synthesis dictionary creating program
US9728190B2 (en) * 2014-07-25 2017-08-08 International Business Machines Corporation Summarization of audio data
US9703982B2 (en) 2014-11-06 2017-07-11 Adobe Systems Incorporated Document distribution and interaction
US9531545B2 (en) 2014-11-24 2016-12-27 Adobe Systems Incorporated Tracking and notification of fulfillment events
US9432368B1 (en) 2015-02-19 2016-08-30 Adobe Systems Incorporated Document distribution and interaction
JP6464411B6 (en) * 2015-02-25 2019-03-13 Dynabook株式会社 Electronic devices, methods and program
US10068445B2 (en) * 2015-06-24 2018-09-04 Google Llc Systems and methods of home-specific sound event detection
US10089061B2 (en) * 2015-08-28 2018-10-02 Kabushiki Kaisha Toshiba Electronic device and method
US9935777B2 (en) 2015-08-31 2018-04-03 Adobe Systems Incorporated Electronic signature framework with enhanced security
US9626653B2 (en) 2015-09-21 2017-04-18 Adobe Systems Incorporated Document distribution and interaction with delegation of signature authority
US9754593B2 (en) * 2015-11-04 2017-09-05 International Business Machines Corporation Sound envelope deconstruction to identify words and speakers in continuous speech
US10255905B2 (en) * 2016-06-10 2019-04-09 Google Llc Predicting pronunciations with word stress
US10217453B2 (en) * 2016-10-14 2019-02-26 Soundhound, Inc. Virtual assistant configured by selection of wake-up phrase
US20180336892A1 (en) * 2017-05-16 2018-11-22 Apple Inc. Detecting a trigger of a digital assistant

Citations (96)

Publication number Priority date Publication date Assignee Title
US4879648A (en) * 1986-09-19 1989-11-07 Nancy P. Cochran Search system which continuously displays search terms during scrolling and selections of individually displayed data sets
US4908866A (en) * 1985-02-04 1990-03-13 Eric Goldwasser Speech transcribing system
US5317732A (en) * 1991-04-26 1994-05-31 Commodore Electronics Limited System for relocating a multimedia presentation on a different platform by extracting a resource map in order to remap and relocate resources
US5404295A (en) * 1990-08-16 1995-04-04 Katz; Boris Method and apparatus for utilizing annotations to facilitate computer retrieval of database material
US5418716A (en) * 1990-07-26 1995-05-23 Nec Corporation System for recognizing sentence patterns and a system for recognizing sentence patterns and grammatical cases
US5544257A (en) * 1992-01-08 1996-08-06 International Business Machines Corporation Continuous parameter hidden Markov model approach to automatic handwriting recognition
US5559875A (en) * 1995-07-31 1996-09-24 Latitude Communications Method and apparatus for recording and retrieval of audio conferences
US5572728A (en) * 1993-12-24 1996-11-05 Hitachi, Ltd. Conference multimedia summary support system and method
US5613032A (en) * 1994-09-02 1997-03-18 Bell Communications Research, Inc. System and method for recording, playing back and searching multimedia events wherein video, audio and text can be searched and retrieved
US5614940A (en) * 1994-10-21 1997-03-25 Intel Corporation Method and apparatus for providing broadcast information with indexing
US5684924A (en) * 1995-05-19 1997-11-04 Kurzweil Applied Intelligence, Inc. User adaptable speech recognition system
US5715367A (en) * 1995-01-23 1998-02-03 Dragon Systems, Inc. Apparatuses and methods for developing and using models for speech recognition
US5752021A (en) * 1994-05-24 1998-05-12 Fuji Xerox Co., Ltd. Document database management apparatus capable of conversion between retrieval formulae for different schemata
US5757960A (en) * 1994-09-30 1998-05-26 Murdock; Michael Chase Method and system for extracting features from handwritten text
US5768607A (en) * 1994-09-30 1998-06-16 Intel Corporation Method and apparatus for freehand annotation and drawings incorporating sound and for compressing and synchronizing sound
US5777614A (en) * 1994-10-14 1998-07-07 Hitachi, Ltd. Editing support system including an interactive interface
US5787198A (en) * 1992-11-24 1998-07-28 Lucent Technologies Inc. Text recognition using two-dimensional stochastic models
US5806032A (en) * 1996-06-14 1998-09-08 Lucent Technologies Inc. Compilation of weighted finite-state transducers from decision trees
US5835667A (en) * 1994-10-14 1998-11-10 Carnegie Mellon University Method and apparatus for creating a searchable digital video library and a system and method of using such a library
US5862259A (en) * 1996-03-27 1999-01-19 Caere Corporation Pattern recognition employing arbitrary segmentation and compound probabilistic evaluation
US5875108A (en) * 1991-12-23 1999-02-23 Hoffberg; Steven M. Ergonomic man-machine interface incorporating adaptive pattern recognition based control system
US5960447A (en) * 1995-11-13 1999-09-28 Holt; Douglas Word tagging and editing system for speech recognition
US5963940A (en) * 1995-08-16 1999-10-05 Syracuse University Natural language information retrieval system and method
US5970473A (en) * 1997-12-31 1999-10-19 At&T Corp. Video communication device providing in-home catalog services
US6006184A (en) * 1997-01-28 1999-12-21 Nec Corporation Tree structured cohort selection for speaker recognition system
US6024571A (en) * 1996-04-25 2000-02-15 Renegar; Janet Elaine Foreign language communication system/device and learning aid
US6026388A (en) * 1995-08-16 2000-02-15 Textwise, Llc User interface and other enhancements for natural language information retrieval system and method
US6029195A (en) * 1994-11-29 2000-02-22 Herz; Frederick S. M. System for customized electronic identification of desirable objects
US6029124A (en) * 1997-02-21 2000-02-22 Dragon Systems, Inc. Sequential, nonparametric speech recognition and speaker identification
US6052657A (en) * 1997-09-09 2000-04-18 Dragon Systems, Inc. Text segmentation and identification of topic using language models
US6064963A (en) * 1997-12-17 2000-05-16 Opus Telecom, L.L.C. Automatic key word or phrase speech recognition for the corrections industry
US6067517A (en) * 1996-02-02 2000-05-23 International Business Machines Corporation Transcription of speech data with segments from acoustically dissimilar environments
US6067514A (en) * 1998-06-23 2000-05-23 International Business Machines Corporation Method for automatically punctuating a speech utterance in a continuous speech recognition system
US6073096A (en) * 1998-02-04 2000-06-06 International Business Machines Corporation Speaker adaptation system and method based on class-specific pre-clustering training speakers
US6076053A (en) * 1998-05-21 2000-06-13 Lucent Technologies Inc. Methods and apparatus for discriminative training and adaptation of pronunciation networks
US6088669A (en) * 1997-01-28 2000-07-11 International Business Machines, Corporation Speech recognition with attempted speaker recognition for speaker model prefetching or alternative speech modeling
US6112172A (en) * 1998-03-31 2000-08-29 Dragon Systems, Inc. Interactive searching
US6151598A (en) * 1995-08-14 2000-11-21 Shaw; Venson M. Digital dictionary with a communication system for the creating, updating, editing, storing, maintaining, referencing, and managing the digital dictionary
US6169789B1 (en) * 1996-12-16 2001-01-02 Sanjay K. Rao Intelligent keyboard system
US6185531B1 (en) * 1997-01-09 2001-02-06 Gte Internetworking Incorporated Topic indexing method
US6219640B1 (en) * 1999-08-06 2001-04-17 International Business Machines Corporation Methods and apparatus for audio-visual speaker recognition and utterance verification
US6246983B1 (en) * 1998-08-05 2001-06-12 Matsushita Electric Corporation Of America Text-to-speech e-mail reader with multi-modal reply processor
US6253179B1 (en) * 1999-01-29 2001-06-26 International Business Machines Corporation Method and apparatus for multi-environment speaker verification
US6266667B1 (en) * 1998-01-15 2001-07-24 Telefonaktiebolaget Lm Ericsson (Publ) Information routing
US20010026377A1 (en) * 2000-03-21 2001-10-04 Katsumi Ikegami Image display system, image registration terminal device and image reading terminal device used in the image display system
US6308222B1 (en) * 1996-06-03 2001-10-23 Microsoft Corporation Transcoding of audio data
US6317716B1 (en) * 1997-09-19 2001-11-13 Massachusetts Institute Of Technology Automatic cueing of speech
US20020001261A1 (en) * 2000-04-21 2002-01-03 Yoshinori Matsui Data playback apparatus
US20020010575A1 (en) * 2000-04-08 2002-01-24 International Business Machines Corporation Method and system for the automatic segmentation of an audio stream into semantic or syntactic units
US20020010916A1 (en) * 2000-05-22 2002-01-24 Compaq Computer Corporation Apparatus and method for controlling rate of playback of audio data
US6345252B1 (en) * 1999-04-09 2002-02-05 International Business Machines Corporation Methods and apparatus for retrieving audio information using content and speaker information
US6347295B1 (en) * 1998-10-26 2002-02-12 Compaq Computer Corporation Computer method and apparatus for grapheme-to-phoneme rule-set-generation
US6360237B1 (en) * 1998-10-05 2002-03-19 Lernout & Hauspie Speech Products N.V. Method and system for performing text edits during audio recording playback
US6360234B2 (en) * 1997-08-14 2002-03-19 Virage, Inc. Video cataloger system with synchronized encoders
US6373985B1 (en) * 1998-08-12 2002-04-16 Lucent Technologies, Inc. E-mail signature block analysis
US20020049589A1 (en) * 2000-06-28 2002-04-25 Poirier Darrell A. Simultaneous multi-user real-time voice recognition system
US6381640B1 (en) * 1998-09-11 2002-04-30 Genesys Telecommunications Laboratories, Inc. Method and apparatus for automated personalization and presentation of workload assignments to agents within a multimedia communication center
US20020059204A1 (en) * 2000-07-28 2002-05-16 Harris Larry R. Distributed search system and method
US6434520B1 (en) * 1999-04-16 2002-08-13 International Business Machines Corporation System and method for indexing and querying audio archives
US6437818B1 (en) * 1993-10-01 2002-08-20 Collaboration Properties, Inc. Video conferencing on existing UTP infrastructure
US20020133477A1 (en) * 2001-03-05 2002-09-19 Glenn Abel Method for profile-based notice and broadcast of multimedia content
US6463444B1 (en) * 1997-08-14 2002-10-08 Virage, Inc. Video cataloger system with extensibility
US6480826B2 (en) * 1999-08-31 2002-11-12 Accenture Llp System and method for a telephonic emotion detection that provides operator feedback
US20030051214A1 (en) * 1997-12-22 2003-03-13 Ricoh Company, Ltd. Techniques for annotating portions of a document relevant to concepts of interest
US20030088414A1 (en) * 2001-05-10 2003-05-08 Chao-Shih Huang Background learning of speaker voices
US20030093580A1 (en) * 2001-11-09 2003-05-15 Koninklijke Philips Electronics N.V. Method and system for information alerts
US6567980B1 (en) * 1997-08-14 2003-05-20 Virage, Inc. Video cataloger system with hyperlinked output
US6571208B1 (en) * 1999-11-29 2003-05-27 Matsushita Electric Industrial Co., Ltd. Context-dependent acoustic models for medium and large vocabulary speech recognition with eigenvoice training
US6602300B2 (en) * 1998-02-03 2003-08-05 Fujitsu Limited Apparatus and method for retrieving data from a document database
US6604110B1 (en) * 2000-08-31 2003-08-05 Ascential Software, Inc. Automated software code generation from a metadata-based repository
US6611803B1 (en) * 1998-12-17 2003-08-26 Matsushita Electric Industrial Co., Ltd. Method and apparatus for retrieving a video and audio scene using an index generated by speech recognition
US20030167163A1 (en) * 2002-02-22 2003-09-04 Nec Research Institute, Inc. Inferring hierarchical descriptions of a set of documents
US6624826B1 (en) * 1999-09-28 2003-09-23 Ricoh Co., Ltd. Method and apparatus for generating visual representations for audio documents
US6647383B1 (en) * 2000-09-01 2003-11-11 Lucent Technologies Inc. System and method for providing interactive dialogue and iterative search functions to find information
US6654735B1 (en) * 1999-01-08 2003-11-25 International Business Machines Corporation Outbound information analysis for generating user interest profiles and improving user productivity
US20040024739A1 (en) * 1999-06-15 2004-02-05 Kanisa Inc. System and method for implementing a knowledge management system
US6708148B2 (en) * 2001-10-12 2004-03-16 Koninklijke Philips Electronics N.V. Correction device to mark parts of a recognized text
US6711541B1 (en) * 1999-09-07 2004-03-23 Matsushita Electric Industrial Co., Ltd. Technique for developing discriminative sound units for speech recognition and allophone modeling
US6714911B2 (en) * 2001-01-25 2004-03-30 Harcourt Assessment, Inc. Speech transcription and analysis system and method
US6718305B1 (en) * 1999-03-19 2004-04-06 Koninklijke Philips Electronics N.V. Specifying a tree structure for speech recognizers using correlation between regression classes
US6718303B2 (en) * 1998-05-13 2004-04-06 International Business Machines Corporation Apparatus and method for automatically generating punctuation marks in continuous speech recognition
US20040073444A1 (en) * 2001-01-16 2004-04-15 Li Li Peh Method and apparatus for a financial database structure
US6732183B1 (en) * 1996-12-31 2004-05-04 Broadware Technologies, Inc. Video and audio streaming for multiple users
US6748356B1 (en) * 2000-06-07 2004-06-08 International Business Machines Corporation Methods and apparatus for identifying unknown speakers using a hierarchical tree structure
US6778979B2 (en) * 2001-08-13 2004-08-17 Xerox Corporation System for automatically generating queries
US6778958B1 (en) * 1999-08-30 2004-08-17 International Business Machines Corporation Symbol insertion apparatus and method
US6792409B2 (en) * 1999-12-20 2004-09-14 Koninklijke Philips Electronics N.V. Synchronous reproduction in a speech recognition system
US6847961B2 (en) * 1999-06-30 2005-01-25 Silverbrook Research Pty Ltd Method and system for searching information using sensor with identifier
US20050060162A1 (en) * 2000-11-10 2005-03-17 Farhad Mohit Systems and methods for automatic identification and hyperlinking of words or other data items and for information retrieval using hyperlinked words or data items
US6922691B2 (en) * 2000-08-28 2005-07-26 Emotion, Inc. Method and apparatus for digital media management, retrieval, and collaboration
US6931376B2 (en) * 2000-07-20 2005-08-16 Microsoft Corporation Speech-related event notification system
US6961954B1 (en) * 1997-10-27 2005-11-01 The Mitre Corporation Automated segmentation, information extraction, summarization, and presentation of broadcast news
US6999918B2 (en) * 2002-09-20 2006-02-14 Motorola, Inc. Method and apparatus to facilitate correlating symbols to sounds
US20060129541A1 (en) * 2002-06-11 2006-06-15 Microsoft Corporation Dynamically updated quick searches and strategies
US7131117B2 (en) * 2002-09-04 2006-10-31 Sbc Properties, L.P. Method and system for automating the analysis of word frequencies
US7257528B1 (en) * 1998-02-13 2007-08-14 Zi Corporation Of Canada, Inc. Method and apparatus for Chinese character text input

Family Cites Families (30)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH0693221B2 (en) 1985-06-12 1994-11-16 Hitachi, Ltd. Voice input device
US4908868A (en) * 1989-02-21 1990-03-13 Mctaggart James E Phase polarity test instrument and method
US6978277B2 (en) * 1989-10-26 2005-12-20 Encyclopaedia Britannica, Inc. Multimedia search system
JP2524472B2 (en) * 1992-09-21 1996-08-14 International Business Machines Corporation Method for training a speech recognition system for use over telephone lines
GB2285895A (en) 1994-01-19 1995-07-26 Ibm Audio conferencing system which generates a set of minutes
US5729656A (en) 1994-11-30 1998-03-17 International Business Machines Corporation Reduction of search space in speech recognition using phone boundaries and phone ranking
US5638487A (en) * 1994-12-30 1997-06-10 Purespeech, Inc. Automatic speech recognition
AU6849196A (en) * 1995-08-16 1997-03-19 Syracuse University Multilingual document retrieval system and method using semantic vector matching
US20020002562A1 (en) * 1995-11-03 2002-01-03 Thomas P. Moran Computer controlled display system using a graphical replay device to control playback of temporal data representing collaborative activities
JPH09269931A (en) * 1996-01-30 1997-10-14 Canon Inc Cooperative work environment constructing system, its method and medium
WO1997032299A1 (en) * 1996-02-27 1997-09-04 Philips Electronics N.V. Method and apparatus for automatic speech segmentation into phoneme-like units
US5778187A (en) * 1996-05-09 1998-07-07 Netcast Communications Corp. Multicasting method and apparatus
US5897614A (en) * 1996-12-20 1999-04-27 International Business Machines Corporation Method and apparatus for sibilant classification in a speech recognition system
WO1999017235A1 (en) 1997-10-01 1999-04-08 At & T Corp. Method and apparatus for storing and retrieving labeled interval data for multimedia recordings
US6327343B1 (en) 1998-01-16 2001-12-04 International Business Machines Corporation System and methods for automatic call and data transfer processing
US6243680B1 (en) * 1998-06-15 2001-06-05 Nortel Networks Limited Method and apparatus for obtaining a transcription of phrases through text and spoken utterances
US6161087A (en) * 1998-10-05 2000-12-12 Lernout & Hauspie Speech Products N.V. Speech-recognition-assisted selective suppression of silent and filled speech pauses during playback of an audio recording
US6332139B1 (en) 1998-11-09 2001-12-18 Mega Chips Corporation Information communication system
WO2000059223A1 (en) 1999-03-30 2000-10-05 Tivo, Inc. Data storage management and scheduling system
IES990800A2 (en) 1999-08-20 2000-09-06 Digitake Software Systems Ltd An audio processing system
WO2001063597A1 (en) * 2000-02-25 2001-08-30 Koninklijke Philips Electronics N.V. Speech recognition device with reference transformation means
JP2002008389A (en) * 2000-06-20 2002-01-11 Mitsubishi Electric Corp Semiconductor memory
WO2002010887A2 (en) 2000-07-28 2002-02-07 Jan Pathuel Method and system of securing data and systems
AU7639400A (en) 2000-09-30 2002-04-15 Intel Corp Method and system for generating and searching an optimal maximum likelihood decision tree for hidden markov model (hmm) based speech recognition
US7472064B1 (en) 2000-09-30 2008-12-30 Intel Corporation Method and system to scale down a decision tree-based hidden markov model (HMM) for speech recognition
US6934756B2 (en) * 2000-11-01 2005-08-23 International Business Machines Corporation Conversational networking via transport, coding and control conversational protocols
US6973428B2 (en) * 2001-05-24 2005-12-06 International Business Machines Corporation System and method for searching, analyzing and displaying text transcripts of speech after imperfect speech recognition
US6748350B2 (en) * 2001-09-27 2004-06-08 Intel Corporation Method to compensate for stress between heat spreader and thermal interface material
US7221663B2 (en) 2001-12-31 2007-05-22 Polycom, Inc. Method and apparatus for wideband conferencing
US7580838B2 (en) 2002-11-22 2009-08-25 Scansoft, Inc. Automatic insertion of non-verbalized punctuation

Patent Citations (99)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4908866A (en) * 1985-02-04 1990-03-13 Eric Goldwasser Speech transcribing system
US4879648A (en) * 1986-09-19 1989-11-07 Nancy P. Cochran Search system which continuously displays search terms during scrolling and selections of individually displayed data sets
US5418716A (en) * 1990-07-26 1995-05-23 Nec Corporation System for recognizing sentence patterns and a system for recognizing sentence patterns and grammatical cases
US5404295A (en) * 1990-08-16 1995-04-04 Katz; Boris Method and apparatus for utilizing annotations to facilitate computer retrieval of database material
US5317732A (en) * 1991-04-26 1994-05-31 Commodore Electronics Limited System for relocating a multimedia presentation on a different platform by extracting a resource map in order to remap and relocate resources
US5875108A (en) * 1991-12-23 1999-02-23 Hoffberg; Steven M. Ergonomic man-machine interface incorporating adaptive pattern recognition based control system
US5544257A (en) * 1992-01-08 1996-08-06 International Business Machines Corporation Continuous parameter hidden Markov model approach to automatic handwriting recognition
US5787198A (en) * 1992-11-24 1998-07-28 Lucent Technologies Inc. Text recognition using two-dimensional stochastic models
US6437818B1 (en) * 1993-10-01 2002-08-20 Collaboration Properties, Inc. Video conferencing on existing UTP infrastructure
US5572728A (en) * 1993-12-24 1996-11-05 Hitachi, Ltd. Conference multimedia summary support system and method
US5752021A (en) * 1994-05-24 1998-05-12 Fuji Xerox Co., Ltd. Document database management apparatus capable of conversion between retrieval formulae for different schemata
US5613032A (en) * 1994-09-02 1997-03-18 Bell Communications Research, Inc. System and method for recording, playing back and searching multimedia events wherein video, audio and text can be searched and retrieved
US5757960A (en) * 1994-09-30 1998-05-26 Murdock; Michael Chase Method and system for extracting features from handwritten text
US5768607A (en) * 1994-09-30 1998-06-16 Intel Corporation Method and apparatus for freehand annotation and drawings incorporating sound and for compressing and synchronizing sound
US5835667A (en) * 1994-10-14 1998-11-10 Carnegie Mellon University Method and apparatus for creating a searchable digital video library and a system and method of using such a library
US5777614A (en) * 1994-10-14 1998-07-07 Hitachi, Ltd. Editing support system including an interactive interface
US5614940A (en) * 1994-10-21 1997-03-25 Intel Corporation Method and apparatus for providing broadcast information with indexing
US6029195A (en) * 1994-11-29 2000-02-22 Herz; Frederick S. M. System for customized electronic identification of desirable objects
US5715367A (en) * 1995-01-23 1998-02-03 Dragon Systems, Inc. Apparatuses and methods for developing and using models for speech recognition
US5684924A (en) * 1995-05-19 1997-11-04 Kurzweil Applied Intelligence, Inc. User adaptable speech recognition system
US5559875A (en) * 1995-07-31 1996-09-24 Latitude Communications Method and apparatus for recording and retrieval of audio conferences
US6151598A (en) * 1995-08-14 2000-11-21 Shaw; Venson M. Digital dictionary with a communication system for the creating, updating, editing, storing, maintaining, referencing, and managing the digital dictionary
US5963940A (en) * 1995-08-16 1999-10-05 Syracuse University Natural language information retrieval system and method
US6026388A (en) * 1995-08-16 2000-02-15 Textwise, Llc User interface and other enhancements for natural language information retrieval system and method
US5960447A (en) * 1995-11-13 1999-09-28 Holt; Douglas Word tagging and editing system for speech recognition
US6067517A (en) * 1996-02-02 2000-05-23 International Business Machines Corporation Transcription of speech data with segments from acoustically dissimilar environments
US5862259A (en) * 1996-03-27 1999-01-19 Caere Corporation Pattern recognition employing arbitrary segmentation and compound probabilistic evaluation
US6024571A (en) * 1996-04-25 2000-02-15 Renegar; Janet Elaine Foreign language communication system/device and learning aid
US6308222B1 (en) * 1996-06-03 2001-10-23 Microsoft Corporation Transcoding of audio data
US5806032A (en) * 1996-06-14 1998-09-08 Lucent Technologies Inc. Compilation of weighted finite-state transducers from decision trees
US6169789B1 (en) * 1996-12-16 2001-01-02 Sanjay K. Rao Intelligent keyboard system
US6732183B1 (en) * 1996-12-31 2004-05-04 Broadware Technologies, Inc. Video and audio streaming for multiple users
US6185531B1 (en) * 1997-01-09 2001-02-06 Gte Internetworking Incorporated Topic indexing method
US6088669A (en) * 1997-01-28 2000-07-11 International Business Machines, Corporation Speech recognition with attempted speaker recognition for speaker model prefetching or alternative speech modeling
US6006184A (en) * 1997-01-28 1999-12-21 Nec Corporation Tree structured cohort selection for speaker recognition system
US6029124A (en) * 1997-02-21 2000-02-22 Dragon Systems, Inc. Sequential, nonparametric speech recognition and speaker identification
US6567980B1 (en) * 1997-08-14 2003-05-20 Virage, Inc. Video cataloger system with hyperlinked output
US6360234B2 (en) * 1997-08-14 2002-03-19 Virage, Inc. Video cataloger system with synchronized encoders
US6877134B1 (en) * 1997-08-14 2005-04-05 Virage, Inc. Integrated data and real-time metadata capture system and method
US6463444B1 (en) * 1997-08-14 2002-10-08 Virage, Inc. Video cataloger system with extensibility
US6052657A (en) * 1997-09-09 2000-04-18 Dragon Systems, Inc. Text segmentation and identification of topic using language models
US6317716B1 (en) * 1997-09-19 2001-11-13 Massachusetts Institute Of Technology Automatic cueing of speech
US6961954B1 (en) * 1997-10-27 2005-11-01 The Mitre Corporation Automated segmentation, information extraction, summarization, and presentation of broadcast news
US6064963A (en) * 1997-12-17 2000-05-16 Opus Telecom, L.L.C. Automatic key word or phrase speech recognition for the corrections industry
US20030051214A1 (en) * 1997-12-22 2003-03-13 Ricoh Company, Ltd. Techniques for annotating portions of a document relevant to concepts of interest
US5970473A (en) * 1997-12-31 1999-10-19 At&T Corp. Video communication device providing in-home catalog services
US6266667B1 (en) * 1998-01-15 2001-07-24 Telefonaktiebolaget Lm Ericsson (Publ) Information routing
US6602300B2 (en) * 1998-02-03 2003-08-05 Fujitsu Limited Apparatus and method for retrieving data from a document database
US6073096A (en) * 1998-02-04 2000-06-06 International Business Machines Corporation Speaker adaptation system and method based on class-specific pre-clustering training speakers
US7257528B1 (en) * 1998-02-13 2007-08-14 Zi Corporation Of Canada, Inc. Method and apparatus for Chinese character text input
US6112172A (en) * 1998-03-31 2000-08-29 Dragon Systems, Inc. Interactive searching
US6718303B2 (en) * 1998-05-13 2004-04-06 International Business Machines Corporation Apparatus and method for automatically generating punctuation marks in continuous speech recognition
US6076053A (en) * 1998-05-21 2000-06-13 Lucent Technologies Inc. Methods and apparatus for discriminative training and adaptation of pronunciation networks
US6067514A (en) * 1998-06-23 2000-05-23 International Business Machines Corporation Method for automatically punctuating a speech utterance in a continuous speech recognition system
US6246983B1 (en) * 1998-08-05 2001-06-12 Matsushita Electric Corporation Of America Text-to-speech e-mail reader with multi-modal reply processor
US6373985B1 (en) * 1998-08-12 2002-04-16 Lucent Technologies, Inc. E-mail signature block analysis
US6381640B1 (en) * 1998-09-11 2002-04-30 Genesys Telecommunications Laboratories, Inc. Method and apparatus for automated personalization and presentation of workload assignments to agents within a multimedia communication center
US6360237B1 (en) * 1998-10-05 2002-03-19 Lernout & Hauspie Speech Products N.V. Method and system for performing text edits during audio recording playback
US6347295B1 (en) * 1998-10-26 2002-02-12 Compaq Computer Corporation Computer method and apparatus for grapheme-to-phoneme rule-set-generation
US6728673B2 (en) * 1998-12-17 2004-04-27 Matsushita Electric Industrial Co., Ltd Method and apparatus for retrieving a video and audio scene using an index generated by speech recognition
US6611803B1 (en) * 1998-12-17 2003-08-26 Matsushita Electric Industrial Co., Ltd. Method and apparatus for retrieving a video and audio scene using an index generated by speech recognition
US6654735B1 (en) * 1999-01-08 2003-11-25 International Business Machines Corporation Outbound information analysis for generating user interest profiles and improving user productivity
US6253179B1 (en) * 1999-01-29 2001-06-26 International Business Machines Corporation Method and apparatus for multi-environment speaker verification
US6718305B1 (en) * 1999-03-19 2004-04-06 Koninklijke Philips Electronics N.V. Specifying a tree structure for speech recognizers using correlation between regression classes
US6345252B1 (en) * 1999-04-09 2002-02-05 International Business Machines Corporation Methods and apparatus for retrieving audio information using content and speaker information
US6434520B1 (en) * 1999-04-16 2002-08-13 International Business Machines Corporation System and method for indexing and querying audio archives
US20040024739A1 (en) * 1999-06-15 2004-02-05 Kanisa Inc. System and method for implementing a knowledge management system
US6847961B2 (en) * 1999-06-30 2005-01-25 Silverbrook Research Pty Ltd Method and system for searching information using sensor with identifier
US6219640B1 (en) * 1999-08-06 2001-04-17 International Business Machines Corporation Methods and apparatus for audio-visual speaker recognition and utterance verification
US6778958B1 (en) * 1999-08-30 2004-08-17 International Business Machines Corporation Symbol insertion apparatus and method
US6480826B2 (en) * 1999-08-31 2002-11-12 Accenture Llp System and method for a telephonic emotion detection that provides operator feedback
US6711541B1 (en) * 1999-09-07 2004-03-23 Matsushita Electric Industrial Co., Ltd. Technique for developing discriminative sound units for speech recognition and allophone modeling
US6624826B1 (en) * 1999-09-28 2003-09-23 Ricoh Co., Ltd. Method and apparatus for generating visual representations for audio documents
US6571208B1 (en) * 1999-11-29 2003-05-27 Matsushita Electric Industrial Co., Ltd. Context-dependent acoustic models for medium and large vocabulary speech recognition with eigenvoice training
US6792409B2 (en) * 1999-12-20 2004-09-14 Koninklijke Philips Electronics N.V. Synchronous reproduction in a speech recognition system
US20010026377A1 (en) * 2000-03-21 2001-10-04 Katsumi Ikegami Image display system, image registration terminal device and image reading terminal device used in the image display system
US20020010575A1 (en) * 2000-04-08 2002-01-24 International Business Machines Corporation Method and system for the automatic segmentation of an audio stream into semantic or syntactic units
US20020001261A1 (en) * 2000-04-21 2002-01-03 Yoshinori Matsui Data playback apparatus
US20020010916A1 (en) * 2000-05-22 2002-01-24 Compaq Computer Corporation Apparatus and method for controlling rate of playback of audio data
US6748356B1 (en) * 2000-06-07 2004-06-08 International Business Machines Corporation Methods and apparatus for identifying unknown speakers using a hierarchical tree structure
US20020049589A1 (en) * 2000-06-28 2002-04-25 Poirier Darrell A. Simultaneous multi-user real-time voice recognition system
US6931376B2 (en) * 2000-07-20 2005-08-16 Microsoft Corporation Speech-related event notification system
US20020059204A1 (en) * 2000-07-28 2002-05-16 Harris Larry R. Distributed search system and method
US6922691B2 (en) * 2000-08-28 2005-07-26 Emotion, Inc. Method and apparatus for digital media management, retrieval, and collaboration
US6604110B1 (en) * 2000-08-31 2003-08-05 Ascential Software, Inc. Automated software code generation from a metadata-based repository
US6647383B1 (en) * 2000-09-01 2003-11-11 Lucent Technologies Inc. System and method for providing interactive dialogue and iterative search functions to find information
US20050060162A1 (en) * 2000-11-10 2005-03-17 Farhad Mohit Systems and methods for automatic identification and hyperlinking of words or other data items and for information retrieval using hyperlinked words or data items
US20040073444A1 (en) * 2001-01-16 2004-04-15 Li Li Peh Method and apparatus for a financial database structure
US6714911B2 (en) * 2001-01-25 2004-03-30 Harcourt Assessment, Inc. Speech transcription and analysis system and method
US20020133477A1 (en) * 2001-03-05 2002-09-19 Glenn Abel Method for profile-based notice and broadcast of multimedia content
US20030088414A1 (en) * 2001-05-10 2003-05-08 Chao-Shih Huang Background learning of speaker voices
US7171360B2 (en) * 2001-05-10 2007-01-30 Koninklijke Philips Electronics N.V. Background learning of speaker voices
US6778979B2 (en) * 2001-08-13 2004-08-17 Xerox Corporation System for automatically generating queries
US6708148B2 (en) * 2001-10-12 2004-03-16 Koninklijke Philips Electronics N.V. Correction device to mark parts of a recognized text
US20030093580A1 (en) * 2001-11-09 2003-05-15 Koninklijke Philips Electronics N.V. Method and system for information alerts
US20030167163A1 (en) * 2002-02-22 2003-09-04 Nec Research Institute, Inc. Inferring hierarchical descriptions of a set of documents
US20060129541A1 (en) * 2002-06-11 2006-06-15 Microsoft Corporation Dynamically updated quick searches and strategies
US7131117B2 (en) * 2002-09-04 2006-10-31 Sbc Properties, L.P. Method and system for automating the analysis of word frequencies
US6999918B2 (en) * 2002-09-20 2006-02-14 Motorola, Inc. Method and apparatus to facilitate correlating symbols to sounds

Cited By (32)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060058998A1 (en) * 2004-09-16 2006-03-16 Kabushiki Kaisha Toshiba Indexing apparatus and indexing method
US7801893B2 (en) * 2005-09-30 2010-09-21 Iac Search & Media, Inc. Similarity detection and clustering of images
US20070078846A1 (en) * 2005-09-30 2007-04-05 Antonino Gulli Similarity detection and clustering of images
US20150205791A1 (en) * 2006-06-26 2015-07-23 Scenera Technologies, Llc Methods, Systems, And Computer Program Products For Identifying A Container Associated With A Plurality Of Files
WO2009018223A1 (en) * 2007-07-27 2009-02-05 Sparkip, Inc. System and methods for clustering large database of documents
US20090043797A1 (en) * 2007-07-27 2009-02-12 Sparkip, Inc. System And Methods For Clustering Large Database of Documents
US10083396B2 (en) 2009-07-28 2018-09-25 Fti Consulting, Inc. Computer-implemented system and method for assigning concept classification suggestions
US20110153589A1 (en) * 2009-12-21 2011-06-23 Ganesh Vaitheeswaran Document indexing based on categorization and prioritization
US8983958B2 (en) * 2009-12-21 2015-03-17 Business Objects Software Limited Document indexing based on categorization and prioritization
US8812503B2 (en) * 2010-02-15 2014-08-19 Sony Corporation Information processing device, method and program
US20110202530A1 (en) * 2010-02-15 2011-08-18 Sony Corporation Information processing device, method and program
WO2012033873A1 (en) * 2010-09-10 2012-03-15 Icosystem Corporation Methods and systems for online advertising with interactive text clouds
US20120174007A1 (en) * 2010-12-31 2012-07-05 Seungwon Lee Mobile terminal and method of grouping applications thereof
US20140067812A1 (en) * 2011-06-22 2014-03-06 Rogers Communications Inc. Systems and methods for ranking document clusters
US9002848B1 (en) * 2011-12-27 2015-04-07 Google Inc. Automatic incremental labeling of document clusters
US20140207783A1 (en) * 2013-01-22 2014-07-24 Equivio Ltd. System and method for computerized identification and effective presentation of semantic themes occurring in a set of electronic documents
US10002182B2 (en) * 2013-01-22 2018-06-19 Microsoft Israel Research And Development (2002) Ltd System and method for computerized identification and effective presentation of semantic themes occurring in a set of electronic documents
US10278603B2 (en) 2013-09-25 2019-05-07 Bardy Diagnostics, Inc. System and method for secure physiological data acquisition and storage
US10278606B2 (en) 2013-09-25 2019-05-07 Bardy Diagnostics, Inc. Ambulatory electrocardiography monitor optimized for capturing low amplitude cardiac action potential propagation
US10271756B2 (en) 2013-09-25 2019-04-30 Bardy Diagnostics, Inc. Monitor recorder optimized for electrocardiographic signal processing
US10111601B2 (en) 2013-09-25 2018-10-30 Bardy Diagnostics, Inc. Extended wear electrocardiography monitor optimized for capturing low amplitude cardiac action potential propagation
US10271755B2 (en) 2013-09-25 2019-04-30 Bardy Diagnostics, Inc. Method for constructing physiological electrode assembly with sewn wire interconnects
US10154793B2 (en) 2013-09-25 2018-12-18 Bardy Diagnostics, Inc. Extended wear electrocardiography patch with wire contact surfaces
US10172534B2 (en) 2013-09-25 2019-01-08 Bardy Diagnostics, Inc. Remote interfacing electrocardiography patch
US10251576B2 (en) 2013-09-25 2019-04-09 Bardy Diagnostics, Inc. System and method for ECG data classification for use in facilitating diagnosis of cardiac rhythm disorders with the aid of a digital computer
US10251575B2 (en) 2013-09-25 2019-04-09 Bardy Diagnostics, Inc. Wearable electrocardiography and physiology monitoring ensemble
US10265015B2 (en) 2013-09-25 2019-04-23 Bardy Diagnostics, Inc. Monitor recorder optimized for electrocardiography and respiratory data acquisition and processing
US10264992B2 (en) 2013-09-25 2019-04-23 Bardy Diagnostics, Inc. Extended wear sewn electrode electrocardiography monitor
US20150100583A1 (en) * 2013-10-08 2015-04-09 Cisco Technology, Inc. Method and apparatus for organizing multimedia content
US20150100582A1 (en) * 2013-10-08 2015-04-09 Cisco Technology, Inc. Association of topic labels with digital content
US9495439B2 (en) * 2013-10-08 2016-11-15 Cisco Technology, Inc. Organizing multimedia content
US10123703B2 (en) 2015-10-05 2018-11-13 Bardy Diagnostics, Inc. Health monitoring apparatus with wireless capabilities for initiating a patient treatment with the aid of a digital computer

Also Published As

Publication number Publication date
US20040204939A1 (en) 2004-10-14
US7424427B2 (en) 2008-09-09
US7292977B2 (en) 2007-11-06
US7389229B2 (en) 2008-06-17
US20040083090A1 (en) 2004-04-29
US20040172250A1 (en) 2004-09-02
US20040083104A1 (en) 2004-04-29
US20040230432A1 (en) 2004-11-18
US20040176946A1 (en) 2004-09-09
US20040138894A1 (en) 2004-07-15
US20050038649A1 (en) 2005-02-17

Similar Documents

Publication Publication Date Title
Turney et al. Measuring praise and criticism: Inference of semantic orientation from association
Harabagiu et al. Topic themes for multi-document summarization
Hiemstra et al. Parsimonious language models for information retrieval
US8972840B2 (en) Time ordered indexing of an information stream
US8190627B2 (en) Machine assisted query formulation
Henzinger et al. Query-free news search
US7290207B2 (en) Systems and methods for providing multimedia information management
CA2607596C (en) System and method for utilizing the content of an online conversation to select advertising content and/or other relevant information for display
JP4664423B2 (en) How to find information that is compatible
CN100405366C (en) System and method for generating refinement categories for a set of search results
US9323738B2 (en) Classification of ambiguous geographic references
US8689113B2 (en) Methods and apparatus for presenting content
US8117211B2 (en) Information processing device and method, and program
US6928407B2 (en) System and method for the automatic discovery of salient segments in speech transcripts
US8706735B2 (en) Method and system for indexing and searching timed media information based upon relevance intervals
US6647383B1 (en) System and method for providing interactive dialogue and iterative search functions to find information
US7292979B2 (en) Time ordered indexing of audio data
KR101122869B1 (en) Annotation management in a pen-based computing system
US6816858B1 (en) System, method and apparatus providing collateral information for a video/audio stream
US7085771B2 (en) System and method for automatically discovering a hierarchy of concepts from a corpus of documents
US9619467B2 (en) Personalization engine for building a dynamic classification dictionary
US20090006345A1 (en) Voice-based search processing
EP1555625A1 (en) Query recognizer
KR101242369B1 (en) Sensing, storing, indexing, and retrieving data leveraging measures of user activity, attention, and interest
US7502780B2 (en) Information storage and retrieval

Legal Events

Date Code Title Description
AS Assignment

Owner name: BBNT SOLUTIONS LLC, TEXAS

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:COLBATH, SEAN;KUBALA, FRANCIS G.;REEL/FRAME:014608/0961;SIGNING DATES FROM 20031001 TO 20031003

AS Assignment

Owner name: FLEET NATIONAL BANK, AS AGENT, MASSACHUSETTS

Free format text: PATENT & TRADEMARK SECURITY AGREEMENT;ASSIGNOR:BBNT SOLUTIONS LLC;REEL/FRAME:014624/0196

Effective date: 20040326

AS Assignment

Owner name: BBNT SOLUTIONS LLC, MASSACHUSETTS

Free format text: CORRECTIVE ASSIGNMENT TO CORRECT THE ASSIGNEE'S ADDRESS, PREVIOUSLY RECORDED AT REEL 014608 FRAME 0961;ASSIGNORS:COLBATH, SEAN;KUBALA, FRANCIS G.;REEL/FRAME:015815/0330;SIGNING DATES FROM 20031001 TO 20031003

AS Assignment

Owner name: BBN TECHNOLOGIES CORP., MASSACHUSETTS

Free format text: MERGER;ASSIGNOR:BBNT SOLUTIONS LLC;REEL/FRAME:017274/0318

Effective date: 20060103

STCB Information on status: application discontinuation

Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION

AS Assignment

Owner name: BBN TECHNOLOGIES CORP. (AS SUCCESSOR BY MERGER TO

Free format text: RELEASE OF SECURITY INTEREST;ASSIGNOR:BANK OF AMERICA, N.A. (SUCCESSOR BY MERGER TO FLEET NATIONAL BANK);REEL/FRAME:023427/0436

Effective date: 20091026