US9628620B1 - Method and system for providing captioned telephone service with automated speech recognition - Google Patents
Method and system for providing captioned telephone service with automated speech recognition
- Publication number
- US9628620B1 (application US15/204,072)
- Authority
- US
- United States
- Prior art keywords
- captioner
- captions
- call
- telephone service
- human
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04M—TELEPHONIC COMMUNICATION
- H04M3/00—Automatic or semi-automatic exchanges
- H04M3/42—Systems providing special services or facilities to subscribers
- H04M3/42391—Systems providing special services or facilities to subscribers where the subscribers are hearing-impaired persons, e.g. telephone devices for the deaf
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/01—Assessment or evaluation of speech recognition systems
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/26—Speech to text systems
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/06—Transformation of speech into a non-audible representation, e.g. speech visualisation or speech processing for tactile aids
- G10L21/10—Transforming into visible information
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04M—TELEPHONIC COMMUNICATION
- H04M11/00—Telephonic communication systems specially adapted for combination with other electrical systems
- H04M11/06—Simultaneous speech and data transmission, e.g. telegraphic transmission over the same conductors
- H04M11/066—Telephone sets adapted for data transmision
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04M—TELEPHONIC COMMUNICATION
- H04M7/00—Arrangements for interconnection between switching centres
- H04M7/006—Networks other than PSTN/ISDN providing telephone service, e.g. Voice over Internet Protocol (VoIP), including next generation networks with a packet-switched transport layer
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04M—TELEPHONIC COMMUNICATION
- H04M2201/00—Electronic components, circuits, software, systems or apparatus used in telephone systems
- H04M2201/38—Displays
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04M—TELEPHONIC COMMUNICATION
- H04M2201/00—Electronic components, circuits, software, systems or apparatus used in telephone systems
- H04M2201/40—Electronic components, circuits, software, systems or apparatus used in telephone systems using speech recognition
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04M—TELEPHONIC COMMUNICATION
- H04M2203/00—Aspects of automatic or semi-automatic exchanges
- H04M2203/55—Aspects of automatic or semi-automatic exchanges related to network data storage and management
- H04M2203/555—Statistics, e.g. about subscribers but not being call statistics
- H04M2203/556—Statistical analysis and interpretation
Definitions
- the present invention relates to telecommunications services for callers who are deaf, hard-of-hearing, or speech impaired, and in particular to captioned telephone service.
- D-HOH-SI deaf, hard-of-hearing, or speech-impaired
- Text-based TRS services allow a D-HOH-SI person to communicate with other people over an existing telecommunications network using devices capable of transmitting and receiving text characters over the telecommunications network. Such devices include the telecommunications device for the deaf (TDD) and the teletypewriter (TTY). Text-based TRS services were well-suited to the bandwidth limitations of subscriber lines of the time. The bandwidth limitations of subscriber lines were also a limiting factor in the widespread use of video telephony.
- VRS video relay services
- D-HOH-SI persons can place video calls to communicate between themselves and with hearing individuals using sign language.
- VRS equipment enables D-HOH-SI persons to talk to Hearing individuals via a sign language interpreter, who uses a conventional telephone at the same time to communicate with the party or parties with whom the D-HOH-SI person wants to communicate.
- the interpretation flow is normally within the same principal language, such as American Sign Language (ASL) to spoken English or spoken Spanish.
- ASL American Sign Language
- Captioned telephone service can be used by people who can use their own voice to speak but need assistance to hear what is being said to them on the other end of a telephone call.
- Captioned telephone service is a telecommunication service that enables people who are hard of hearing, oral deaf, or late-deafened to speak directly to another party on a telephone call.
- a telephone displays substantially in real-time captions of what the hearing party speaks during a conversation. The captions are displayed on a screen embedded in the telephone base.
- Captioned telephone services can be provided in traditional telephone environments as well as in voice-over-internet-protocol (VOIP) environments.
- VOIP voice-over-internet-protocol
- IP-CTS internet protocol caption telephone service
- IP-CTS requires an internet connection to deliver the captions to the user.
- Most users also rely on their regular land-line telephone for the audio portion of the call, but some configurations of IP-CTS allow the use of VOIP to carry the call audio.
- IP-CTS has allowed captioned telephone service to be provided on smartphones and tablets.
- IP-CTS is a relatively new industry that is growing extremely fast. IP-CTS has services paid for by the FCC's TRS fund and delivered by private companies, such as ClearCaptions, LLC, assignee of the present application. IP-CTS is particularly useful to anyone who can use their own voice to speak but who needs assistance to hear or understand what is being said by the other end of the call.
- ASR automated speech recognition
- ASR may not work to caption everyone with an acceptable level of accuracy (i.e., ASR would work with sufficient accuracy for some people's voices, but not for others). What is needed is a way to provide accurate captioned telephone service using automated speech recognition assisted by human captioning.
- Embodiments of the present invention are directed to a method for providing captioned telephone service.
- the method includes initiating a first captioned telephone service call.
- a first set of captions is created using a human captioner.
- a second set of captions is created using an automated speech recognition captioner.
- the first set of captions and the second set of captions are compared using a scoring algorithm.
- the call is continued using only the automated speech recognition captioner.
- the call is continued using a human captioner.
- Alternative embodiments of the present invention are directed to another method for providing captioned telephone service.
- the method includes initiating a first captioned telephone service call.
- captions are created based on words spoken by a first party to the first captioned telephone service call.
- the captions are displayed to a second party to the first captioned telephone service call.
- the accuracy of the captions is measured with respect to the words spoken by the first party.
- the creation of captions using automated speech recognition is continued.
- captions for the remainder of the captioned telephone service call are created using a human captioner.
- Alternative embodiments of the present invention are directed to another method for providing captioned telephone service.
- the method includes initiating a first captioned telephone service call.
- a first set of captions is created using a human captioner.
- a second set of captions is created using an automated speech recognition captioner.
- the first set of captions and the second set of captions are compared using a scoring algorithm.
- a determination is made as to whether the first set of captions is outside of a predetermined range of scores.
- an electronic flag is set, the flag being indicative of the human captioner being in need of corrective action.
- FIG. 1 is a diagram of an exemplary captioned telephone service (CTS) system 100 suited for implementing embodiments of the present invention
- FIG. 2 is a flowchart of an exemplary method 200 for providing captioned telephone service (CTS) in accordance with one or more embodiments of the present invention
- FIG. 3 is a flowchart of an exemplary method 300 for providing captioned telephone service (CTS) in accordance with one or more alternative embodiments of the present invention
- FIG. 4 is a flowchart of an exemplary method 400 for providing captioned telephone service (CTS) in accordance with one or more alternative embodiments of the present invention.
- CTS captioned telephone service
- FIG. 5 is a flowchart of an exemplary method 500 for providing captioned telephone service (CTS) in accordance with one or more embodiments of the present invention.
- Embodiments of the present invention are directed to improved methods and systems for providing captioned telephone service (CTS), including internet protocol caption telephone service (IP-CTS).
- Embodiments of the present invention are further directed to methods and systems for providing accurate captioned telephone service using automated speech recognition assisted by human captioning.
- an automated speech recognition (ASR) software-based captioner is run alongside a live, human captioner for a short period of time. The accuracy levels of the two sets of captions are compared to determine whether the party being captioned is someone for whom ASR is an acceptable service. If the accuracy of ASR captioning is acceptable, then the human captioner's stream is cut off and the ASR captioner takes the call forward. If the accuracy of the ASR stream is below some threshold (or significantly worse than that of the human captioner), then the ASR captioner is cut off and the human captioner handles captioning for the rest of the call.
- Embodiments of the present invention are further directed to creating captions with a live, human captioner and an ASR technology at the same time, comparing the captioning from both sources in real time via a scoring algorithm, and making a decision to continue the call with the ASR technology only if the ASR reaches an acceptable level of ongoing success, as determined, for example, by a service level agreement (SLA).
- SLA service level agreement
- Embodiments of the present invention are directed to storing performance statistics in a database of telephone numbers and/or other individual identifiers that will enable the ASR probability of success decision to be made at the beginning of the call in future versus starting the call with a human captioner. This could include the score across various ASR choices if more than one ASR engine is employed. Stored scoring can also be used to speed the decision. Stored scoring can also be used in conjunction with algorithm calculations of the current call to speed the decision or decide whether to continue to retest during the call and how often retesting should occur.
- Embodiments of the present invention are directed to routing calls to human captioners or ASR technology based on past call history.
- Embodiments of the present invention are directed to applying different ASR technologies in future calls based on historical performance of various ASR technologies for that individual.
- Embodiments of the present invention are directed to a sequential test for accuracy: first, the ASR stream starts and is measured for accuracy; then the call is automatically switched to a human captioner if the ASR captioner is not hitting an established accuracy target.
- Embodiments of the present invention are directed to evaluating the voice of the party being captioned by a technology that will indicate whether the party's voice will be able to be captioned successfully, for example, a probability-based evaluation. For example, if a person speaks a language (e.g., a creole or pidgin language) which is known to cause a particular ASR captioner to produce captions with insufficient accuracy, the system will switch the call immediately to a human captioner. Other tests can be used in addition to comparisons of the human captioner to the ASR captioner. Other tests could include, for example, volume or noise level detection, static detection, echo, etc., to flag that a call will probably need help due to poor audio quality. Geographic data, such as the TN location of the other party, can be used to flag geographical or demographic needs to utilize a human captioner.
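The audio-quality tests mentioned above (volume or noise level detection) could be approximated, as a minimal sketch, by a root-mean-square volume check on the first samples of call audio. The function names, the normalized sample range of -1.0 to 1.0, and the `min_rms` threshold are assumptions made for illustration; the patent does not specify how such a test is computed.

```python
import math
from typing import Sequence


def rms_level(samples: Sequence[float]) -> float:
    """Root-mean-square level of audio samples normalized to [-1.0, 1.0]."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def should_start_with_human(samples: Sequence[float], min_rms: float = 0.02) -> bool:
    """Flag the call for a human captioner when the speaker is too quiet.

    min_rms is a hypothetical threshold; a real system would tune it and
    would likely combine it with noise, static, and echo checks.
    """
    return rms_level(samples) < min_rms
```

A very quiet signal falls below the threshold and is routed to a human captioner, while a speaker at normal volume is left eligible for ASR captioning.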
- FIG. 1 is a diagram of an exemplary captioned telephone service (CTS) system 100 suited for implementing embodiments of the present invention.
- Telephone 108 is communicatively coupled to captioned telephone service manager 102 .
- Telephone 108 can be, but is not limited to, a traditional telephone connected to the public switched telephone network (PSTN), a cellular telephone connected to a cellular telephony network, a voice-over-internet-protocol (VOIP) device connected to an internet protocol (IP) network, or a telephony application executing on a network-connected computing device, such as a smartphone, tablet computer, or personal computer (PC).
- PSTN public switched telephone network
- IP internet protocol
- PC personal computer
- Communicatively coupled refers to communications devices connected by means for sending and receiving voice and/or data communications over wired and wireless communication channels, including, but not limited to, PSTN, cellular telephony networks, private branch exchanges (PBX), and packet switched IP networks.
- PBX private branch exchanges
- Captioned telephone service (CTS) manager 102 is communicatively coupled to telephone 108 , human captioner 104 , automated speech recognition (ASR) captioner 106 , and telephone 110 .
- CTS manager 102 manages the communications between telephone 108 , human captioner 104 , automated speech recognition (ASR) captioner 106 , and telephone 110 .
- CTS manager 102 includes the logic, described in further detail below, for generating captions.
- CTS manager 102 can comprise a discrete electronic component, as shown in FIG. 1 .
- CTS manager 102 can comprise software hosted in one or more of the devices of CTS system 100 .
- CTS manager 102 can comprise a software service hosted on a networked cloud-based server in a software-as-a-service (SaaS) configuration.
- SaaS software-as-a-service
- Human captioner 104 can comprise a human operator 116 and a computer workstation 114 .
- Human operator 116 can listen to the party speaking via telephone 108 .
- Human operator 116 can use computer workstation 114 to create captions of the words spoken by the party speaking via telephone 108 .
- Human operator 116 can create captions by repeating the words spoken into telephone 108 by the party using telephone 108 into an automated speech recognition engine executing on computer workstation 114.
- the automated speech recognition engine executing on computer workstation 114 is trained to recognize speech from human operator 116 and generate text captions based on the speech of human operator 116 .
- the generated text captions are transmitted to telephone 110 by CTS manager 102 .
- Human operator 116 can alternatively create captions by manually transcribing the words spoken into telephone 108 into written text that is transmitted to telephone 110 by CTS manager 102 . Human operator 116 can also use computer workstation 114 to edit the ASR generated captions as necessary to correct errors.
- Telephone 110 includes display 112 adapted for displaying captions received from CTS manager 102. While FIG. 1 shows telephone 110 as a caption-enabled telephony device, telephone 110 could also comprise a software application running on a smartphone, tablet, or PC that is capable of providing captioned telephone service.
- ASR captioner 106 can generate captions directly from the speech of the party using telephone 108 .
- a human operator is not needed to generate captions.
- the automated speech recognition engine executing on computer workstation 114 differs from ASR captioner 106 in that the automated speech recognition engine executing on computer workstation 114 is trained to recognize speech from human operator 116 and human operator 116 is required for human captioner 104 to generate captions.
- FIG. 2 is a flowchart of an exemplary method 200 for providing captioned telephone service (CTS) in accordance with one or more embodiments of the present invention.
- the method starts at step 202 .
- a captioned telephone service call is initiated.
- a person without hearing loss uses telephone 108 to call a person who has some degree of hearing loss for which captioning would be a benefit or necessity for the telephone conversation.
- both parties to the call have hearing loss and captions are generated based on the words spoken by each party. The generated captions are then displayed on the telephone of the non-speaking party.
- captions of the spoken words are created substantially in real-time using human captioner 104 .
- captions of the same spoken words are also created substantially in real-time using ASR captioner 106 . That is, two sets of captions of the same spoken words are simultaneously created by human captioner 104 and ASR captioner 106 , substantially in real-time.
- a first set of captions is created by human captioner 104 .
- a second set of captions is created by ASR captioner 106 .
- the captions created at step 206 by human captioner 104 are transmitted for display on telephone 110 .
- the captions created using human captioner 104 are compared to the captions created using ASR captioner 106 . That is, the set of captions created by ASR captioner 106 is compared to the set of captions created by human captioner 104 .
- the comparison can be made using a scoring algorithm.
- the scoring algorithm assigns a score to the set of captions created by ASR captioner 106 based on the number of captions that are different from the set of captions created by human captioner 104 .
- the set of captions created by human captioner 104 is presumed to be the more accurate set of captions because there is human oversight in the generation of the captions by human captioner 104 .
- the determination can be made by CTS manager 102 .
- the determination can be based on an acceptable level of accuracy as defined in a service level agreement (SLA).
- the determination can be made by determining differences between the set of captions created by ASR captioner 106 and the set of captions created by human captioner 104 . If the differences between the two sets of captions are less than a predetermined threshold of differences, then the captions created by ASR captioner 106 are sufficiently accurate to continue the call without using human captioner 104 .
- Otherwise, the captions created by ASR captioner 106 are not sufficiently accurate to continue the call without using human captioner 104. If a scoring algorithm is used to make the comparison in step 208, then the accuracy determination can be made based on the score of the captions created by ASR captioner 106 being within a predetermined range of scores that is indicative of sufficient accuracy. If the score of the set of captions generated by ASR captioner 106 is within the predetermined range of scores, then the captions created by ASR captioner 106 are sufficiently accurate to continue the call without using human captioner 104. If the score of the set of captions generated by ASR captioner 106 is not within the predetermined range of scores, then the captions created by ASR captioner 106 are not sufficiently accurate to continue the call without using human captioner 104.
- the method proceeds to step 214 .
- the call is continued without using human captioner 104 . That is, the call is continued with captions being generated only by the ASR captioner 106 . Continuing the call with captions being generated only by the ASR captioner 106 frees up human captioner 104 to service another CTS call. This reduces demand for human captioners 104 in a particular CTS provider's call center, enabling fewer human captioners 104 to service a greater number of calls.
- the method proceeds to step 212 .
- the call is continued using human captioner 104 to generate captions and the generation of captions from ASR captioner 106 can be discontinued.
- Continuing the call using human captioner 104 to generate captions ensures that the captions are within an acceptable level of accuracy in situations where the captions generated by the ASR captioner 106 are not sufficiently accurate (e.g., noisy line quality, quiet speaker, language or dialect not recognized, etc.).
- the generation of captions from ASR captioner 106 can be continued, and the accuracy continuously monitored.
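The comparison and decision steps of method 200 can be sketched as follows. This is an illustration only: the patent does not disclose a specific scoring algorithm, so the word-level alignment from Python's `difflib.SequenceMatcher`, the function names, and the `min_ratio` threshold are all assumptions. The human captions are treated as the reference, as the description presumes.

```python
from difflib import SequenceMatcher


def word_match_ratio(human_caption: str, asr_caption: str) -> float:
    """Fraction of human-caption words that the ASR caption reproduces in order."""
    human_words = human_caption.lower().split()
    asr_words = asr_caption.lower().split()
    if not human_words:
        # Nothing to compare against: perfect only if ASR also produced nothing.
        return 1.0 if not asr_words else 0.0
    matcher = SequenceMatcher(None, human_words, asr_words, autojunk=False)
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return matched / len(human_words)


def choose_captioner(human_caption: str, asr_caption: str, min_ratio: float = 0.9) -> str:
    """Return "asr" to continue with ASR only, or "human" to keep the human captioner.

    min_ratio is a hypothetical stand-in for the accuracy level that a
    service level agreement might define.
    """
    if word_match_ratio(human_caption, asr_caption) >= min_ratio:
        return "asr"
    return "human"
```

For example, if the human caption is "please call me back tomorrow morning" and the ASR caption is "please fall me bat tomorrow morning", four of the six reference words match, so the ratio is about 0.67 and the call stays with the human captioner.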
- Performance statistics can be stored in a database of telephone numbers and/or other identifiers. Performance statistics can include, but are not limited to, a percent word match between the ASR captioner and the human captioner, an average number of words not matched per minute of call audio, words per minute of call duration (which could strain either the ASR captioner or the human captioner), and a percent of time speaking versus dead air time on the call. Performance statistics can be used to make a probability of success determination with respect to using ASR captioner 106 at the beginning of the call instead of starting the call using human captioner 104 . Such an embodiment is shown in FIG. 3 .
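The four performance statistics listed above can be computed from simple per-call counts. The sketch below assumes the counts are already available; the `CallStats` record, its field names, and the `compute_call_stats` signature are hypothetical and are not taken from the patent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CallStats:
    percent_word_match: float          # ASR words matching the human captions
    unmatched_words_per_minute: float  # words not matched per minute of call audio
    words_per_minute: float            # speaking rate over the call duration
    percent_time_speaking: float       # speaking time versus dead air


def compute_call_stats(
    human_word_count: int,
    matched_word_count: int,
    call_seconds: float,
    speaking_seconds: float,
) -> CallStats:
    """Derive per-call statistics that could be stored against a telephone number."""
    if call_seconds <= 0:
        raise ValueError("call_seconds must be positive")
    minutes = call_seconds / 60.0
    unmatched = human_word_count - matched_word_count
    percent_match = (
        100.0 * matched_word_count / human_word_count if human_word_count else 0.0
    )
    return CallStats(
        percent_word_match=percent_match,
        unmatched_words_per_minute=unmatched / minutes,
        words_per_minute=human_word_count / minutes,
        percent_time_speaking=100.0 * speaking_seconds / call_seconds,
    )
```

A two-minute call with 300 captioned words, 270 of them matched by the ASR captioner, and 90 seconds of speech yields a 90% word match, 15 unmatched words per minute, 150 words per minute, and 75% speaking time.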
- FIG. 3 is a flowchart of an exemplary method 300 for providing captioned telephone service (CTS) in accordance with one or more alternative embodiments of the present invention.
- the method starts at step 302 .
- a captioned telephone service call is initiated in a manner similar to the manner described with respect to step 204 of FIG. 2.
- a determination is made based on stored data whether to use human captioner 104 or ASR captioner 106 to generate captions for the call. The determination can be made by CTS manager 102 .
- Stored data can comprise performance statistics based on previous calls associated with a telephone number or other individual identifiers.
- Stored data can also be used to select among ASR captioners using different automated speech recognition technology. That is, if one ASR engine is known, based on historical performance, to produce better captions than other ASR engines used by the CTS service for a particular individual or telephone number, then that ASR engine can be selected as the ASR captioner 106 .
- At step 308, if the stored data indicates that ASR captioner 106 can be used to generate captions for the call, the method proceeds to step 312.
- ASR captioner 106 is used to generate captions for the call.
- At step 308, if the stored data indicates that ASR captioner 106 cannot be used to generate captions for the call, the method proceeds to step 314.
- human captioner 104 is used to generate captions for the call. Making the determination whether to use human captioner 104 before creating captions with human captioner 104 reduces the load on human captioners in a call center. That is, if stored data indicates in advance that ASR captioner 106 should be sufficiently accurate for the call, then human captioner 104 is not used for the call at all, enabling human captioner 104 to service calls for which a human captioner is needed.
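The routing decision of method 300, including the choice among several ASR engines, might look like the sketch below. The in-memory dictionary keyed by telephone number, the engine names, and the `min_match` threshold are assumptions for illustration; the patent only states that stored performance statistics drive the decision. Sending callers with no history to a human captioner is likewise a choice made here, not a rule from the source.

```python
from typing import Dict, Optional, Tuple

# Hypothetical store: telephone number -> {ASR engine name -> average percent word match}
History = Dict[str, Dict[str, float]]


def route_call(
    history: History, phone_number: str, min_match: float = 90.0
) -> Tuple[str, Optional[str]]:
    """Pick a captioner at the start of a call from stored statistics.

    Returns ("asr", engine_name) when the best-performing engine for this
    number meets the threshold, otherwise ("human", None).
    """
    engine_scores = history.get(phone_number)
    if not engine_scores:
        # No history for this caller: fall back to a human captioner.
        return ("human", None)
    best_engine = max(engine_scores, key=engine_scores.get)
    if engine_scores[best_engine] >= min_match:
        return ("asr", best_engine)
    return ("human", None)


history: History = {
    "+15550100": {"engine_a": 93.5, "engine_b": 88.0},
    "+15550101": {"engine_a": 71.0},
}
```

With this sample data, a call involving "+15550100" is routed to `engine_a`, while "+15550101" and any unknown number are routed to a human captioner.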
- FIG. 4 is a flowchart of an exemplary method 400 for providing captioned telephone service (CTS) in accordance with one or more alternative embodiments of the present invention.
- the method starts at step 402 .
- a captioned telephone service call is initiated in a manner similar to the manner described with respect to step 204 of FIG. 2.
- captions are created using ASR captioner 106.
- ASR captioner 106 creates captions for the predetermined set of words.
- CTS manager 102 compares the captions created by ASR captioner 106 to the predetermined set of words and determines the accuracy of the captions created by ASR captioner 106 . If the captions created by ASR captioner 106 are sufficiently accurate, then the method proceeds to step 412 and the call is continued using ASR captioner 106 . If the captions created by ASR captioner 106 are not sufficiently accurate, then the method proceeds to step 410 and the call is continued using human captioner 104 . Making the determination whether to use human captioner 104 before creating captions with human captioner 104 reduces the load on human captioners in a call center.
- human captioner 104 is not used for the call at all, enabling human captioner 104 to service calls for which a human captioner is needed.
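The accuracy check of method 400, in which ASR captions are compared with a predetermined set of words, could be sketched as a word-overlap test. The use of a multiset overlap via `collections.Counter`, the function name, and the `min_accuracy` threshold are assumptions; the patent does not say how the comparison is scored.

```python
from collections import Counter
from typing import Sequence


def next_captioner(
    expected_words: Sequence[str], asr_caption: str, min_accuracy: float = 0.9
) -> str:
    """Return "asr" (continue as in step 412) or "human" (switch as in step 410).

    Accuracy is the share of expected words that also appear in the ASR
    caption, counting repeated words at most as often as they are expected.
    """
    if not expected_words:
        raise ValueError("expected_words must not be empty")
    expected = Counter(word.lower() for word in expected_words)
    produced = Counter(asr_caption.lower().split())
    hits = sum((expected & produced).values())
    accuracy = hits / sum(expected.values())
    return "asr" if accuracy >= min_accuracy else "human"
```

If the expected words are "the quick brown fox" and the ASR captioner produces "the quick crown box", only two of four words match, so the call would continue with a human captioner.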
- FIG. 5 is a flowchart of an exemplary method 500 for providing captioned telephone service (CTS) in accordance with one or more embodiments of the present invention.
- the method starts at step 502 .
- a captioned telephone service call is initiated.
- a person without hearing loss uses telephone 108 to call a person who has some degree of hearing loss for which captioning would be a benefit or necessity for the telephone conversation.
- both parties to the call have hearing loss and captions are generated based on the words spoken by each party. The generated captions are then displayed on the telephone of the non-speaking party.
- captions of the spoken words are created substantially in real-time using human captioner 104 .
- captions of the same spoken words are also created substantially in real-time using ASR captioner 106 . That is, two sets of captions of the same spoken words are simultaneously created by human captioner 104 and ASR captioner 106 , substantially in real-time.
- a first set of captions is created by human captioner 104 .
- a second set of captions is created by ASR captioner 106 .
- the captions created at step 206 by human captioner 104 are transmitted for display on telephone 110 .
- the captions created using human captioner 104 are compared to the captions created using ASR captioner 106 . That is, the set of captions created by ASR captioner 106 is compared to the set of captions created by human captioner 104 .
- the comparison can be made using a scoring algorithm.
- the scoring algorithm assigns a score to the set of captions created by ASR captioner 106 based on the number of captions that are different from the set of captions created by human captioner 104 .
- the set of captions created by human captioner 104 is presumed to be the more accurate set of captions because there is human oversight in the generation of the captions by human captioner 104 .
- the determination can be made by CTS manager 102 .
- the determination can be based on an acceptable level of accuracy as defined in a service level agreement (SLA).
- the determination can be made by determining differences between the set of captions created by ASR captioner 106 and the set of captions created by human captioner 104 . If the differences between the two sets of captions are less than a predetermined threshold of differences, then the captions created by human captioner 104 are sufficiently accurate such that human captioner 104 is not in need of corrective action.
- Corrective action can include additional training for human captioner 104. If a scoring algorithm is used to make the comparison in step 208, then the accuracy determination can be made based on the score of the captions created by human captioner 104 being within a predetermined range of scores that is indicative of sufficient accuracy. If the score of the set of captions generated by human captioner 104 is within the predetermined range of scores, then the captions created by human captioner 104 are sufficiently accurate to indicate that human captioner 104 is not in need of corrective action. If the score of the set of captions generated by human captioner 104 is not within the predetermined range of scores, then the captions created by human captioner 104 are not sufficiently accurate and human captioner 104 is in need of corrective action.
- At step 512, an electronic flag is set, for example by CTS manager 102.
- the electronic flag is indicative of human captioner 104 being in need of corrective action.
- Corrective action can include providing additional training to human captioner 104 .
- Corrective action can include terminating the employment of human captioner 104 .
- Corrective action can include placing human captioner 104 on a probation period. The probation period can be a period of time during which human captioner 104 must improve the accuracy of the captions he or she generates.
- an electronic communication can be sent, for example by CTS manager 102 , to a person responsible for initiating the corrective action.
- the person responsible for initiating the corrective action can be a manager within the call center in which human captioner 104 works.
- the electronic communication can be in the form of an automated email, automated SMS message, or the like.
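The flagging logic of method 500 can be sketched as a range check on the human captioner's score followed by composing a notice for the responsible manager. The `CaptionerReview` record, the default score range, and the wording of the notice are hypothetical; actually sending an email or SMS message is left out.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class CaptionerReview:
    captioner_id: str
    score: float
    flagged: bool           # the "electronic flag" for corrective action
    notice: Optional[str]   # text that could be sent to a call center manager


def review_human_captioner(
    captioner_id: str, score: float, score_range: Tuple[float, float] = (0.85, 1.0)
) -> CaptionerReview:
    """Set a flag when the score falls outside the predetermined range."""
    low, high = score_range
    flagged = not (low <= score <= high)
    notice = None
    if flagged:
        notice = (
            f"Captioner {captioner_id} scored {score:.2f}, outside the range "
            f"{low:.2f} to {high:.2f}; corrective action may be needed."
        )
    return CaptionerReview(captioner_id, score, flagged, notice)
```

A score of 0.72 against the assumed range of 0.85 to 1.0 sets the flag and produces a notice, while a score of 0.93 leaves the flag unset.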
Landscapes
- Engineering & Computer Science (AREA)
- Signal Processing (AREA)
- Human Computer Interaction (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Computer Networks & Wireless Communication (AREA)
- Data Mining & Analysis (AREA)
- Quality & Reliability (AREA)
- Telephonic Communication Services (AREA)
Abstract
Description
Claims (20)
Priority Applications (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US15/204,072 US9628620B1 (en) | 2016-07-07 | 2016-07-07 | Method and system for providing captioned telephone service with automated speech recognition |
| US15/489,357 US10044854B2 (en) | 2016-07-07 | 2017-04-17 | Method and system for providing captioned telephone service with automated speech recognition |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US15/204,072 US9628620B1 (en) | 2016-07-07 | 2016-07-07 | Method and system for providing captioned telephone service with automated speech recognition |
Related Child Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US15/489,357 Continuation-In-Part US10044854B2 (en) | 2016-07-07 | 2017-04-17 | Method and system for providing captioned telephone service with automated speech recognition |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| US9628620B1 (en) | 2017-04-18 |
Family
ID=58772093
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US15/204,072 Active US9628620B1 (en) | 2016-07-07 | 2016-07-07 | Method and system for providing captioned telephone service with automated speech recognition |
Country Status (1)
| Country | Link |
|---|---|
| US (1) | US9628620B1 (en) |
Cited By (21)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20170206808A1 (en) * | 2014-02-28 | 2017-07-20 | Ultratec, Inc. | Semiautomated Relay Method and Apparatus |
| US10057256B1 (en) | 2017-08-02 | 2018-08-21 | Chris Talbot | Method and system for visually authenticating the identity of a caller using a video relay service |
| US10122968B1 (en) | 2017-08-30 | 2018-11-06 | Chris Talbot | Method and system for using a video relay service with deaf, hearing-impaired or speech-impaired called parties |
| US10129505B2 (en) | 2016-10-27 | 2018-11-13 | Chris Talbot | Method and system for providing a visual indication that a video relay service call originates from an inmate at a corrections facility |
| US10192554B1 (en) | 2018-02-26 | 2019-01-29 | Sorenson Ip Holdings, Llc | Transcription of communications using multiple speech recognition systems |
| US10388272B1 (en) | 2018-12-04 | 2019-08-20 | Sorenson Ip Holdings, Llc | Training speech recognition systems using word sequences |
| US10389876B2 (en) | 2014-02-28 | 2019-08-20 | Ultratec, Inc. | Semiautomated relay method and apparatus |
| US10423237B2 (en) | 2016-08-15 | 2019-09-24 | Purple Communications, Inc. | Gesture-based control and usage of video relay service communications |
| US10547813B2 (en) | 2017-12-29 | 2020-01-28 | Chris Talbot | Method and system for enabling automated audio keyword monitoring with video relay service calls |
| US10573312B1 (en) | 2018-12-04 | 2020-02-25 | Sorenson Ip Holdings, Llc | Transcription generation from multiple speech recognition systems |
| US10708419B1 (en) | 2019-06-17 | 2020-07-07 | Chris Talbot | Method and system for rating multiple call destination types from a single video relay kiosk in a corrections facility |
| US10748523B2 (en) | 2014-02-28 | 2020-08-18 | Ultratec, Inc. | Semiautomated relay method and apparatus |
| US10917519B2 (en) | 2014-02-28 | 2021-02-09 | Ultratec, Inc. | Semiautomated relay method and apparatus |
| US10984229B2 (en) | 2018-10-11 | 2021-04-20 | Chris Talbot | Interactive sign language response system and method |
| US10992793B1 (en) | 2020-03-07 | 2021-04-27 | Eugenious Enterprises, LLC | Telephone system for the hearing impaired |
| US11017778B1 (en) * | 2018-12-04 | 2021-05-25 | Sorenson Ip Holdings, Llc | Switching between speech recognition systems |
| US11170761B2 (en) | 2018-12-04 | 2021-11-09 | Sorenson Ip Holdings, Llc | Training of speech recognition systems |
| US11445056B1 (en) | 2020-03-07 | 2022-09-13 | Eugenious Enterprises, LLC | Telephone system for the hearing impaired |
| US11488604B2 (en) | 2020-08-19 | 2022-11-01 | Sorenson Ip Holdings, Llc | Transcription of audio |
| US11539900B2 (en) | 2020-02-21 | 2022-12-27 | Ultratec, Inc. | Caption modification and augmentation systems and methods for use by hearing assisted user |
| US11664029B2 (en) | 2014-02-28 | 2023-05-30 | Ultratec, Inc. | Semiautomated relay method and apparatus |
Citations (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20110040559A1 (en) * | 2009-08-17 | 2011-02-17 | At&T Intellectual Property I, L.P. | Systems, computer-implemented methods, and tangible computer-readable storage media for transcription alignment |
| US20150341486A1 (en) * | 2014-05-22 | 2015-11-26 | Voiceriver, Inc. | Adaptive Telephone Relay Service Systems |
| US20150371631A1 (en) * | 2014-06-23 | 2015-12-24 | Google Inc. | Caching speech recognition scores |
| US9396180B1 (en) * | 2013-01-29 | 2016-07-19 | Amazon Technologies, Inc. | System and method for analyzing video content and presenting information corresponding to video content to users |
- 2016-07-07: US application US15/204,072, patent US9628620B1 (en), status Active
Patent Citations (9)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20110040559A1 (en) * | 2009-08-17 | 2011-02-17 | At&T Intellectual Property I, L.P. | Systems, computer-implemented methods, and tangible computer-readable storage media for transcription alignment |
| US8843368B2 (en) * | 2009-08-17 | 2014-09-23 | At&T Intellectual Property I, L.P. | Systems, computer-implemented methods, and tangible computer-readable storage media for transcription alignment |
| US20150046160A1 (en) * | 2009-08-17 | 2015-02-12 | At&T Intellectual Property I, L.P. | Systems, Computer-Implemented Methods, and Tangible Computer-Readable Storage Media For Transcription Alighnment |
| US9305552B2 (en) * | 2009-08-17 | 2016-04-05 | At&T Intellectual Property I, L.P. | Systems, computer-implemented methods, and tangible computer-readable storage media for transcription alignment |
| US20160198234A1 (en) * | 2009-08-17 | 2016-07-07 | At&T Intellectual Property I, L.P. | Systems, computer-implemented methods, and tangible computer-readable storage media for transcription alignment |
| US9495964B2 (en) * | 2009-08-17 | 2016-11-15 | At&T Intellectual Property I, L.P. | Systems, computer-implemented methods, and tangible computer-readable storage media for transcription alignment |
| US9396180B1 (en) * | 2013-01-29 | 2016-07-19 | Amazon Technologies, Inc. | System and method for analyzing video content and presenting information corresponding to video content to users |
| US20150341486A1 (en) * | 2014-05-22 | 2015-11-26 | Voiceriver, Inc. | Adaptive Telephone Relay Service Systems |
| US20150371631A1 (en) * | 2014-06-23 | 2015-12-24 | Google Inc. | Caching speech recognition scores |
Cited By (46)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US10542141B2 (en) | 2014-02-28 | 2020-01-21 | Ultratec, Inc. | Semiautomated relay method and apparatus |
| US20170206808A1 (en) * | 2014-02-28 | 2017-07-20 | Ultratec, Inc. | Semiautomated Relay Method and Apparatus |
| US11627221B2 (en) | 2014-02-28 | 2023-04-11 | Ultratec, Inc. | Semiautomated relay method and apparatus |
| US10917519B2 (en) | 2014-02-28 | 2021-02-09 | Ultratec, Inc. | Semiautomated relay method and apparatus |
| US12400660B2 (en) | 2014-02-28 | 2025-08-26 | Ultratec, Inc. | Semiautomated relay method and apparatus |
| US12136425B2 (en) | 2014-02-28 | 2024-11-05 | Ultratec, Inc. | Semiautomated relay method and apparatus |
| US10389876B2 (en) | 2014-02-28 | 2019-08-20 | Ultratec, Inc. | Semiautomated relay method and apparatus |
| US12136426B2 (en) | 2014-02-28 | 2024-11-05 | Ultratec, Inc. | Semiautomated relay method and apparatus |
| US10878721B2 (en) * | 2014-02-28 | 2020-12-29 | Ultratec, Inc. | Semiautomated relay method and apparatus |
| US12137183B2 (en) | 2014-02-28 | 2024-11-05 | Ultratec, Inc. | Semiautomated relay method and apparatus |
| US10748523B2 (en) | 2014-02-28 | 2020-08-18 | Ultratec, Inc. | Semiautomated relay method and apparatus |
| US11368581B2 (en) | 2014-02-28 | 2022-06-21 | Ultratec, Inc. | Semiautomated relay method and apparatus |
| US10742805B2 (en) | 2014-02-28 | 2020-08-11 | Ultratec, Inc. | Semiautomated relay method and apparatus |
| US11741963B2 (en) | 2014-02-28 | 2023-08-29 | Ultratec, Inc. | Semiautomated relay method and apparatus |
| US11664029B2 (en) | 2014-02-28 | 2023-05-30 | Ultratec, Inc. | Semiautomated relay method and apparatus |
| US10423237B2 (en) | 2016-08-15 | 2019-09-24 | Purple Communications, Inc. | Gesture-based control and usage of video relay service communications |
| US10531041B2 (en) | 2016-10-27 | 2020-01-07 | Chris Talbot | Method and system for providing a visual indication that a video relay service call originates from an inmate at a corrections facility |
| US10887547B2 (en) | 2016-10-27 | 2021-01-05 | Chris Talbot | Method and system for providing a visual indication that a video relay service call originates from an inmate at a corrections facility |
| US10129505B2 (en) | 2016-10-27 | 2018-11-13 | Chris Talbot | Method and system for providing a visual indication that a video relay service call originates from an inmate at a corrections facility |
| US11611721B2 (en) | 2016-10-27 | 2023-03-21 | Chris Talbot | Method and system for providing a visual indication that a video relay service call originates from an inmate at a corrections facility |
| US10057256B1 (en) | 2017-08-02 | 2018-08-21 | Chris Talbot | Method and system for visually authenticating the identity of a caller using a video relay service |
| US10122968B1 (en) | 2017-08-30 | 2018-11-06 | Chris Talbot | Method and system for using a video relay service with deaf, hearing-impaired or speech-impaired called parties |
| US10547813B2 (en) | 2017-12-29 | 2020-01-28 | Chris Talbot | Method and system for enabling automated audio keyword monitoring with video relay service calls |
| US10192554B1 (en) | 2018-02-26 | 2019-01-29 | Sorenson Ip Holdings, Llc | Transcription of communications using multiple speech recognition systems |
| WO2019164574A1 (en) * | 2018-02-26 | 2019-08-29 | Sorenson Ip Holdings, Llc | Transcription of communications |
| US11710488B2 (en) | 2018-02-26 | 2023-07-25 | Sorenson Ip Holdings, Llc | Transcription of communications using multiple speech recognition systems |
| US10984229B2 (en) | 2018-10-11 | 2021-04-20 | Chris Talbot | Interactive sign language response system and method |
| US10672383B1 (en) | 2018-12-04 | 2020-06-02 | Sorenson Ip Holdings, Llc | Training speech recognition systems using word sequences |
| US10573312B1 (en) | 2018-12-04 | 2020-02-25 | Sorenson Ip Holdings, Llc | Transcription generation from multiple speech recognition systems |
| US20210233530A1 (en) * | 2018-12-04 | 2021-07-29 | Sorenson Ip Holdings, Llc | Transcription generation from multiple speech recognition systems |
| US12380877B2 (en) | 2018-12-04 | 2025-08-05 | Sorenson Ip Holdings, Llc | Training of speech recognition systems |
| US10388272B1 (en) | 2018-12-04 | 2019-08-20 | Sorenson Ip Holdings, Llc | Training speech recognition systems using word sequences |
| US11594221B2 (en) * | 2018-12-04 | 2023-02-28 | Sorenson Ip Holdings, Llc | Transcription generation from multiple speech recognition systems |
| US20220028397A1 (en) * | 2018-12-04 | 2022-01-27 | Sorenson Ip Holdings, Llc | Switching between speech recognition systems |
| US10971153B2 (en) * | 2018-12-04 | 2021-04-06 | Sorenson Ip Holdings, Llc | Transcription generation from multiple speech recognition systems |
| US11145312B2 (en) * | 2018-12-04 | 2021-10-12 | Sorenson Ip Holdings, Llc | Switching between speech recognition systems |
| US11170761B2 (en) | 2018-12-04 | 2021-11-09 | Sorenson Ip Holdings, Llc | Training of speech recognition systems |
| US11017778B1 (en) * | 2018-12-04 | 2021-05-25 | Sorenson Ip Holdings, Llc | Switching between speech recognition systems |
| US11935540B2 (en) * | 2018-12-04 | 2024-03-19 | Sorenson Ip Holdings, Llc | Switching between speech recognition systems |
| US10708419B1 (en) | 2019-06-17 | 2020-07-07 | Chris Talbot | Method and system for rating multiple call destination types from a single video relay kiosk in a corrections facility |
| US12035070B2 (en) | 2020-02-21 | 2024-07-09 | Ultratec, Inc. | Caption modification and augmentation systems and methods for use by hearing assisted user |
| US11539900B2 (en) | 2020-02-21 | 2022-12-27 | Ultratec, Inc. | Caption modification and augmentation systems and methods for use by hearing assisted user |
| US10992793B1 (en) | 2020-03-07 | 2021-04-27 | Eugenious Enterprises, LLC | Telephone system for the hearing impaired |
| US11700325B1 (en) | 2020-03-07 | 2023-07-11 | Eugenious Enterprises LLC | Telephone system for the hearing impaired |
| US11445056B1 (en) | 2020-03-07 | 2022-09-13 | Eugenious Enterprises, LLC | Telephone system for the hearing impaired |
| US11488604B2 (en) | 2020-08-19 | 2022-11-01 | Sorenson Ip Holdings, Llc | Transcription of audio |
Similar Documents
| Publication | Title | Publication Date |
|---|---|---|
| US10044854B2 (en) | Method and system for providing captioned telephone service with automated speech recognition | |
| US9628620B1 (en) | Method and system for providing captioned telephone service with automated speech recognition | |
| US11114091B2 (en) | Method and system for processing audio communications over a network | |
| US7653543B1 (en) | Automatic signal adjustment based on intelligibility | |
| EP1331797B1 (en) | Communication system for hearing-impaired persons comprising speech to text conversion terminal | |
| US9571638B1 (en) | Segment-based queueing for audio captioning | |
| US8489397B2 (en) | Method and device for providing speech-to-text encoding and telephony service | |
| US7315612B2 (en) | Systems and methods for facilitating communications involving hearing-impaired parties | |
| US7657005B2 (en) | System and method for identifying telephone callers | |
| US8515025B1 (en) | Conference call voice-to-name matching | |
| US8849666B2 (en) | Conference call service with speech processing for heavily accented speakers | |
| CN109873907B (en) | Call processing method, device, computer equipment and storage medium | |
| US20110282669A1 (en) | Estimating a Listener's Ability To Understand a Speaker, Based on Comparisons of Their Styles of Speech | |
| US8737581B1 (en) | Pausing a live teleconference call | |
| US9444934B2 (en) | Speech to text training method and system | |
| US20210250441A1 (en) | Captioned Telephone Services Improvement | |
| JP2007108742A (en) | Bidirectional telephonic communication trainer and exerciser | |
| EP2973559B1 (en) | Audio transmission channel quality assessment | |
| US9491293B2 (en) | Speech analytics: conversation timing and adjustment | |
| CN102932561B (en) | For the system and method for real-time listening sound | |
| US10547813B2 (en) | Method and system for enabling automated audio keyword monitoring with video relay service calls | |
| KR20110106844A (en) | Dialogue voice quality evaluation method between nodes of communication network, voice quality test method between nodes of communication network and dialog voice quality test apparatus between nodes of communication network | |
| US20100142683A1 (en) | Method and apparatus for providing video relay service assisted calls with reduced bandwidth | |
| EP2693429A1 (en) | System and method for analyzing voice communications | |
| US10462286B2 (en) | Systems and methods for deriving contact names |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | AS | Assignment | Owner name: CLEARCAPTIONS, LLC, CALIFORNIA Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:RAE, ROBERT LEE;REEVE, BLAINE MICHAEL;REEL/FRAME:039827/0598 Effective date: 20160801 |
| | STCF | Information on status: patent grant | Free format text: PATENTED CASE |
| | AS | Assignment | Owner name: ACF FINCO I LP, NEW YORK Free format text: SECURITY INTEREST;ASSIGNOR:CLEARCAPTIONS LLC;REEL/FRAME:043282/0951 Effective date: 20170811 |
| | AS | Assignment | Owner name: WESTERN ALLIANCE BANK, AN ARIZONA CORPORATION, CAL Free format text: SECURITY INTEREST;ASSIGNOR:CLEARCAPTIONS LLC;REEL/FRAME:047612/0708 Effective date: 20181128 |
| | MAFP | Maintenance fee payment | Free format text: PAYMENT OF MAINTENANCE FEE, 4TH YR, SMALL ENTITY (ORIGINAL EVENT CODE: M2551); ENTITY STATUS OF PATENT OWNER: SMALL ENTITY Year of fee payment: 4 |
| | AS | Assignment | Owner name: CLEARCAPTIONS LLC, CALIFORNIA Free format text: RELEASE BY SECURED PARTY;ASSIGNOR:ACF FINCO I LP;REEL/FRAME:062672/0265 Effective date: 20230208 |
| | MAFP | Maintenance fee payment | Free format text: PAYMENT OF MAINTENANCE FEE, 8TH YR, SMALL ENTITY (ORIGINAL EVENT CODE: M2552); ENTITY STATUS OF PATENT OWNER: SMALL ENTITY Year of fee payment: 8 |