US20170078463A1 - Automatic volume control of a voice signal provided to a captioning communication service - Google Patents

Automatic volume control of a voice signal provided to a captioning communication service

Publication number
US20170078463A1
Authority
US
United States
Prior art keywords
voice signal
far
communication device
signal
echo
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
US15/194,332
Other versions
US10574804B2 (en)
Inventor
Jeffrey C. Bullough
Shane A. Roylance
Brian Chevrier
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sorenson IP Holdings LLC
Original Assignee
Sorenson IP Holdings LLC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Assigned to CAPTIONCALL LLC reassignment CAPTIONCALL LLC ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: BULLOUGH, JEFFREY CHARLES, CHEVRIER, BRIAN, ROYLANCE, SHANE A.
Priority to US15/194,332
Application filed by Sorenson IP Holdings LLC
Assigned to SORENSON IP HOLDINGS, LLC reassignment SORENSON IP HOLDINGS, LLC ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: CAPTIONCALL, LLC
Publication of US20170078463A1
Assigned to JPMORGAN CHASE BANK, N.A. reassignment JPMORGAN CHASE BANK, N.A. SECURITY INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: CAPTIONCALL, LLC, SORENSON IP HOLDINGS, LLC
Assigned to U.S. BANK NATIONAL ASSOCIATION reassignment U.S. BANK NATIONAL ASSOCIATION SECURITY INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: CAPTIONCALL, LLC, SORENSON IP HOLDINGS, LLC
Assigned to CREDIT SUISSE AG, CAYMAN ISLANDS BRANCH, AS COLLATERAL AGENT reassignment CREDIT SUISSE AG, CAYMAN ISLANDS BRANCH, AS COLLATERAL AGENT PATENT SECURITY AGREEMENT Assignors: CAPTIONCALL, LLC, SORENSEN COMMUNICATIONS, LLC
Assigned to SORENSON IP HOLDINGS, LLC, CAPTIONCALL, LLC, SORENSON COMMUNICATIONS, LLC, INTERACTIVECARE, LLC reassignment SORENSON IP HOLDINGS, LLC RELEASE BY SECURED PARTY (SEE DOCUMENT FOR DETAILS). Assignors: JPMORGAN CHASE BANK, N.A.
Assigned to SORENSON COMMUNICATIONS, LLC, INTERACTIVECARE, LLC, CAPTIONCALL, LLC, SORENSON IP HOLDINGS, LLC reassignment SORENSON COMMUNICATIONS, LLC RELEASE BY SECURED PARTY (SEE DOCUMENT FOR DETAILS). Assignors: U.S. BANK NATIONAL ASSOCIATION
Assigned to CORTLAND CAPITAL MARKET SERVICES LLC reassignment CORTLAND CAPITAL MARKET SERVICES LLC LIEN (SEE DOCUMENT FOR DETAILS). Assignors: CAPTIONCALL, LLC, SORENSON COMMUNICATIONS, LLC
Publication of US10574804B2
Application granted
Assigned to CREDIT SUISSE AG, CAYMAN ISLANDS BRANCH reassignment CREDIT SUISSE AG, CAYMAN ISLANDS BRANCH JOINDER NO. 1 TO THE FIRST LIEN PATENT SECURITY AGREEMENT Assignors: SORENSON IP HOLDINGS, LLC
Assigned to SORENSON COMMUNICATIONS, LLC, CAPTIONCALL, LLC reassignment SORENSON COMMUNICATIONS, LLC RELEASE BY SECURED PARTY (SEE DOCUMENT FOR DETAILS). Assignors: CORTLAND CAPITAL MARKET SERVICES LLC
Active

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M1/00 Substation equipment, e.g. for use by subscribers
    • H04M1/72 Mobile telephones; Cordless telephones, i.e. devices for establishing wireless links to base stations without route selection
    • H04M1/724 User interfaces specially adapted for cordless or mobile telephones
    • H04M1/72475 User interfaces specially adapted for cordless or mobile telephones specially adapted for disabled users
    • H04M1/72478 User interfaces specially adapted for cordless or mobile telephones specially adapted for disabled users for hearing-impaired users
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M1/00 Substation equipment, e.g. for use by subscribers
    • H04M1/247 Telephone sets including user guidance or feature selection means facilitating their use
    • H04M1/2474 Telephone terminals specially adapted for disabled people
    • H04M1/2475 Telephone terminals specially adapted for disabled people for a hearing impaired user
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M1/00 Substation equipment, e.g. for use by subscribers
    • H04M1/60 Substation equipment, e.g. for use by subscribers including speech amplifiers
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M3/00 Automatic or semi-automatic exchanges
    • H04M3/002 Applications of echo suppressors or cancellers in telephonic connections
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M9/00 Arrangements for interconnection not involving centralised switching
    • H04M9/08 Two-way loud-speaking telephone systems with means for conditioning the signal, e.g. for suppressing echoes for one or both directions of traffic
    • H04M9/082 Two-way loud-speaking telephone systems with means for conditioning the signal, e.g. for suppressing echoes for one or both directions of traffic using echo cancellers
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W4/00 Services specially adapted for wireless communication networks; Facilities therefor
    • H04W4/16 Communication-related supplementary services, e.g. call-transfer or call-hold
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M2207/00 Type of exchange or network, i.e. telephonic medium, in which the telephonic communication takes place
    • H04M2207/30 Type of exchange or network, i.e. telephonic medium, in which the telephonic communication takes place third party service providers
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M3/00 Automatic or semi-automatic exchanges
    • H04M3/42 Systems providing special services or facilities to subscribers
    • H04M3/42391 Systems providing special services or facilities to subscribers where the subscribers are hearing-impaired persons, e.g. telephone devices for the deaf

Definitions

  • the application relates generally to telecommunications and more particularly to communicating with a captioning communication service for assisting hearing-impaired users in communicating with others.
  • the disclosure relates to automatic volume control for the far-end signal received by the captioning communication service during a captioning communication session.
  • Hearing-impaired individuals may benefit from communication systems and devices configured to provide assistance in order to communicate with other individuals over a communication network.
  • captioning communication services have been established to provide assistive services (e.g., text captions) to the hearing-impaired user communicating with a communication device (e.g., caption phone, caption enabled device, etc.) that is specifically configured to communicate with the captioning communication service.
  • a captioning communication service may be a telecommunication assistive service, which is intended to permit a hearing-impaired person to utilize a communication network and assist their understanding of a conversation by providing text captions to supplement the voice conversation.
  • the captioning communication service may include an operator, referred to as a “call assistant,” who serves as a human intermediary between the hearing-impaired user and a far-end user.
  • the call assistant may listen to the audio signal of a far-end user and “revoice” the words of the far-end user to a speech recognition computer program tuned to the voice of the call assistant.
  • Text captions may be generated by the speech recognition computer as a transcription of the audio signal of the far-end user, and then transmitted to the communication device being used by the hearing-impaired user.
  • the communication device may then display the text captions while the hearing-impaired user carries on a normal conversation with the far-end user.
  • the text captions may allow the hearing-impaired user to supplement the voice received from the far-end and confirm his or her understanding of the words spoken by the far-end user.
  • hybrid echo (also referred to as “electric echo”) describes a phenomenon in which a fraction of the signal leaving the phone is reflected by a hybrid circuit and returns to the near-end communication device. This is particularly prevalent in voice-band communication circuits where impedance imbalances exist in the local two-wire to four-wire hybrid circuits. The effect of hybrid echo is that near-end users hear their own utterances repeated back to them. Echo cancellation systems are conventionally employed within communication devices to cancel hybrid echo and/or acoustic echo.
  • Embodiments of the disclosure include a communication device specifically configured for use by a hearing-impaired user.
  • the communication device comprises a microphone configured to generate a near-end voice signal, communication elements configured to receive a received far-end voice signal through a network from a far-end communication device, and a processor operably coupled with the microphone and the communication elements.
  • the processor is configured to automatically control a volume level of an audio stream signal reproduced by a third party captioning communication service responsive to determining which of the near-end voice signal and the received far-end voice signal is active.
  • Embodiments of the disclosure include a method of operating a captioning communication service for hearing-impaired users.
  • the method comprises determining an active talker situation responsive to comparing a near-end voice signal from a near-end communication device and a received far-end voice signal from a far-end communication device, and automatically adjusting a volume level of an audio stream reproduced by a third party captioning communication service based on the determined active talker situation.
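  • The comparison step of the method can be illustrated with a minimal sketch. It assumes that each signal is available as a frame of floating-point samples and that a simple root-mean-square level test against a fixed threshold is an acceptable stand-in for voice activity detection. The names `Talker`, `rms`, `classify_talker`, and the threshold value are hypothetical and do not come from the disclosure. The sketch also ignores echo, which in practice can make a near-end-only frame look like doubletalk.

```python
from enum import Enum
from math import sqrt


class Talker(Enum):
    """Hypothetical labels for the active talker situation."""
    SILENCE = 0
    NEAR_END_ONLY = 1
    FAR_END_ONLY = 2
    DOUBLE_TALK = 3


def rms(frame):
    """Root-mean-square level of one frame of samples (0.0 for an empty frame)."""
    if not frame:
        return 0.0
    return sqrt(sum(x * x for x in frame) / len(frame))


def classify_talker(near_frame, far_frame, threshold=0.05):
    """Compare the near-end and received far-end frame levels.

    The threshold is an assumed activity level, not a value from the source.
    """
    near_active = rms(near_frame) > threshold
    far_active = rms(far_frame) > threshold
    if near_active and far_active:
        return Talker.DOUBLE_TALK
    if near_active:
        return Talker.NEAR_END_ONLY
    if far_active:
        return Talker.FAR_END_ONLY
    return Talker.SILENCE
```

The returned label is the kind of information a volume adjustment step could act on; how the levels are actually chosen is left to the embodiments described later.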
  • Additional embodiments include a captioning communication system, comprising a near-end communication device and a captioning communication service.
  • the near-end communication device includes a microphone configured to capture a near-end voice signal during a communication session with a far-end communication device, communication elements configured to receive a far-end voice signal from the far-end communication device during the communication session, a speaker configured to reproduce the far-end voice signal, an electronic display configured to display text captions during the communication session, and a processor operably coupled with the microphone, the communication elements, the speaker, and the electronic display.
  • the captioning communication service is configured to generate a text transcription of the far-end voice signal during the communication session and transmit the text transcription in real time to the near-end communication device for the text captions to be displayed.
  • At least one of the near-end communication device and the captioning communication system is configured to operate a volume control system configured to automatically adjust a volume of an audio stream reproduced by a speaker of the captioning communication device responsive to a volume control command identifying which of the far-end voice signal and the near-end voice signal is active at a given time, and an echo modifier configured to add distortion to an echo portion of the far-end voice signal when generating the audio stream.
  • FIG. 1 illustrates a communication system configured to facilitate a call between a hearing-impaired user and a far-end user.
  • FIG. 2 is a simplified schematic block diagram of a communication device associated with a hearing-impaired user according to an embodiment of the disclosure.
  • FIG. 3 is a captioning communication system including an automatic volume control system according to an embodiment of the disclosure.
  • FIG. 4 is a captioning communication system including an automatic volume control system according to another embodiment of the disclosure.
  • FIG. 5 is a captioning communication system including an automatic volume control system according to another embodiment of the disclosure.
  • FIG. 6 is a captioning communication system including an automatic volume control system according to another embodiment of the disclosure.
  • FIG. 7 is a captioning communication system including an automatic volume control system according to another embodiment of the disclosure.
  • FIG. 8 is a captioning communication system including an automatic volume control system according to another embodiment of the disclosure.
  • FIG. 9 is a flowchart illustrating a method for operating a captioning communication system for a hearing-impaired user according to an embodiment of the disclosure.
  • FIG. 10 is a flowchart illustrating a method for determining an active talker situation for a captioning communication system for a hearing-impaired user according to an embodiment of the disclosure.
  • FIG. 11 is a flowchart illustrating a method for processing audio for a captioning communication service of a captioning communication system for a hearing-impaired user according to an embodiment of the disclosure.
  • Information and signals described herein may be represented using any of a variety of different technologies and techniques.
  • data, instructions, commands, information, signals, bits, symbols, and chips that may be referenced throughout the description may be represented by voltages, currents, electromagnetic waves, magnetic fields or particles, optical fields or particles, or any combination thereof.
  • Some drawings may illustrate signals as a single signal for clarity of presentation and description. It should be understood by a person of ordinary skill in the art that the signal may represent a bus of signals, wherein the bus may have a variety of bit widths and the disclosure may be implemented on any number of data signals including a single data signal.
  • DSP Digital Signal Processor
  • ASIC Application Specific Integrated Circuit
  • FPGA Field Programmable Gate Array
  • a processor herein may be any processor, controller, microcontroller, or state machine suitable for carrying out processes of the disclosure.
  • a processor may also be implemented as a combination of computing devices, such as a combination of a DSP and a microprocessor, a plurality of microprocessors, one or more microprocessors in conjunction with a DSP core, or any other such configuration.
  • a special-purpose computer improves the function of a computer because, absent the disclosure, the computer would not be able to carry out the processes of the disclosure.
  • the disclosure also provides meaningful limitations in one or more particular technical environments that go beyond an abstract idea.
  • embodiments of the disclosure provide improvements in the technical field of telecommunications, particularly in a telecommunication system including a captioning communication service for providing text captions to a caption-enabled communication device to assist hearing-impaired users.
  • Embodiments include features that improve the functionality of the communication device such that a new communication device and a new method for establishing captioning communication sessions are described.
  • the interaction of the communication device with other systems e.g., the captioning communication service
  • a process may be described in terms of a process that is depicted as a flowchart, a flow diagram, a structure diagram, or a block diagram. Although a flowchart may describe operational acts as a sequential process, many of these acts can be performed in another sequence, in parallel, or substantially concurrently. In addition, the order of the acts may be re-arranged.
  • a process may correspond to a method, a function, a procedure, a subroutine, a subprogram, interfacing with an operating system, etc.
  • the methods disclosed herein may be implemented in hardware, software, or both. If implemented in software, the functions may be stored or transmitted as one or more instructions (e.g., software code) on a computer-readable medium.
  • Computer-readable media includes both computer storage media and communication media including any medium that facilitates transfer of a computer program from one place to another.
  • any reference to an element herein using a designation such as “first,” “second,” and so forth does not limit the quantity or order of those elements, unless such limitation is explicitly stated. Rather, these designations may be used herein as a convenient method of distinguishing between two or more elements or instances of an element. Thus, a reference to first and second elements does not mean that only two elements may be employed there or that the first element must precede the second element in some manner. Also, unless stated otherwise a set of elements may comprise one or more elements.
  • a “hearing-impaired user” may refer to a person with diminished hearing capabilities.
  • Hearing-impaired users of caption-enabled communication devices often have some level of hearing ability that has usually diminished over a period of time, such that they can communicate by speaking but often struggle to hear and/or understand the far-end user.
  • call refers to the communication session between the hearing-impaired user's communication device and the far-end user's communication device.
  • the call may pass audio signals between the two parties.
  • the term call is used in order to be more easily distinguishable from the captioning communication session.
  • the call may be referred to as incoming or outgoing from the perspective of the hearing-impaired user's communication device.
  • Incoming and outgoing calls may refer to the period of time prior to when the call is “answered” by the other party to begin the communication of the audio signals there between.
  • they are often referred to from the perspective of the communication device associated with the hearing-impaired user.
  • an “incoming call” may originate from a far-end user to a near-end communication device and an “outgoing call” may originate from a near-end user to a far-end communication device.
  • near-end and far-end are relative terms depending on the perspective of the particular user.
  • the terms “near-end” and “far-end” are used as a convenient way to distinguish between users and devices.
  • captioning communication session refers to the communication session between the hearing-impaired user's communication device and the captioning communication service.
  • the captioning communication session may pass text captions from the captioning communication service to the hearing-impaired user's communication device.
  • the captioning communication session may also include the hearing-impaired user's communication device transmitting the far-end user's audio signal to the captioning communication service to generate the text captions.
  • audio signal refers to the signal generated and transmitted by a communication device during a call. Most examples are provided from the perspective of a hearing-impaired user using a captioning communication device, such that the audio signal captured by that device is sometimes referred to as the “near-end audio signal,” and the audio signal received to be reproduced by the speaker is sometimes referred to as the “far-end audio signal.”
  • the terms “near-end” and “far-end” may also be referred to as “local” and “remote,” respectively.
  • FIG. 1 illustrates a communication system 100 configured to facilitate an assisted call between a hearing-impaired user 102 and a far-end user 104 .
  • the communication system 100 may include a first communication device 110 , a second communication device 120 , and a third party communication service 130 , which may be a captioning communication service or a relay service (as illustrated in FIG. 1 ).
  • the first communication device 110 and the second communication device 120 may be coupled together to facilitate communication there between via a first network 140 .
  • the first communication device 110 and the third party communication service 130 may be coupled together to facilitate communication there between via a second network 150 .
  • the first network 140 and the second network 150 may each be implemented according to the standards and bandwidth requirements of a communication network (e.g., Public Switched Telephone Network (PSTN), cellular network, Voice over Internet Protocol (VoIP) networks, etc.).
  • the use of the terms “network” or “communication network” as used herein contemplates networks that are compatible and configured to provide communications using analog and/or digital standards unless specifically stated otherwise.
  • the first network 140 and the second network 150 may be the same network (e.g., both connections may be Internet-based connections). Thus, discussion of the first network 140 and the second network 150 separately may be for convenience of discussing a particular connection between two or more devices.
  • the first network 140 and the second network 150 may be different networks.
  • the first communication device 110 and the second communication device 120 may communicate via a PSTN network connection, while the first communication device 110 and the third party communication service 130 may communicate via an internet connection.
  • Other variations and combinations of networks are also contemplated.
  • the first communication device 110 may include a device that is configured to assist the hearing-impaired user 102 in communicating with another individual (e.g., far-end user 104 ).
  • the first communication device 110 may include a caption-enabled communication device configured to receive and display text captions of at least a portion of the conversation.
  • the hearing-impaired user 102 may be able to read the text captions of the words spoken by the far-end user 104 to supplement the audio signal received by the first communication device 110 .
  • the hearing-impaired user 102 may have an improved experience in understanding the conversation.
  • the first communication device 110 may also be configured to receive and display video on an electronic display on the first communication device 110 .
  • the second communication device 120 may comprise a conventional voice telephone (e.g., landline phone, cellular phone, smart phone, VoIP phone, etc.). As such, the far-end user 104 may interact in a conventional manner with the second communication device 120 .
  • the second communication device 120 may be configured similarly to the first communication device (e.g., caption-enabled communication device). As a result, the second communication device 120 may likewise be operated by a hearing-impaired user.
  • Although FIG. 1 depicts communication between the hearing-impaired user 102 and the far-end user 104 in a way that implies the far-end user 104 is a hearing-capable user, such a situation is shown only as an example.
  • Other embodiments include both the first communication device 110 and the second communication device 120 coupled to the third party communication service 130 to facilitate the captioning services for each respective hearing-impaired user.
  • each of the first and second communication devices 110 , 120 may have its own communication session with the third party communication service 130 .
  • the third party communication service 130 may be configured to provide interpretive services (e.g., captioning) to the hearing-impaired user 102 . More specifically, a human “call assistant” within third party communication service 130 may be employed to facilitate an assisted call between a hearing-impaired user 102 and a far-end user 104 . As discussed above, in some embodiments the third party communication service 130 may be configured to provide text captions of at least a portion of the conversation. In such an embodiment, the call assistant may listen to the voice signal received and re-voice the portion of the conversation into a microphone so that voice recognition software may generate the text captions that are transmitted to the first communication device 110 . Thus, the third party communication service 130 may include one or more of an internet protocol captioned telephone service (IPCTS), captioned telephone service (CTS), or other telecommunications relay services (TRS).
  • FIG. 1 shows a configuration where the first communication device 110 acts as a router for the voice signal from the second communication device 120 to the third party communication service 130 .
  • the voice signal of the far-end user 104 may be transmitted from the second communication device 120 to the first communication device 110 .
  • the voice signal of the far-end user 104 may then be transmitted from the first communication device 110 to the third party communication service 130 for the text captions to be generated in a text captioning embodiment.
  • the text captions may then be transmitted from the third party communication service 130 to the first communication device 110 to be displayed as text captions for the hearing-impaired user to read during the conversation.
  • the call assistant may also monitor the text captions that are generated and transmitted to the first communication device 110 to identify any errors that may have been generated by the voice recognition software.
  • the call assistant may correct such errors, such as described in U.S. Pat. No. 8,379,801, issued Feb. 19, 2013, entitled “Methods and Systems Related to Text Caption Error Correction,” the disclosure of which is incorporated herein in its entirety by this reference.
  • the third party communication service 130 may be configured to receive the far-end voice signal from the second communication device 120 and route the far-end voice signal to the first communication device 110 .
  • Although FIG. 1 shows only two communication devices 110, 120, the communication system 100 may include more communication devices. It is contemplated that the communication system 100 may facilitate communication between any number and combinations of hearing-impaired users and far-end users. For example, in some embodiments two or more communication devices may be connected for facilitating communication between a hearing-impaired user and other hearing-impaired users and/or far-end users.
  • Embodiments of the disclosure include devices and methods for remote attenuation of the audio stream received by the captioning communication system. For example, talker direction detection may be performed on a local system, and a command may then be sent with the encoded audio stream (e.g., Speex) over a communication channel to a remote captioning communication system, allowing the captioning communication service to determine the best method of processing the audio stream, whether by suppressing echo or by otherwise modifying the signal.
  • embodiments may combine an echo volume control with an echo modifier to reduce the effect of echo present in received audio when doubletalk is present.
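  • The remote attenuation idea can be sketched from the receiving side. The sketch assumes the captioning communication service has already decoded the audio into floating-point samples and has received a command naming the active talker alongside it. The function name `apply_volume_command`, the command strings, and the gain values are all hypothetical illustrations, not values given in the disclosure.

```python
# Assumed gain per command; the actual levels are a design choice.
DEFAULT_GAINS = {
    "far_end": 1.0,      # only the far-end user is speaking: full volume
    "near_end": 0.1,     # only the near-end user is speaking: mostly echo
    "double_talk": 0.5,  # both are speaking: partial attenuation
}


def apply_volume_command(samples, command, gains=None):
    """Scale decoded audio samples according to a volume control command.

    An unrecognized command leaves the audio unchanged (gain of 1.0).
    """
    table = DEFAULT_GAINS if gains is None else gains
    gain = table.get(command, 1.0)
    return [sample * gain for sample in samples]
```

Keeping the detection on the local device and the gain decision at the service means the service is free to swap in a different table or a different processing method without changing the device.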
  • FIG. 2 is a simplified schematic block diagram of a communication device 200 associated with a hearing-impaired user according to an embodiment of the disclosure.
  • the communication device 200 may be the first communication device 110 of FIG. 1 .
  • the communication device 200 may be configured to establish calls with other communication devices and captioning communication sessions with a captioning communication service configured to assist the hearing-impaired user.
  • the communication device 200 may be a caption enabled communication device, which may be implemented as a standalone device (e.g., a caption phone) or on another device (e.g., tablet computer, laptop computer, smart phone, etc.).
  • the communication device 200 may include a processor 210 operably coupled with an electronic display 220 , communication elements 230 , a memory device 240 , input devices 250 , and a speaker 260 .
  • the communication device 200 may include a camera for also participating in a video communication session.
  • the processor 210 may coordinate the communication between the various devices as well as execute instructions stored in computer-readable media of the memory device 240 .
  • the processor 210 may be configured to execute a wide variety of operating systems and applications including the computing instructions.
  • the memory device 240 may be used to hold computing instructions, data, and other information for performing a wide variety of tasks including performing embodiments disclosed herein.
  • the memory device 240 may include Static Random Access Memory (SRAM), Dynamic RAM (DRAM), Read-Only Memory (ROM), Flash memory, and the like.
  • the memory device 240 may include volatile and non-volatile memory storage for the communication device 200 .
  • the communication elements 230 may be configured to communicate with other devices or communication networks, including other communication devices and the captioning communication service.
  • the communication elements 230 may include elements for communicating on wired and wireless communication media, such as, for example, serial ports, parallel ports, Ethernet connections, universal serial bus (USB) connections, IEEE 1394 (“firewire”) connections, Bluetooth wireless connections, 802.11 a/b/g/n type wireless connections, and other suitable communication interfaces and protocols.
  • the input devices 250 may include a numeric keypad, a keyboard, a touchscreen, a remote control, a mouse, buttons, other input devices, or combinations thereof.
  • FIG. 3 is a captioning communication system 300 including an automatic volume control system according to an embodiment of the disclosure.
  • the captioning communication system 300 includes a first communication device 110 (e.g., local caption communication device) specifically configured for use by a hearing-impaired user (i.e., a local user) to communicate with the second communication device 120 associated with a far-end user (i.e., remote user) over a first network 140 (e.g., PSTN network).
  • the captioning communication system 300 may further include a third party communication service 130 that is configured to communicate with the first communication device 110 to provide text captions during a communication session to assist the hearing-impaired user.
  • the first communication device 110 may be configured to receive the far-end voice signal, which may also be routed to the third party communication service 130 which generates the text transcription of the far-end voice signal that is provided to the first communication device 110 to display to the hearing-impaired user during the communication session.
  • the local outgoing signal is referred to as the near-end voice signal s[n].
  • the remote incoming signal is referred to as the far-end voice signal r[n].
  • the echo from the near-end voice signal s[n] that is caused by the first network 140 is referred to as the echo signal e[n].
  • the received far-end voice signal g[n] is received by the echo modifier 320 , which adds distortion (e.g., resulting in modified echo estimate signal e′[n]) to generate the modified received far-end voice signal g′[n] (also referred to as the “audio stream”).
  • the modified echo estimate signal e′[n] is generated by an echo modifier 320 that will be discussed further below.
  • the packetized output signal a[n] may include the packetized form (via encoder 311 ) of the modified received far-end voice signal g′[n] as well as a volume control command (d). These signals will be discussed further below.
  • the captioning communication system 300 further includes an echo volume control 310 that is configured to automatically control the volume of the audio signal (e.g., modified received far-end voice signal g′[n]) received and reproduced by the third party communication service 130 during the communication session.
  • the echo volume control 310 may set the volume of the audio signal at a first level responsive to a determination that only the far-end user is speaking.
  • the echo volume control 310 is configured to set the volume of the audio signal received by the third party communication service 130 at a second level responsive to a determination that only the near-end user is speaking.
  • the first level is higher (i.e., louder) than the second level.
  • the volume level of the audio signal provided to the call assistant may be attenuated in comparison to the volume level of the audio signal provided to the call assistant when only the far-end user is speaking.
  • the second level may be completely attenuated (e.g., suppressed) such that no sound is produced for the call assistant.
  • the echo volume control 310 may include an active signal detector 312 that is configured to perform the determination of which talker is active at a given time.
  • the active signal detector 312 may receive the near-end voice signal s[n] and the received far-end voice signal g[n] to determine which of the two signals s[n], g[n] is active, which indicates whether the near-end user and/or the far-end user are active (i.e., talking) at a given time.
  • the active signal detector 312 determines whether the near-end voice signal s[n] or the far-end voice signal r[n] is active. Thus, the active signal detector 312 determines if the near-end user is active (i.e., talking), if the far-end user is active (i.e., talking), or if both the near-end user and the far-end user are active (i.e., a double-talk situation). Thus, it could be said that the active signal detector 312 determines the “direction” of which party is currently talking. For example, the active signal detector 312 may compare (e.g., cross correlate) the near-end voice signal s[n] with the received far-end voice signal g[n].
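  • The comparison described above might be sketched as follows. This is only an illustrative heuristic, not the detector of the disclosure: the frame-energy test, the thresholds (`energy_floor`, `corr_threshold`), the lag search range (`max_lag`), and every function name are assumptions introduced here. The idea is that when only the near-end user talks, the received signal g[n] is mostly a delayed copy (echo) of s[n], so the two correlate strongly; far-end speech lowers that correlation.

```python
import math
from enum import Enum


class Talker(Enum):
    NONE = "none"
    NEAR_END_ONLY = "near-end only"
    FAR_END_ONLY = "far-end only"
    DOUBLE_TALK = "double talk"


def energy(x):
    """Mean squared amplitude of a frame (0.0 for an empty frame)."""
    return sum(v * v for v in x) / len(x) if x else 0.0


def max_norm_xcorr(s, g, max_lag):
    """Peak |normalized cross-correlation| of g against s delayed by 0..max_lag samples."""
    best = 0.0
    for lag in range(max_lag + 1):
        a = s[:len(s) - lag]
        b = g[lag:]
        num = sum(x * y for x, y in zip(a, b))
        den = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
        if den > 0.0:
            best = max(best, abs(num) / den)
    return best


def classify_frame(s, g, energy_floor=1e-4, corr_threshold=0.8, max_lag=32):
    """Classify one frame of near-end signal s and received signal g (assumed heuristic)."""
    near_active = energy(s) > energy_floor
    recv_active = energy(g) > energy_floor
    if not near_active:
        return Talker.FAR_END_ONLY if recv_active else Talker.NONE
    # Near end is talking: g is either mostly echo of s, or echo plus far-end speech.
    if not recv_active or max_norm_xcorr(s, g, max_lag) >= corr_threshold:
        return Talker.NEAR_END_ONLY
    return Talker.DOUBLE_TALK
```

  • A practical detector would likely smooth decisions across frames and tune the thresholds to the echo path; those details are outside what the description specifies.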
  • the active signal detector 312 may be further configured to generate a volume control command (d) that indicates which user is active responsive to the determination discussed above.
  • the volume control command (d) may have different states for various situations.
  • the active signal detector 312 may be configured to generate the volume control command to have a first state corresponding to the “near-end only” situation, and a second state corresponding to the “far-end only” situation.
  • the active signal detector 312 may include a third state corresponding to the “double talk” situation, whereas other embodiments may simply generate the volume control command (d) corresponding to the double talk situation to be the same state as the “far-end only” situation.
  • the first communication device 110 may be configured to send the volume control command (d) along with the speaker out signal g′[n] to the encoder 311 , which encodes the two signals into the encoded signal packet a[n] that is transmitted to the third party communication service 130 through the communication channel 314 for use by the audio processing logic 316 of the third party communication service 130 when generating the text transcription of the far-end voice signal r[n].
  • the volume control command (d) may be a flag bit or other instruction that is interpreted by the audio processing logic 316 to indicate which talker situation should be applied to the particular audio packet received.
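  • As a rough illustration of packaging the command (d) with the audio, the sketch below packs a one-byte flag, a sample count, and 16-bit PCM samples into one packet. The byte layout, the function names, and the use of 16-bit PCM are all assumptions made for this example; the description does not specify a wire format for a[n].

```python
import struct


def encode_packet(d, samples):
    """Pack command d (0..255) and int16 samples into a hypothetical a[n] packet."""
    # Layout (assumed): little-endian, 1-byte command, 2-byte sample count, then samples.
    return struct.pack(f"<BH{len(samples)}h", d, len(samples), *samples)


def decode_packet(packet):
    """Recover (d, samples) from a packet produced by encode_packet."""
    d, count = struct.unpack_from("<BH", packet, 0)
    samples = list(struct.unpack_from(f"<{count}h", packet, 3))
    return d, samples
```

  • Carrying the flag in every packet lets the receiving audio processing logic apply the talker situation to exactly the audio it describes, without a separate signaling channel.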
  • the volume control command (d) generated by the active signal detector 312 may be a binary value (e.g., 0 or 1), in which the logic of the audio processing logic 316 may interpret a first value (e.g., 0) to be a first volume level for the audio packet (e.g., no attenuation) and a second value (e.g., 1) to correspond to a second volume level for the audio packet (e.g., full attenuation) provided to the speaker 332 .
  • the binary values may be reversed in the way they are interpreted by the logic of the audio processing logic 316 .
  • the volume control command (d) may be in the form of a numerical value or other instruction that corresponds to a volume level or amount of attenuation of the audio packet to be passed onto the speaker 332 .
  • the volume control command (d) generated by the active signal detector 312 may be an attenuation value (e.g., integer) within a volume range (e.g., 0 to 5) supported by the audio processing logic 316 , in which the logic of the audio processing logic 316 may interpret a first value (e.g., 0) to correspond to attenuation for a first volume level (e.g., no attenuation) for the audio packet and a second attenuation value (e.g., 5) to correspond to attenuation for a second volume level (e.g., full attenuation) for the audio packet provided to the speaker 332 .
  • the intermediate values may be assigned to a scale of intermediate attenuation levels, if desired.
  • different schemes are contemplated for the volume control command (d) depending on how the logic for the audio processing logic 316 is configured to provide the audio packets to the speaker 332 at different levels for the third party communication assistant to hear (or not hear) the far-end voice signal depending on the situation determined by the active signal detector 312 .
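  • One way the audio processing logic might turn such a command into a gain is sketched below. The linear mapping between the end points is an assumption chosen for simplicity (the description only fixes the two end points and allows a scale of intermediate levels), and the function names are hypothetical.

```python
def gain_from_command(d, max_level=5):
    """Map attenuation command d (0 = none, max_level = full) to a linear gain in [0, 1]."""
    if not 0 <= d <= max_level:
        raise ValueError(f"command {d} outside supported range 0..{max_level}")
    return 1.0 - d / max_level


def apply_volume(packet, d, max_level=5):
    """Scale every sample of an audio packet by the gain selected by command d."""
    gain = gain_from_command(d, max_level)
    return [gain * x for x in packet]
```

  • Calling `gain_from_command(d, max_level=1)` covers the binary-flag case, where 0 passes the audio and 1 fully attenuates it; the reversed interpretation mentioned above would simply swap the two end points.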
  • the echo volume control provides the audio packets to the speaker 332 of the third party communication service 130 at a louder volume during the far-end talker only situation in comparison to the near-end talker only situation.
  • the double talk situation may be handled the same way as the far-end talker only situation in terms of the volume of the audio packets provided to the speaker 332 .
  • the features and functionality (e.g., active signal detector 312 ) of the echo volume control 310 may be included within the first communication device 110 .
  • at least some of the features and functionality (e.g., audio processing logic 316 ) of the echo volume control 310 may be included within the third party communication service 130 .
  • the active signal detector 312 may be configured to determine whether the local user or remote user is speaking, and send a volume control command (d) to the audio processing logic 316 .
  • the audio processing logic 316 may be configured to reduce the volume of the audio packets to the speaker 332 responsive to the information provided by the volume control command (d).
  • the captioning communication system 300 further includes an echo modifier 320 .
  • the echo modifier 320 may be configured to add distortion to the echo signal such that the audio packets received by the audio processing logic 316 may have an echo signal that is distorted from its original state such that the call assistant may better audibly distinguish between the far-end voice signal portion and the modified echo portion.
  • the echo modifier 320 may include an echo estimator 322 and an echo distortion logic 324 .
  • the echo estimator 322 may be configured to generate an estimate of the echo e[n].
  • the echo estimator 322 may include an adaptive filter that is configured to generate an estimated echo signal as its output.
  • the adaptive filter may receive the near-end voice signal s[n], and be configured to train its coefficients based on the error signal generated from the difference between the received far-end voice signal g[n] and the output from the echo estimator 322 .
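  • A minimal sketch of such an adaptive filter is given below, assuming a normalized least-mean-squares (NLMS) update. NLMS is a common choice for echo estimation but is not named in the description; the tap count, step size `mu`, and function name are likewise assumptions.

```python
def nlms_echo_estimate(s, g, taps=8, mu=0.5, eps=1e-8):
    """Estimate the echo of near-end signal s contained in received signal g.

    Returns (estimated echo samples, final filter coefficients).
    """
    w = [0.0] * taps          # filter coefficients being trained
    history = [0.0] * taps    # most recent near-end samples, newest first
    estimate = []
    for s_n, g_n in zip(s, g):
        history = [s_n] + history[:-1]
        e_hat = sum(wi * xi for wi, xi in zip(w, history))   # echo estimate
        err = g_n - e_hat                                    # error signal used for training
        norm = sum(x * x for x in history) + eps
        w = [wi + mu * err * xi / norm for wi, xi in zip(w, history)]
        estimate.append(e_hat)
    return estimate, w
```

  • In this sketch any far-end speech in g[n] acts as noise on the error signal; practical systems often slow or freeze adaptation during double talk, which the sketch does not attempt.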
  • the output from the echo estimator 322 is approximately the echo e[n]; however, rather than subtracting out the echo as with conventional echo cancellation systems, the echo distortion logic 324 receives the estimated echo signal and adds distortion to generate the modified echo estimate signal e′[n].
  • the modified echo signal e′[n] is summed (e.g., subtracted) with the received far-end voice signal g[n] to generate the modified received far-end voice signal g′[n]. Because the modified echo estimate signal e′[n] and the echo e[n] portion of the received far-end voice signal g[n] may be highly correlated, when the modified echo e′[n] is subtracted from the echo e[n] the remaining signal is substantially the difference caused by the modification that was performed on the estimate, plus a certain amount of error produced by inaccuracy in the echo estimator 322 .
  • Because the far-end voice signal r[n] portion of the received far-end voice signal g[n] and the modified echo estimate e′[n] may not be well correlated, subtracting the modified echo estimate signal e′[n] may have little effect on that portion.
  • the resulting modified received far-end voice signal g′[n] includes the far-end voice signal r[n] and a distorted version of the echo (e.g., e′[n]-e[n]).
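  • The signal relationships described above can be written out as follows, assuming the received signal is the simple sum of far-end voice and echo. The operator symbol D for the echo distortion logic 324 and the hat notation for the estimate are introduced here for illustration only.

```latex
\begin{aligned}
g[n]       &= r[n] + e[n]            && \text{(assumed additive model)}\\
\hat{e}[n] &\approx e[n]             && \text{(output of echo estimator 322)}\\
e'[n]      &= D\{\hat{e}[n]\}        && \text{(echo distortion logic 324)}\\
g'[n]      &= g[n] - e'[n] = r[n] + \bigl(e[n] - e'[n]\bigr)
\end{aligned}
```

  • Under this model the residual echo term is e[n] − e′[n]; the text writes it with the opposite sign, which does not change how audible the residual is.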
  • when the modified received far-end voice signal g′[n] is reproduced by the speaker 332 of the third party communication service 130 , the distorted version of the echo may be audibly distinguishable from the far-end voice signal r[n] by the third party communications assistant when they listen to the far-end voice signal to generate the text transcription for the text captions.
  • the third party call assistant may have an improved experience in revoicing the correct voice signal, which may improve the accuracy of the text captions.
  • Echo distortion may include any process that makes the echo portion audibly distinguishable from the far-end voice portion of the received far-end voice signal g[n].
  • Non-limiting examples of echo modification may include frequency shifting, signal modulation, partial or complete attenuation, adding white or colored noise, etc.
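  • Three of the listed modifications might look like the following when applied to an estimated echo signal. These are simplified sketches: amplitude modulation by a cosine carrier stands in for "signal modulation", and the parameter defaults (`factor`, `carrier_hz`, `sample_rate`, `level`, `seed`) are arbitrary assumptions rather than values from the description.

```python
import math
import random


def attenuate(echo, factor=0.25):
    """Partial attenuation: scale the echo estimate by a constant factor."""
    return [factor * x for x in echo]


def modulate(echo, carrier_hz=80.0, sample_rate=8000):
    """Signal modulation: multiply the echo estimate by a cosine carrier."""
    step = 2.0 * math.pi * carrier_hz / sample_rate
    return [x * math.cos(step * n) for n, x in enumerate(echo)]


def add_white_noise(echo, level=0.01, seed=0):
    """Add uniformly distributed white noise of peak amplitude `level`."""
    rng = random.Random(seed)
    return [x + rng.uniform(-level, level) for x in echo]
```

  • Each function returns a modified echo estimate e′[n] that would then be subtracted from the received signal g[n] as described above.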
  • the echo volume control 310 includes an active signal detector 312 that determines whether the local user or remote user is talking.
  • the active signal detector 312 may include a double talk detector.
  • the result of the active signal detector 312 may be generated in the form of the volume control command (d) that is packaged with the audio stream g′[n] to form a[n], which is received by the third party communication service 130 over a communication channel 314 (e.g., the Internet or other digital network, radio frequency communications network, optical communications network, serial or parallel bus, etc.).
  • the third party communication service 130 processes the audio stream g′[n] based, at least in part, on the results of the direction detector (e.g., according to the volume control command (d)) as discussed above. If the signal is from the local user the audio can be attenuated, or other processing can be performed as needed (e.g., filtering, amplification, etc.). If the signal is from the remote user the audio is passed unmodified, or other processing can be performed as needed (e.g., filtering, amplification, attenuation, etc.). After processing, the resulting signal is reproduced for the call assistant to hear and perform their duties of generating the text transcription of the far-end voice.
  • the echo modifier 320 alters the echo portion e[n] of the received far-end voice signal g[n], such that the communications assistant at the third party communication service 130 can audibly distinguish between the near-end voice and far-end voice signals.
  • FIG. 4 is a captioning communication system 400 including an automatic volume control system according to another embodiment of the disclosure.
  • the captioning communication system 400 includes similar elements as in FIG. 3 , but with additional third party communication services 130 A, 130 L.
  • Each third party communication service 130 A, 130 L may include audio processing logic 316 A, 316 L, and a speaker 332 A, 332 L for its call assistant.
  • the first communication device 110 may transmit the combined command (d) and the modified received far-end voice signal g′[n] (i.e., encoded signal packet a[n]) to any number of third party recipients over communication channels 314 A, 314 L.
  • the communication channels 314 A, 314 L may be the same or distinct communications channels for each third party communication service 130 A, 130 L.
  • Each third party communication service 130 A, 130 L may refer to different call assistants within the same location (e.g., call center) or different call assistants located within different locations, as desired.
  • Each audio processing logic 316 A, 316 L may process the encoded signal packet a[n] according to its specific needs.
  • the first audio processing logic 316 A associated with a first call assistant may be configured to process the encoded signal packet a[n] differently than the second audio processing logic 316 L associated with a second call assistant.
  • the near-end voice signal s[n] may also be transmitted to one or more of the third party communication services 130 A, 130 L through the communication channels 314 A, 314 L. In such an embodiment, it may be desirable for one call assistant to transcribe the near-end voice signal s[n], while another call assistant transcribes the far-end voice signal r[n] from the modified received far-end voice signal g′[n].
  • the first call assistant may transcribe the near-end voice signal s[n] (in which case the modified received far-end voice signal g′[n] may be attenuated by audio processing logic 316 A), and the second call assistant may transcribe the far-end voice signal r[n] from the modified received far-end voice signal g′[n] (in which case the near-end voice signal s[n] may be attenuated by the audio processing logic 316 L).
  • FIG. 5 is a captioning communication system 500 including an automatic volume control system according to another embodiment of the disclosure.
  • the captioning communication system 500 includes similar elements as in FIG. 3 , but with the audio processing logic 316 being performed locally by the first communication device 110 .
  • the volume control command (d) from the active signal detector 312 may be used locally to process the modified received far-end voice signal g′[n].
  • the processed audio stream g′[n] may be transmitted to the third party communication service 130 through the communication channel 314 having a volume with the first level or second level based on the volume control command (d).
  • the audio processing logic 316 may be configured to not send any modified received far-end voice signal g′[n] in the near-end only situation determined by the active signal detector 312 .
  • the audio processing logic 316 may send the modified received far-end voice signal g′[n] to the third party communication service 130 through the communication channel 314 in the far-end only situation and/or the double talk situation determined by the active signal detector 312 .
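  • The local variant described above could be sketched as a filter over outgoing frames. The command encoding (`NEAR_END_ONLY = 1`), the function name, and the choice between dropping and scaling near-end-only frames are assumptions for illustration.

```python
NEAR_END_ONLY = 1  # hypothetical value of command d for the near-end only situation


def process_locally(frames, commands, second_level_gain=None):
    """Return the frames to transmit to the third party communication service.

    Near-end-only frames are dropped when second_level_gain is None;
    otherwise they are scaled to the second (lower) volume level.
    """
    outgoing = []
    for frame, d in zip(frames, commands):
        if d != NEAR_END_ONLY:
            outgoing.append(list(frame))  # far-end only or double talk: send as-is
        elif second_level_gain is not None:
            outgoing.append([second_level_gain * x for x in frame])
    return outgoing
```

  • Dropping frames saves bandwidth on the communication channel, while scaling keeps a continuous stream for the receiver; the description permits either behavior.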
  • the third party communication service 130 may receive the encoded version of the modified received far-end voice signal g′[n] and decode it to be reproduced by the speaker 332 for the call assistant to generate the text transcription of the far-end voice signal r[n] portion of the modified received far-end voice signal g′[n] as discussed above.
  • FIG. 6 is a captioning communication system 600 including an automatic volume control system according to another embodiment of the disclosure.
  • the captioning communication system 600 includes similar elements as in FIG. 3 , but with the audio processing logic 316 A being performed locally by the first communication device 110 as well as with third party audio processing logic 316 B being performed remotely by the third party communication service 130 .
  • the active signal detector 312 may provide the volume control command (d) to the local audio processing logic 316 A, and the third party audio processing logic 316 B (via encoder 311 ) such that each may provide the appropriate audio processing of the received version of the modified received far-end voice signal g′[n] according to its specific requirements based on the volume control command (d).
  • the third party communication service 130 may also include a decoder (not shown in FIG. 6 ) that is configured to decode the signal received through the communication channel 314 for processing.
  • FIG. 7 is a captioning communication system 700 including an automatic volume control system according to another embodiment of the disclosure.
  • the captioning communication system 700 includes similar elements as in FIG. 5 with the audio processing logic 316 of the echo volume control 310 being performed locally in the first communication device 110 , but with the processed modified received far-end voice signal g′[n] (via the audio processing logic 316 ) only being forwarded to the third party communication service 130 .
  • additional audio processing 366 may be performed remotely by the third party communication service 130 prior to being sent to the speaker 332 .
  • FIG. 8 is a captioning communication system 800 including an automatic volume control system according to another embodiment of the disclosure.
  • the captioning communication system 800 includes similar elements as in FIG. 3 , but with the elements of the echo volume control 310 and the echo modifier 320 being performed by the third party communication service 130 .
  • the first communication device 110 may be configured to transmit the near-end voice signal s[n] and the received far-end voice signal g[n] to the third party communication service 130 through the communication channel 314 .
  • the third party communication service 130 may include the active signal detector 312 , the audio processing logic 316 , the echo estimator 322 , and the echo distortion logic 324 that are configured as discussed above.
  • the third party communication service 130 may perform the different active talker situations and related attenuation scenarios, as well as the echo estimation and echo modification. It is therefore contemplated that the features and methodology described herein may be performed locally by the first communication device 110 , by the third party communication service 130 , or any combination thereof.
  • the speaker 304 of the first communication device 110 may receive the received far-end voice signal g[n] or an echo canceled version thereof.
  • the received far-end voice signal g[n] may be processed through a conventional echo canceler locally even if the signal received by the second communication device 120 did not have an echo cancellation process performed thereon (see, e.g., echo canceller 305 in FIG. 3 ).
  • the speaker 304 of the first communication device 110 may receive substantially the far-end voice signal r[n] with the echo removed locally.
  • the speaker 304 of the first communication device 110 may receive modified received far-end voice signal g′[n] that has been processed responsive to the volume control command d.
  • FIG. 9 is a flowchart 900 illustrating a method for operating a captioning communication system for a hearing-impaired user according to an embodiment of the disclosure.
  • the active talker situation may be determined from the near-end voice signal and the received signal from the far-end communication device.
  • the active talker situation may be determined to be a far-end only situation, a near-end only situation, or a double talk situation by comparing (e.g., cross correlating) the received signal and the near-end voice signal.
  • the echo portion of the received signal from the far-end communication device may be estimated through an adaptive filter that receives the near-end voice signal, and trains the filter based on the error signal generated from the difference between the received signal and the output from the echo estimator.
  • the estimated echo is not subtracted from the received signal to generate an echo cancelled signal.
  • echo distortion is added to the received signal.
  • the echo distortion may include distorting the estimated echo signal and subtracting the result from the received signal.
  • the distortion may include frequency shifting, signal modulation, partial or complete attenuation, adding white or colored noise, or combinations thereof, to the estimated echo signal, which is then summed (e.g., subtracted) with the received signal to generate a modified received far-end voice signal that is used as the audio stream for the third party communication service.
  • the volume of the modified received far-end voice signal reproduced by the third party communication service may be automatically adjusted based on the determined active talker situation.
  • the volume for the far-end only situation may have a first level (e.g., high volume) and the volume for the near-end only situation may have a second level (e.g., low volume).
  • the second level for the near-end only situation may be complete attenuation of the modified received far-end voice signal such that the call assistant's speaker does not produce sound for generating a text transcription of the far-end voice signal portion of the modified audio signal.
  • Different operations of FIG. 9 may be performed by the near-end communication device, the third party communication service, or a combination thereof.
  • FIG. 10 is a flowchart 1000 illustrating a method for determining an active talker situation for a captioning communication system for a hearing-impaired user according to an embodiment of the disclosure.
  • the received signal may be received from the far-end communication device.
  • the near-end voice signal may be received from the microphone of the near-end communication device.
  • the received signal and the near-end voice signal may be compared (e.g., cross correlation) to determine which signal is active at a given time or if both signals are active.
  • the result of the comparison may determine which situation is occurring.
  • the situations may include the far-end only situation 1042 , the near-end only situation 1044 , and the double talk situation 1046 .
  • the active signal detector may generate a volume control command (d) that is used by the audio processing of the audio stream to determine the automatic volume control to the speaker of the third party communication service.
  • the volume control command (d) may include a binary flag, a numerical value, or other command that indicates to the audio processing the active talker situation, such that the audio processing can then take the appropriate actions (e.g., pass the audio, attenuate the audio, etc.).
  • Different operations of FIG. 10 may be performed by the near-end communication device, the third party communication service, or a combination thereof.
  • FIG. 11 is a flowchart 1100 illustrating a method for processing audio for a captioning communication service of a captioning communication system for a hearing-impaired user according to an embodiment of the disclosure.
  • the modified received far-end voice signal including the far-end voice signal and modified echo may be received.
  • the volume control command may be received.
  • the active talker situation may be determined from the volume control command received.
  • the situations may include the far-end only situation 1132 , the near-end only situation 1134 , and the double talk situation 1136 .
  • If the situation is the far-end only situation, the volume level for the produced audio for the call assistant may be set at a first level (e.g., higher) at operation 1140 . If the situation is the near-end only situation, the volume level for the produced audio for the call assistant may be set at a second level (e.g., lower) at operation 1150 . In some embodiments, if the situation is a double talk situation, the volume level for the produced audio for the call assistant may be set at the first level (i.e., the same as the far-end only situation). In some embodiments, if the situation is a double talk situation, the volume level for the produced audio for the call assistant may be set at a third level (i.e., different than the far-end only situation). Different operations of FIG. 11 may be performed by the near-end communication device, the third party communication service, or a combination thereof.
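  • The level selection of FIG. 11 reduces to a small lookup, sketched below. The enum names, the default gain values, and the optional `third` parameter are assumptions; the description fixes only that the first level is louder than the second and that double talk may share the first level or use a distinct third level.

```python
from enum import Enum


class Situation(Enum):
    FAR_END_ONLY = "far-end only"
    NEAR_END_ONLY = "near-end only"
    DOUBLE_TALK = "double talk"


def volume_level(situation, first=1.0, second=0.0, third=None):
    """Pick the playback level for the call assistant's speaker."""
    if situation is Situation.NEAR_END_ONLY:
        return second
    if situation is Situation.DOUBLE_TALK and third is not None:
        return third   # embodiment with a distinct double-talk level
    return first       # far-end only, or double talk treated like far-end only
```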
  • Embodiments of the disclosure may be used to reduce negative effects of the presence of echo when traditional methods (e.g., echo cancellation) cannot be used or may not be preferred.
  • the performance of standard echo suppression may be improved in the presence of doubletalk.
  • remote third party devices (e.g., call assistant devices for a captioning communication service) receiving the audio stream may determine how audio is to be processed before reproducing the audio to the third party end user (e.g., call assistant).
  • call assistants and other third party listeners may be provided with the ability to discern between local voice and remote voice signals as a result of the modified received far-end voice signal g′[n] being used, which includes a distorted version of the echo that may assist the call assistant to audibly distinguish between the far-end voice signal and the echo that results from the near-end voice signal. This may make it easier for the call assistant to transcribe the correct talker's words in comparison to conventional systems that do not perform echo cancellation on the audio stream sent to the call assistant, or for which echo cancellation does not adequately eliminate all echo.

Abstract

Apparatuses and methods are disclosed for automatic volume control of an audio stream reproduced by a captioning communication service for use by a call assistant in generating a text transcription of a communication session between a hearing-impaired user and a far-end user. The automatic volume control automatically adjusts a volume of the audio stream reproduced by the captioning communication service responsive to a volume control command identifying which of the far-end voice signal and the near-end voice signal is active at a given time. The system further includes an echo modifier configured to add distortion to an echo portion of the far-end voice signal when generating the audio stream.

Description

    CROSS-REFERENCE TO RELATED APPLICATIONS
  • This application claims the benefit of U.S. patent application Ser. No. 14/933,893 filed Nov. 5, 2015, which claims the benefit of U.S. Provisional Patent Application Ser. No. 62/219,654, filed Sep. 16, 2015, the disclosures of which are hereby incorporated herein in their entireties by this reference.
  • FIELD
  • The application relates generally to telecommunications and more particularly to communicating with a captioning communication service for assisting hearing-impaired users in communicating with others. In addition, the disclosure relates to automatic volume control for the far-end signal received by the captioning communication service during a captioning communication session.
  • BACKGROUND
  • Hearing-impaired individuals may benefit from communication systems and devices configured to provide assistance in order to communicate with other individuals over a communication network. For example, captioning communication services have been established to provide assistive services (e.g., text captions) to the hearing-impaired user communicating with a communication device (e.g., caption phone, caption enabled device, etc.) that is specifically configured to communicate with the captioning communication service.
  • In particular, a captioning communication service may be a telecommunication assistive service, which is intended to permit a hearing-impaired person to utilize a communication network and assist their understanding of a conversation by providing text captions to supplement the voice conversation. The captioning communication service may include an operator, referred to as a “call assistant,” who serves as a human intermediary between the hearing-impaired user and a far-end user. During a captioning communication session, the call assistant may listen to the audio signal of a far-end user and “revoice” the words of the far-end user to a speech recognition computer program tuned to the voice of the call assistant. Text captions (also referred to as “captions”) may be generated by the speech recognition computer as a transcription of the audio signal of the far-end user, and then transmitted to the communication device being used by the hearing-impaired user. The communication device may then display the text captions while the hearing-impaired user carries on a normal conversation with the far-end user. The text captions may allow the hearing-impaired user to supplement the voice received from the far-end and confirm his or her understanding of the words spoken by the far-end user.
  • During a communication session, the communication device may experience echo (e.g., hybrid echo, acoustic echo, etc.). The term “hybrid echo” (also referred to as “electric echo”) describes a phenomenon in which a fraction of the signal leaving the phone is reflected by a hybrid circuit and returns into the near-end communication device. This is particularly prevalent in voice-band communication circuits in which local two-wire to four-wire hybrid circuits with impedance imbalances are used. The effect of hybrid echo is that the near-end user hears their own utterances repeated back to them. Echo cancellation systems are conventionally employed within communication devices to cancel hybrid echo and/or acoustic echo.
  • The subject matter claimed herein is not limited to embodiments that solve any disadvantages or that operate only in environments such as those described above. Rather, this background is only provided to illustrate one example technology area where some embodiments described herein may be practiced.
  • BRIEF SUMMARY
  • Embodiments of the disclosure include a communication device specifically configured for use by a hearing-impaired user. The communication device comprises a microphone configured to generate a near-end voice signal, communication elements configured to receive a received far-end voice signal through a network from a far-end communication device, and a processor operably coupled with the microphone and the communication elements. The processor is configured to automatically control a volume level of an audio stream signal reproduced by a third party captioning communication service responsive to determining which of the near-end voice signal and the received far-end voice signal is active.
  • Embodiments of the disclosure include a method of operating a captioning communication service for hearing-impaired users. The method comprises determining an active talker situation responsive to comparing a near-end voice signal from a near-end communication device and a received far-end voice signal from a far-end communication device, and automatically adjusting a volume level of an audio stream reproduced by a third party captioning communication service based on the determined active talker situation.
  • Additional embodiments include a captioning communication system, comprising a near-end communication device and a captioning communication service. The near-end communication device includes a microphone configured to capture a near-end voice signal during a communication session with a far-end communication device, communication elements configured to receive a far-end voice signal from the far-end communication device during the communication session, a speaker configured to reproduce the far-end voice signal, an electronic display configured to display text captions during the communication session, and a processor operably coupled with the microphone, the communication elements, the speaker, and the electronic display. The captioning communication service is configured to generate a text transcription of the far-end voice signal during the communication session and transmit the text transcription in real time to the near-end communication device for the text captions to be displayed. At least one of the near-end communication device and the captioning communication service is configured to operate a volume control system configured to automatically adjust a volume of an audio stream reproduced by a speaker of the captioning communication service responsive to a volume control command identifying which of the far-end voice signal and the near-end voice signal is active at a given time, and an echo modifier configured to add distortion to an echo portion of the far-end voice signal when generating the audio stream.
  • BRIEF DESCRIPTION OF THE SEVERAL VIEWS OF THE DRAWINGS
  • FIG. 1 illustrates a communication system configured to facilitate a call between a hearing-impaired user and a far-end user.
  • FIG. 2 is a simplified schematic block diagram of a communication device associated with a hearing-impaired user according to an embodiment of the disclosure.
  • FIG. 3 is a captioning communication system including an automatic volume control system according to an embodiment of the disclosure.
  • FIG. 4 is a captioning communication system including an automatic volume control system according to another embodiment of the disclosure.
  • FIG. 5 is a captioning communication system including an automatic volume control system according to another embodiment of the disclosure.
  • FIG. 6 is a captioning communication system including an automatic volume control system according to another embodiment of the disclosure.
  • FIG. 7 is a captioning communication system including an automatic volume control system according to another embodiment of the disclosure.
  • FIG. 8 is a captioning communication system including an automatic volume control system according to another embodiment of the disclosure.
  • FIG. 9 is a flowchart illustrating a method for operating a captioning communication system for a hearing-impaired user according to an embodiment of the disclosure.
  • FIG. 10 is a flowchart illustrating a method for determining an active talker situation for a captioning communication system for a hearing-impaired user according to an embodiment of the disclosure.
  • FIG. 11 is a flowchart illustrating a method for processing audio for a captioning communication service of a captioning communication system for a hearing-impaired user according to an embodiment of the disclosure.
  • DETAILED DESCRIPTION
  • In the following detailed description, reference is made to the accompanying drawings which form a part hereof, and in which are illustrated specific embodiments in which the disclosure may be practiced. These embodiments are described in sufficient detail to enable those of ordinary skill in the art to practice the disclosure. It should be understood, however, that the detailed description and the specific examples, while indicating examples of embodiments of the disclosure, are given by way of illustration only and not by way of limitation. From this disclosure, various substitutions, modifications, additions, rearrangements, or combinations thereof within the scope of the disclosure may be made and will become apparent to those of ordinary skill in the art.
  • In accordance with common practice, the various features illustrated in the drawings may not be drawn to scale. The illustrations presented herein are not meant to be actual views of any particular apparatus (e.g., device, system, etc.) or method, but are merely idealized representations that are employed to describe various embodiments of the disclosure. Accordingly, the dimensions of the various features may be arbitrarily expanded or reduced for clarity. In addition, some of the drawings may be simplified for clarity. Thus, the drawings may not depict all of the components of a given apparatus (e.g., device) or all operations of a particular method. In addition, like reference numerals may be used to denote like features throughout the specification and figures.
  • Information and signals described herein may be represented using any of a variety of different technologies and techniques. For example, data, instructions, commands, information, signals, bits, symbols, and chips that may be referenced throughout the description may be represented by voltages, currents, electromagnetic waves, magnetic fields or particles, optical fields or particles, or any combination thereof. Some drawings may illustrate signals as a single signal for clarity of presentation and description. It should be understood by a person of ordinary skill in the art that the signal may represent a bus of signals, wherein the bus may have a variety of bit widths and the disclosure may be implemented on any number of data signals including a single data signal.
  • The various illustrative logical blocks, modules, circuits, and algorithm acts described in connection with embodiments disclosed herein may be implemented or performed with a general-purpose processor, a special-purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, discrete gate or transistor logic, discrete hardware components, or any combination thereof designed to perform the functions described herein.
  • A processor herein may be any processor, controller, microcontroller, or state machine suitable for carrying out processes of the disclosure. A processor may also be implemented as a combination of computing devices, such as a combination of a DSP and a microprocessor, a plurality of microprocessors, one or more microprocessors in conjunction with a DSP core, or any other such configuration. When configured according to embodiments of the disclosure, a special-purpose computer improves the function of a computer because, absent the disclosure, the computer would not be able to carry out the processes of the disclosure. The disclosure also provides meaningful limitations in one or more particular technical environments that go beyond an abstract idea. For example, embodiments of the disclosure provide improvements in the technical field of telecommunications, particularly in a telecommunication system including a captioning communication service for providing text captions to a caption-enabled communication device to assist hearing-impaired users. Embodiments include features that improve the functionality of the communication device such that a new communication device and a new method for establishing captioning communication sessions are described. As a result, the interaction of the communication device with other systems (e.g., the captioning communication service) may be improved in addition to an improved user experience.
  • In addition, it is noted that the embodiments may be described in terms of a process that is depicted as a flowchart, a flow diagram, a structure diagram, or a block diagram. Although a flowchart may describe operational acts as a sequential process, many of these acts can be performed in another sequence, in parallel, or substantially concurrently. In addition, the order of the acts may be re-arranged. A process may correspond to a method, a function, a procedure, a subroutine, a subprogram, interfacing with an operating system, etc. Furthermore, the methods disclosed herein may be implemented in hardware, software, or both. If implemented in software, the functions may be stored or transmitted as one or more instructions (e.g., software code) on a computer-readable medium. Computer-readable media includes both computer storage media and communication media including any medium that facilitates transfer of a computer program from one place to another.
  • It should be understood that any reference to an element herein using a designation such as “first,” “second,” and so forth does not limit the quantity or order of those elements, unless such limitation is explicitly stated. Rather, these designations may be used herein as a convenient method of distinguishing between two or more elements or instances of an element. Thus, a reference to first and second elements does not mean that only two elements may be employed there or that the first element must precede the second element in some manner. Also, unless stated otherwise a set of elements may comprise one or more elements.
  • As used herein, a “hearing-impaired user” may refer to a person with diminished hearing capabilities. Hearing-impaired users of caption-enabled communication devices often have some level of hearing ability that has usually diminished over a period of time, such that they can communicate by speaking but often struggle to hear and/or understand the far-end user.
  • The term “call” as used herein refers to the communication session between the hearing-impaired user's communication device and the far-end user's communication device. The call may pass audio signals between the two parties. The term call is used in order to be more easily distinguishable from the captioning communication session. At times, the call may be referred to as incoming or outgoing from the perspective of the hearing-impaired user's communication device. Incoming and outgoing calls may refer to the period of time prior to when the call is “answered” by the other party to begin the communication of the audio signals there between. Generally, when discussing calls herein, they are often referred to from the perspective of the communication device associated with the audibly-impaired user. Thus, an “incoming call” may originate from a far-end user to a near-end communication device and an “outgoing call” may originate from a near-end user to a far-end communication device. Of course, it is recognized that “near-end” and “far-end” are relative terms depending on the perspective of the particular user. Thus, the terms “near-end” and “far-end” are used as a convenient way to distinguish between users and devices.
  • The term “captioning communication session” as used herein refers to the communication session between the hearing-impaired user's communication device and the captioning communication service. The captioning communication session may pass text captions from the captioning communication service to the hearing-impaired user's communication device. In some embodiments, the captioning communication session may also include the hearing-impaired user's communication device transmitting the far-end user's audio signal to the captioning communication service to generate the text captions.
  • The term “audio signal” (or voice signal) refers to the signal generated and transmitted by a communication device during a call. Most examples are provided from the perspective of a hearing-impaired user using a captioning communication device, such that the audio signal captured by that device is sometimes referred to as the “near-end audio signal,” and the audio signal received to be reproduced by the speaker is sometimes referred to as the “far-end audio signal.” The terms “near-end” and “far-end” may also be referred to as “local” and “remote,” respectively.
  • FIG. 1 illustrates a communication system 100 configured to facilitate an assisted call between a hearing-impaired user 102 and a far-end user 104. The communication system 100 may include a first communication device 110, a second communication device 120, and a third party communication service 130, which may be a captioning communication service or a relay service (as illustrated in FIG. 1). The first communication device 110 and the second communication device 120 may be coupled together to facilitate communication there between via a first network 140. The first communication device 110 and the third party communication service 130 may be coupled together to facilitate communication there between via a second network 150. For example only, the first network 140 and the second network 150 may each be implemented according to the standards and bandwidth requirements of a communication network (e.g., Public Switched Telephone Network (PSTN), cellular network, Voice over Internet Protocol (VoIP) networks, etc.). The use of the terms “network” or “communication network” as used herein contemplates networks that are compatible and configured to provide communications using analog and/or digital standards unless specifically stated otherwise. In some embodiments, the first network 140 and the second network 150 may be the same network (e.g., both connections may be Internet-based connections). Thus, discussion of the first network 140 and the second network 150 separately may be for convenience of discussing a particular connection between two or more devices. Of course, in some embodiments, the first network 140 and the second network 150 may be different networks. For example, the first communication device 110 and the second communication device 120 may communicate via a PSTN network connection, while the first communication device 110 and the third party communication service 130 may communicate via an internet connection. Other variations and combinations of networks are also contemplated.
  • The first communication device 110 may include a device that is configured to assist the hearing-impaired user 102 in communicating with another individual (e.g., far-end user 104). In some embodiments, the first communication device 110 may include a caption-enabled communication device configured to receive and display text captions of at least a portion of the conversation. Thus, the hearing-impaired user 102 may be able to read the text captions of the words spoken by the far-end user 104 to supplement the audio signal received by the first communication device 110. As a result, the hearing-impaired user 102 may have an improved experience in understanding the conversation. Such an embodiment may be useful for people whose hearing has been damaged or decreased over time (e.g., the elderly), such that they can still speak but have diminished hearing that makes it difficult to communicate. In some embodiments, the first communication device 110 may also be configured to receive and display video on an electronic display on the first communication device 110.
  • The second communication device 120 may comprise a conventional voice telephone (e.g., landline phone, cellular phone, smart phone, VoIP phone, etc.). As such, the far-end user 104 may interact in a conventional manner with the second communication device 120. In some embodiments, the second communication device 120 may be configured similarly as the first communication device (e.g., caption-enabled communication device). As a result, the second communication device 120 may likewise be operated by a hearing-impaired user. Thus, although facilitating communication between the hearing-impaired user 102 and the far-end user 104 is shown in FIG. 1 to imply that the far-end user 104 is a hearing-capable user, such a situation is shown only as an example. Other embodiments include both the first communication device 110 and the second communication device 120 coupled to the third party communication service 130 to facilitate the captioning services for each respective hearing-impaired user. In such a situation, each of the first and second communication devices 110, 120 may have its own communication session with the third party communication service 130.
  • The third party communication service 130 may be configured to provide interpretive services (e.g., captioning) to the hearing-impaired user 102. More specifically, a human “call assistant” within third party communication service 130 may be employed to facilitate an assisted call between a hearing-impaired user 102 and a far-end user 104. As discussed above, in some embodiments the third party communication service 130 may be configured to provide text captions of at least a portion of the conversation. In such an embodiment, the call assistant may listen to the voice signal received and re-voice the portion of the conversation into a microphone so that voice recognition software may generate the text captions that are transmitted to the first communication device 110. Thus, the third party communication service 130 may include one or more of an internet protocol captioned telephone service (IPCTS), captioned telephone service (CTS), or other telecommunications relay services (TRS).
  • FIG. 1 shows a configuration where the first communication device 110 acts as a router for the voice signal from the second communication device 120 to the third party communication service 130. In such an embodiment, the voice signal of the far-end user 104 may be transmitted from the second communication device 120 to the first communication device 110. The voice signal of the far-end user 104 may then be transmitted from the first communication device 110 to the third party communication service 130 for the text captions to be generated in a text captioning embodiment. The text captions may then be transmitted from the third party communication service 130 to the first communication device 110 to be displayed as text captions for the hearing-impaired user to read during the conversation. The call assistant may also monitor the text captions that are generated and transmitted to the first communication device 110 to identify any errors that may have been generated by the voice recognition software. The call assistant may correct such errors, such as described in U.S. Pat. No. 8,379,801, issued Feb. 19, 2013, entitled “Methods and Systems Related to Text Caption Error Correction,” the disclosure of which is incorporated herein in its entirety by this reference. In some embodiments the third party communication service 130 may be configured to receive the far-end voice signal from the second communication device 120 and route the far-end voice signal to the first communication device 110.
  • In addition, although FIG. 1 shows only two communication devices 110, 120, the communication system 100 may include more communication devices. It is contemplated that the communication system 100 may facilitate communication between any number and combinations of hearing-impaired users and far-end users. For example, in some embodiments two or more communication devices may be connected for facilitating communication between a hearing-impaired user and other hearing-impaired users and/or far-end users.
  • Embodiments of the disclosure include devices and methods for remote attenuation of the audio stream received by the captioning communication system. For example, talker direction detection may be performed on a local system, and a command may then be sent with the encoded audio stream (e.g., Speex) over a communication channel to a remote captioning communication system to allow the captioning communication service to determine the best method to process the audio stream by suppressing echo or otherwise modifying the signal. In addition, embodiments may combine an echo volume control with an echo modifier to reduce the effect of echo present in received audio when doubletalk is present.
  • FIG. 2 is a simplified schematic block diagram of a communication device 200 associated with a hearing-impaired user according to an embodiment of the disclosure. For example, the communication device 200 may be the first communication device 110 of FIG. 1. In particular, the communication device 200 may be configured to establish calls with other communication devices and captioning communication sessions with a captioning communication service configured to assist the hearing-impaired user. The communication device 200 may be a caption enabled communication device, which may be implemented as a standalone device (e.g., a caption phone), or as implemented on another device (e.g., tablet computer, laptop computer, smart phone, etc.).
  • The communication device 200 may include a processor 210 operably coupled with an electronic display 220, communication elements 230, a memory device 240, input devices 250, and a speaker 260. In some embodiments, the communication device 200 may include a camera for also participating in a video communication session. The processor 210 may coordinate the communication between the various devices as well as execute instructions stored in computer-readable media of the memory device 240. The processor 210 may be configured to execute a wide variety of operating systems and applications including the computing instructions. The memory device 240 may be used to hold computing instructions, data, and other information for performing a wide variety of tasks including performing embodiments disclosed herein. By way of example and not limitation, the memory device 240 may include Static Random Access Memory (SRAM), Dynamic RAM (DRAM), Read-Only Memory (ROM), Flash memory, and the like. The memory device 240 may include volatile and non-volatile memory storage for the communication device 200.
  • The communication elements 230 may be configured to communicate with other devices or communication networks, including other communication devices and the captioning communication service. As non-limiting examples, the communication elements 230 may include elements for communicating on wired and wireless communication media, such as, for example, serial ports, parallel ports, Ethernet connections, universal serial bus (USB) connections, IEEE 1394 (“firewire”) connections, Bluetooth wireless connections, 802.11 a/b/g/n type wireless connections, and other suitable communication interfaces and protocols. The input devices 250 may include a numeric keypad, a keyboard, a touchscreen, a remote control, a mouse, buttons, other input devices, or combinations thereof.
  • FIG. 3 is a captioning communication system 300 including an automatic volume control system according to an embodiment of the disclosure. The captioning communication system 300 includes a first communication device 110 (e.g., local caption communication device) specifically configured for use by a hearing-impaired user (i.e., a local user) to communicate with the second communication device 120 associated with a far-end user (i.e., remote user) over a first network 140 (e.g., PSTN network). The captioning communication system 300 may further include a third party communication service 130 (e.g., a captioning communication service) that is configured to communicate with the first communication device 110 to provide text captions during a communication session to assist the hearing-impaired user. In particular, the first communication device 110 may be configured to receive the far-end voice signal, which may also be routed to the third party communication service 130. The third party communication service 130 generates the text transcription of the far-end voice signal, which is provided to the first communication device 110 to display to the hearing-impaired user during the communication session.
  • Throughout this description, reference to various signals is made. For example, the local outgoing signal is referred to as the near-end voice signal s[n], while the remote incoming signal is referred to as the far-end voice signal r[n]. The echo from the near-end voice signal s[n] that is caused by the first network 140, e.g., a PSTN network, is referred to as the echo signal e[n]. The signal that is received by the first communication device 110 is referred to as the received far-end voice signal g[n], which is the sum of the far-end voice signal r[n] and the echo e[n]. In other words, g[n]=r[n]+e[n]. When there is no echo, the received far-end voice signal g[n] and the far-end voice signal r[n] are substantially equal.
  • The received far-end voice signal g[n] is provided to the echo modifier 320, which adds distortion to the echo portion (by way of the modified echo estimate signal e′[n]) to generate the modified received far-end voice signal g′[n] (also referred to as the “audio stream”). The modified echo estimate signal e′[n] is generated within the echo modifier 320, which is discussed further below. The packetized output signal a[n] may include the packetized form (via encoder 311) of the modified received far-end voice signal g′[n] as well as a volume control command (d). These signals will be discussed further below.
  • The captioning communication system 300 further includes an echo volume control 310 that is configured to automatically control the volume of the audio signal (e.g., modified received far-end voice signal g′[n]) received and reproduced by the third party communication service 130 during the communication session. For example, the echo volume control 310 may set the volume of the audio signal at a first level responsive to a determination that only the far-end user is speaking. The echo volume control 310 is configured to set the volume of the audio signal received by the third party communication service 130 at a second level responsive to a determination that only the near-end user is speaking. The first level is higher (i.e., louder) than the second level. In other words, when only the near-end user is speaking, the volume level of the audio signal provided to the call assistant may be attenuated in comparison to the volume level of the audio signal provided to the call assistant when only the far-end user is speaking. In some embodiments, the second level may be completely attenuated (e.g., suppressed) such that no sound is produced for the call assistant.
  • The echo volume control 310 may include an active signal detector 312 that is configured to perform the determination of which talker is active at a given time. For example, the active signal detector 312 may receive the near-end voice signal s[n] and the received far-end voice signal g[n] to determine which of the two signals s[n], g[n] is active, which indicates whether the near-end user and/or the far-end user is active (i.e., talking) at a given time. Because the received far-end voice signal g[n] is a form of the far-end voice signal r[n] generated by the second communication device 120, it also follows that the active signal detector 312 determines whether the near-end voice signal s[n] or the far-end voice signal r[n] is active. Thus, the active signal detector 312 determines if the near-end user is active (i.e., talking), if the far-end user is active (i.e., talking), or if both the near-end user and the far-end user are active (i.e., a double-talk situation). Thus, it could be said that the active signal detector 312 determines the “direction” of which party is currently talking. For example, the active signal detector 312 may compare (e.g., cross correlate) the near-end voice signal s[n] with the received far-end voice signal g[n].
  • The active signal detector 312 may be further configured to generate a volume control command (d) that indicates which user is active responsive to the determination discussed above. In some embodiments, the volume control command (d) may have different states for various situations. For example, the active signal detector 312 may be configured to generate the volume control command (d) to have a first state corresponding to the “near-end only” situation, and a second state corresponding to the “far-end only” situation. In some embodiments, the volume control command (d) may include a third state corresponding to the “double talk” situation, whereas other embodiments may simply generate the volume control command (d) corresponding to the double talk situation to be the same state as the “far-end only” situation.
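  • As a minimal sketch of how such a talker determination might look, the following Python example classifies one frame of s[n] and g[n]. The disclosure only says the signals may be compared (e.g., cross correlated); the energy floor, the correlation threshold, the lag range, and every name below (`Talker`, `classify`, `peak_normalized_xcorr`) are illustrative assumptions, not the detector of the disclosure.

```python
import math
from enum import Enum


class Talker(Enum):
    """Hypothetical talker states; the names are illustrative only."""
    SILENCE = 0
    NEAR_ONLY = 1
    FAR_ONLY = 2
    DOUBLE_TALK = 3


def energy(frame):
    """Mean squared amplitude of one frame of samples."""
    return sum(x * x for x in frame) / len(frame)


def peak_normalized_xcorr(s, g, max_lag):
    """Largest |normalized cross-correlation| of g against s delayed by 0..max_lag samples."""
    best = 0.0
    for lag in range(max_lag + 1):
        a = s[: len(s) - lag]
        b = g[lag:]
        num = sum(x * y for x, y in zip(a, b))
        den = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
        if den > 0.0:
            best = max(best, abs(num) / den)
    return best


def classify(s, g, energy_floor=1e-4, echo_corr=0.8, max_lag=32):
    """Classify one frame of near-end s[n] and received far-end g[n].

    Assumption: if g[n] is strongly correlated with a delayed copy of s[n],
    g[n] is treated as mostly echo, so only the near-end user is talking.
    """
    near_active = energy(s) > energy_floor
    incoming_active = energy(g) > energy_floor
    if not incoming_active:
        return Talker.NEAR_ONLY if near_active else Talker.SILENCE
    if not near_active:
        return Talker.FAR_ONLY
    if peak_normalized_xcorr(s, g, max_lag) >= echo_corr:
        return Talker.NEAR_ONLY
    return Talker.DOUBLE_TALK
```

An embodiment that treats double talk like the far-end only situation could simply map `Talker.DOUBLE_TALK` to the same command state as `Talker.FAR_ONLY`.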
  • The first communication device 110 may be configured to send the volume control command (d) along with the modified received far-end voice signal g′[n] to the encoder 311, which encodes the two signals into the encoded signal packet a[n] that is transmitted to the third party communication service 130 through the communication channel 314 for use by the audio processing logic 316 of the third party communication service 130 when generating the text transcription of the far-end voice signal r[n].
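  • One way to picture a packet that carries both the command and the encoded audio is sketched below in Python. The three-byte header (one command byte followed by a two-byte payload length) is purely a hypothetical layout chosen for illustration; the disclosure does not specify a packet format, and the audio payload is treated here as opaque bytes already produced by a codec.

```python
import struct

# Hypothetical header: 1-byte volume control command, 2-byte payload length,
# both in network byte order. Not a format taken from the disclosure.
HEADER = struct.Struct("!BH")


def pack_packet(command, encoded_audio):
    """Prefix an encoded audio payload with the volume control command."""
    if not 0 <= command <= 255:
        raise ValueError("command must fit in one byte")
    return HEADER.pack(command, len(encoded_audio)) + encoded_audio


def unpack_packet(packet):
    """Recover (command, encoded_audio) from a packet built by pack_packet."""
    command, length = HEADER.unpack_from(packet)
    payload = packet[HEADER.size:HEADER.size + length]
    if len(payload) != length:
        raise ValueError("truncated packet")
    return command, payload
```

Carrying the command in the same packet as the audio it describes keeps the two aligned, so the receiving logic can apply the intended volume to exactly that packet.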
  • In some embodiments, the volume control command (d) may be a flag bit or other instruction that is interpreted by the audio processing logic 316 to indicate which talker situation should be applied to the particular audio packet received. For example, the volume control command (d) generated by the active signal detector 312 may be a binary value (e.g., 0 or 1), in which the audio processing logic 316 may interpret a first value (e.g., 0) to be a first volume level for the audio packet (e.g., no attenuation) and a second value (e.g., 1) to correspond to a second volume level for the audio packet (e.g., full attenuation) provided to the speaker 332. Of course, it is contemplated that the binary values may be reversed in the way they are interpreted by the audio processing logic 316.
  • In some embodiments, the volume control command (d) may be in the form of a numerical value or other instruction that corresponds to a volume level or amount of attenuation of the audio packet to be passed on to the speaker 332. For example, the volume control command (d) generated by the active signal detector 312 may be an attenuation value (e.g., an integer) within a range (e.g., 0 to 5) supported by the audio processing logic 316, in which the audio processing logic 316 may interpret a first value (e.g., 0) to correspond to a first volume level (e.g., no attenuation) for the audio packet and a second value (e.g., 5) to correspond to a second volume level (e.g., full attenuation) for the audio packet provided to the speaker 332. The intermediate values may be assigned to a scale of intermediate attenuation levels, if desired. Of course, it should be recognized that different schemes are contemplated for the volume control command (d) depending on how the audio processing logic 316 is configured to provide the audio packets to the speaker 332 at different levels for the third party communication assistant to hear (or not hear) the far-end voice signal depending on the situation determined by the active signal detector 312. Regardless of the specific logic scheme, the echo volume control 310 provides the audio packets to the speaker 332 of the third party communication service 130 at a louder volume during the far-end talker only situation in comparison to the near-end talker only situation. In some embodiments, the double talk situation may be handled the same way as the far-end talker only situation in terms of the volume of the audio packets provided to the speaker 332.
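  • The two command schemes described above (a binary flag, and an attenuation value from 0 to 5) might be interpreted on the receiving side roughly as in the Python sketch below. The function names and the linear mapping of intermediate values to gains are assumptions made for illustration; the disclosure leaves the intermediate scale open.

```python
def gain_from_flag(d):
    """Binary command: 0 means no attenuation, 1 means full attenuation."""
    return 1.0 if d == 0 else 0.0


def gain_from_level(d, max_level=5):
    """Attenuation level 0..max_level mapped linearly to a gain (assumed scale)."""
    if not 0 <= d <= max_level:
        raise ValueError("attenuation level out of range")
    return 1.0 - d / max_level


def apply_gain(samples, gain):
    """Scale decoded audio samples before they are played to the call assistant."""
    return [x * gain for x in samples]
```

For example, `apply_gain(samples, gain_from_level(5))` silences a packet flagged as near-end only, while level 0 passes the far-end voice through unchanged.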
  • As shown in FIG. 3, at least some of the features and functionality (e.g., active signal detector 312) of the echo volume control 310 may be included within the first communication device 110. In addition, at least some of the features and functionality (e.g., audio processing logic 316) of the echo volume control 310 may be included within the third party communication service 130. As discussed above, the active signal detector 312 may be configured to determine whether the local user or remote user is speaking, and send a volume control command (d) to the audio processing logic 316. The audio processing logic 316 may be configured to reduce the volume of the audio packets to the speaker 332 responsive to the information provided by the volume control command (d).
  • The captioning communication system 300 further includes an echo modifier 320. The echo modifier 320 may be configured to add distortion to the echo signal such that the audio packets received by the audio processing logic 316 may have an echo signal that is distorted from its original state such that the call assistant may better audibly distinguish between the far-end voice signal portion and the modified echo portion.
  • The echo modifier 320 may include an echo estimator 322 and an echo distortion logic 324. The echo estimator 322 may be configured to generate an estimate of the echo e[n]. The echo estimator 322 may include an adaptive filter that is configured to generate an estimated echo signal as its output. The adaptive filter may receive the near-end voice signal s[n], and be configured to train its coefficients based on the error signal generated from the difference between the received far-end voice signal g[n] and the output from the echo estimator 322. The output from the echo estimator 322 is approximately the echo e[n]; however, rather than subtracting out the echo as with conventional echo cancellation systems, the echo distortion logic 324 receives the estimated echo signal and adds distortion to generate the modified echo estimate signal e′[n]. As a result, it is the modified echo estimate signal e′[n] that is summed (e.g., subtracted) with the received far-end voice signal g[n] to generate the modified received far-end voice signal g′[n]. Because the modified echo estimate signal e′[n] and the echo e[n] portion of the received far-end voice signal g[n] may be highly correlated, when the modified echo estimate e′[n] is subtracted from the echo e[n] the remaining signal is substantially the difference caused by the modification that was performed on the estimate, plus a certain amount of error produced by inaccuracy in the echo estimator 322. Because the far-end voice signal r[n] portion of the received far-end voice signal g[n] and the modified echo estimate e′[n] may not be well correlated, subtracting the modified echo estimate signal e′[n] may have little effect on that portion. As a result, the resulting modified received far-end voice signal g′[n] includes the far-end voice signal r[n] and a distorted version of the echo (e.g., e[n]-e′[n]). 
As a result, when the modified received far-end voice signal g′[n] is reproduced by the speaker 332 of the third party communication service 130, the distorted version of the echo may be audibly distinguishable from the far-end voice signal r[n] by the third party communications assistant when they listen to the far-end voice signal to generate the text transcription for the text captions. Thus, the third party call assistant may have an improved experience in revoicing the correct voice signal, which may improve the accuracy of the text captions. Echo distortion may include any process that makes the echo portion audibly distinguishable from the far-end voice portion of the received far-end voice signal g[n]. Non-limiting examples of echo modification may include frequency shifting, signal modulation, partial or complete attenuation, adding white or colored noise, etc.
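The echo modification described above may be sketched as follows. This is a minimal illustration only: it assumes a normalized least-mean-squares (NLMS) adaptive filter for the echo estimator, which the disclosure does not specify, and it uses partial attenuation as the default distortion. The names `modify_echo`, `attenuate_half`, `taps`, `mu`, and `eps` are hypothetical.

```python
def attenuate_half(n, e_hat):
    """Example distortion (assumed): partial attenuation of the echo estimate."""
    return 0.5 * e_hat


def modify_echo(s, g, taps=8, mu=0.5, eps=1e-8, distort=attenuate_half):
    """Return g'[n] = g[n] - e'[n], where e'[n] is a distorted echo estimate.

    s: near-end voice samples (reference input to the adaptive filter)
    g: received far-end samples (far-end voice plus echo of s), len(g) <= len(s)
    distort: callable (n, e_hat) -> e_mod applied to each echo estimate
    """
    w = [0.0] * taps  # adaptive filter coefficients
    out = []
    for n in range(len(g)):
        # Most recent `taps` near-end samples, newest first, zero-padded.
        x = [s[n - k] if n - k >= 0 else 0.0 for k in range(taps)]
        e_hat = sum(wk * xk for wk, xk in zip(w, x))  # estimated echo
        err = g[n] - e_hat  # error signal, used only to train the filter
        norm = sum(xk * xk for xk in x) + eps
        w = [wk + mu * err * xk / norm for wk, xk in zip(w, x)]  # NLMS update
        out.append(g[n] - distort(n, e_hat))  # subtract the modified estimate
    return out
```

If `distort` returns the estimate unchanged, the sketch reduces to a conventional echo canceller. Other distortions named in the text, such as signal modulation or added noise, could be supplied through the same callable.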
  • As discussed above, the echo volume control 310 includes an active signal detector 312 that determines whether the local user or remote user is talking. In some embodiments, the active signal detector 312 may include a double talk detector. The result of the active signal detector 312 may be generated in the form of the volume control command (d) that is packaged with the audio stream g′[n] to form a[n], which is received by the third party communication service 130 over a communication channel 314 (e.g., the Internet or other digital network, radio frequency communications network, optical communications network, serial or parallel bus, etc.). The third party communication service 130 (e.g., through audio processing logic 316) processes the audio stream g′[n] based, at least in part, on the results of the direction detector (e.g., according to the volume control command (d)) as discussed above. If the signal is from the local user the audio can be attenuated, or other processing can be performed as needed (e.g., filtering, amplification, etc.). If the signal is from the remote user the audio is passed unmodified, or other processing can be performed as needed (e.g., filtering, amplification, attenuation, etc.). After processing, the resulting signal is reproduced for the call assistant to hear and perform their duties of generating the text transcription of the far-end voice. Thus, in situations when doubletalk is present, the echo modifier 320 alters the echo portion e[n] of the received far-end voice signal g[n], such that the communications assistant at the third party communication service 130 can audibly distinguish between the near-end voice and far-end voice signals.
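One way to package the volume control command (d) with the audio stream g′[n] to form a[n] is sketched below. The packet layout (a one-byte command, a two-byte sample count, and 16-bit little-endian PCM samples) is purely an assumption for illustration; the disclosure does not define a packet format, and the names `HEADER`, `encode_packet`, and `decode_packet` are hypothetical.

```python
import struct

# Hypothetical header: 1-byte command (d), 2-byte sample count, little-endian.
HEADER = struct.Struct("<BH")


def encode_packet(d: int, samples: list[int]) -> bytes:
    """Pack command d and 16-bit PCM samples of g'[n] into one packet a[n]."""
    body = struct.pack(f"<{len(samples)}h", *samples)
    return HEADER.pack(d, len(samples)) + body


def decode_packet(packet: bytes) -> tuple[int, list[int]]:
    """Recover the command d and the PCM samples from a packet a[n]."""
    d, count = HEADER.unpack_from(packet, 0)
    samples = struct.unpack_from(f"<{count}h", packet, HEADER.size)
    return d, list(samples)
```

Carrying the command in the same packet as the audio keeps the two aligned, so the receiving audio processing logic can apply the command to exactly the samples it describes.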
  • FIG. 4 is a captioning communication system 400 including an automatic volume control system according to another embodiment of the disclosure. The captioning communication system 400 includes similar elements as in FIG. 3, but with additional third party communication services 130A, 130L. Each third party communication service 130A, 130L may include audio processing logic 316A, 316L, and a speaker 332A, 332L for its call assistant.
  • As discussed above with respect to FIG. 3, the first communication device 110 may transmit the combined command (d) and the modified received far-end voice signal g′[n] (i.e., encoded signal packet a[n]) to any number of third party recipients over communication channels 314A, 314L. In some embodiments, the communication channels 314A, 314L may be the same or distinct communications channels for each third party communication service 130A, 130L. Each third party communication service 130A, 130L may refer to different call assistants within the same location (e.g., call center) or different call assistants located within different locations, as desired. Each audio processing logic 316A, 316L may process the encoded signal packet a[n] according to its specific needs. For example, the first audio processing logic 316A associated with a first call assistant may be configured to process the encoded signal packet a[n] differently than the second audio processing logic 316L associated with a second call assistant.
  • In some embodiments, the near-end voice signal s[n] may also be transmitted to one or more of the third party communication services 130A, 130L through the communication channels 314A, 314L. In such an embodiment, it may be desirable for one call assistant to transcribe the near-end voice signal s[n], while another call assistant transcribes the far-end voice signal r[n] from the modified received far-end voice signal g′[n]. For example, the first call assistant may transcribe the near-end voice signal s[n] (in which case the modified received far-end voice signal g′[n] may be attenuated by audio processing logic 316A), and the second call assistant may transcribe the far-end voice signal r[n] from the modified received far-end voice signal g′[n] (in which case the near-end voice signal s[n] may be attenuated by the audio processing logic 316L).
  • FIG. 5 is a captioning communication system 500 including an automatic volume control system according to another embodiment of the disclosure. The captioning communication system 500 includes similar elements as in FIG. 3, but with the audio processing logic 316 being performed locally by the first communication device 110. For example, the volume control command (d) from the active signal detector 312 may be used locally to process the modified received far-end voice signal g′[n]. Thus, the processed audio stream g′[n] may be transmitted to the third party communication service 130 through the communication channel 314 having a volume with the first level or second level based on the volume control command (d). In some embodiments, rather than sending an attenuated encoded version of the modified received far-end voice signal g′[n], the audio processing logic 316 may be configured to not send any modified received far-end voice signal g′[n] in the near-end only situation determined by the active signal detector 312. The audio processing logic 316 may send the modified received far-end voice signal g′[n] to the third party communication service 130 through the communication channel 314 in the far-end only situation and/or the double talk situation determined by the active signal detector 312. The third party communication service 130 may receive the encoded version of the modified received far-end voice signal g′[n] and decode it to be reproduced by the speaker 332 for the call assistant to generate the text transcription of the far-end voice signal r[n] portion of the modified received far-end voice signal g′[n] as discussed above.
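The option of withholding audio in the near-end only situation, rather than sending an attenuated version, may be sketched as a simple filter over outgoing frames. The string labels for the talker situations and the name `frames_to_send` are assumptions made for illustration.

```python
from typing import Iterable, Iterator, Tuple


def frames_to_send(frames: Iterable[Tuple[str, bytes]]) -> Iterator[bytes]:
    """Yield only the encoded frames that should reach the captioning service.

    Each input item is (situation, payload), where situation is one of the
    assumed labels "far_end_only", "near_end_only", or "double_talk".
    """
    for situation, payload in frames:
        if situation == "near_end_only":
            continue  # withhold the frame instead of sending it attenuated
        yield payload
```

Dropping frames locally saves bandwidth on the communication channel, at the cost of the remote side no longer being able to choose its own attenuation for those frames.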
  • FIG. 6 is a captioning communication system 600 including an automatic volume control system according to another embodiment of the disclosure. The captioning communication system 600 includes similar elements as in FIG. 3, but with the audio processing logic 316A being performed locally by the first communication device 110 as well as with third party audio processing logic 316B being performed remotely by the third party communication service 130. Thus, the active signal detector 312 may provide the volume control command (d) to the local audio processing logic 316A, and the third party audio processing logic 316B (via encoder 311) such that each may provide the appropriate audio processing of the received version of the modified received far-end voice signal g′[n] according to its specific requirements based on the volume control command (d). The third party communication service 130 may also include a decoder (not shown in FIG. 6) that is configured to decode the signal received through the communication channel 314 for processing.
  • FIG. 7 is a captioning communication system 700 including an automatic volume control system according to another embodiment of the disclosure. The captioning communication system 700 includes similar elements as in FIG. 5 with the audio processing logic 316 of the echo volume control 310 being performed locally in the first communication device 110, but with the processed modified received far-end voice signal g′[n] (via the audio processing logic 316) only being forwarded to the third party communication service 130. In some embodiments, additional audio processing 366 may be performed remotely by the third party communication service 130 prior to being sent to the speaker 332.
  • FIG. 8 is a captioning communication system 800 including an automatic volume control system according to another embodiment of the disclosure. The captioning communication system 800 includes similar elements as in FIG. 3, but with the elements of the echo volume control 310 and the echo modifier 320 being performed by the third party communication service 130. Thus, the first communication device 110 may be configured to transmit the near-end voice signal s[n] and the received far-end voice signal g[n] to the third party communication service 130 through the communication channel 314. The third party communication service 130 may include the active signal detector 312, the audio processing logic 316, the echo estimator 322, and the echo distortion logic 324 that are configured as discussed above. Thus, the third party communication service 130 may perform the different active talker situations and related attenuation scenarios, as well as the echo estimation and echo modification. It is therefore contemplated that the features and methodology described herein may be performed locally by the first communication device 110, by the third party communication service 130, or any combination thereof.
  • In FIGS. 3 through 8, the speaker 304 of the first communication device 110 may receive the received far-end voice signal g[n] or an echo canceled version thereof. Thus, in some embodiments, the received far-end voice signal g[n] may be processed through a conventional echo canceller locally even if the signal received by the second communication device 120 did not have an echo cancellation process performed thereon (see, e.g., echo canceller 305 in FIG. 3). As a result, the speaker 304 of the first communication device 110 may receive substantially the far-end voice signal r[n] with the echo removed locally. In other embodiments, the speaker 304 of the first communication device 110 may receive the modified received far-end voice signal g′[n] that has been processed responsive to the volume control command (d).
  • FIG. 9 is a flowchart 900 illustrating a method for operating a captioning communication system for a hearing-impaired user according to an embodiment of the disclosure. At operation 910, the active talker situation may be determined from the near-end voice signal and the received signal from the far-end communication device. The active talker situation may be determined to be a far-end only situation, a near-end only situation, or a double talk situation by comparing (e.g., cross correlating) the received signal and the near-end voice signal.
  • At operation 920, the echo portion of the received signal from the far-end communication device may be estimated through an adaptive filter that receives the near-end voice signal, and trains the filter based on the error signal generated from the difference between the received signal and the output from the echo estimator. In contrast with conventional systems, however, the estimated echo is not subtracted from the received signal to generate an echo cancelled signal. Rather, at operation 930, echo distortion is added to the received signal. The echo distortion may include distorting the estimated echo signal and subtracting the result from the received signal. The distortion may include frequency shifting, signal modulation, partial or complete attenuation, adding white or colored noise, or combinations thereof, to the estimated echo signal, which is then summed (e.g., subtracted) with the received signal to generate a modified received far-end voice signal that is used as the audio stream for the third party communication service.
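Operations 920 and 930 may be summarized with the signal names used earlier in the description. The filter coefficients w_k[n], the estimate ê[n], and the distortion operator D are notation introduced here for illustration only, and the subtraction form of the summation is the example given in the text rather than the only contemplated arrangement.

```latex
\begin{align*}
g[n]       &= r[n] + e[n]
             && \text{received signal: far-end voice plus echo} \\
\hat{e}[n] &= \sum_{k} w_k[n]\, s[n-k] \approx e[n]
             && \text{operation 920: adaptive-filter echo estimate} \\
e'[n]      &= D\big(\hat{e}[n]\big)
             && \text{operation 930: distorted echo estimate} \\
g'[n]      &= g[n] - e'[n] = r[n] + \big(e[n] - e'[n]\big)
             && \text{modified received far-end voice signal}
\end{align*}
```

When D leaves the estimate unchanged and the estimate is exact, the last line reduces to g′[n] = r[n], which is conventional echo cancellation; any other choice of D leaves a residual e[n] − e′[n] that differs audibly from the far-end voice.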
  • At operation 940, the volume of the modified received far-end voice signal reproduced by the third party communication service may be automatically adjusted based on the determined active talker situation. For example, the volume for the far-end only situation may have a first level (e.g., high volume) and the volume for the near-end only situation may have a second level (e.g., low volume). In some embodiments, the second level for the near-end only situation may be complete attenuation of the modified received far-end voice signal such that the call assistant's speaker does not produce sound for generating a text transcription of the far-end voice signal portion of the modified audio signal. Different operations of FIG. 9 may be performed by the near-end communication device, the third party communication service, or a combination thereof.
  • FIG. 10 is a flowchart 1000 illustrating a method for determining an active talker situation for a captioning communication system for a hearing-impaired user according to an embodiment of the disclosure. At operation 1010, the received signal may be received from the far-end communication device. At operation 1020, the near-end voice signal may be received from the microphone of the near-end communication device. At operation 1030, the received signal and the near-end voice signal may be compared (e.g., cross correlation) to determine which signal is active at a given time or if both signals are active. At operation 1040, the result of the comparison may determine which situation is occurring. The situations may include the far-end only situation 1042, the near-end only situation 1044, and the double talk situation 1046. During each of these situations (e.g., states), the active signal detector may generate a volume control command (d) that is used by the audio processing of the audio stream to determine the automatic volume control to the speaker of the third party communication service. The volume control command (d) may include a binary flag, a numerical value, or other command that indicates to the audio processing the active talker situation, such that the audio processing can then take the appropriate actions (e.g., pass the audio, attenuate the audio, etc.). Different operations of FIG. 10 may be performed by the near-end communication device, the third party communication service, or a combination thereof.
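The comparison of operations 1030 and 1040 may be sketched on a per-frame basis as follows. This is a simplified illustration and not a production double talk detector: the energy floor, the correlation threshold, the lag range, and the extra "silence" result for frames in which neither signal is active are all assumptions, as are the function names.

```python
import math


def max_norm_xcorr(s, g, max_lag):
    """Peak normalized cross-correlation of g against s delayed by 0..max_lag."""
    es = sum(x * x for x in s)
    eg = sum(x * x for x in g)
    if es == 0.0 or eg == 0.0:
        return 0.0
    best = 0.0
    for lag in range(max_lag + 1):
        c = sum(g[n] * s[n - lag] for n in range(lag, len(g)))
        best = max(best, abs(c) / math.sqrt(es * eg))
    return best


def classify_frame(s, g, energy_floor=1e-4, corr_threshold=0.7, max_lag=32):
    """Classify one frame of near-end samples s and received samples g.

    Both frames are assumed to be non-empty and of equal length.
    """
    near_active = sum(x * x for x in s) / len(s) > energy_floor
    recv_active = sum(x * x for x in g) / len(g) > energy_floor
    if not near_active:
        return "far_end_only" if recv_active else "silence"
    if not recv_active:
        return "near_end_only"
    # Both active: if g is mostly an echo of s, only the near end is talking.
    rho = max_norm_xcorr(s, g, max_lag)
    return "near_end_only" if rho >= corr_threshold else "double_talk"
```

The returned label could then be translated into the volume control command (d) in whichever form (binary flag or numerical value) the audio processing expects.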
  • FIG. 11 is a flowchart 1100 illustrating a method for processing audio for a captioning communication service of a captioning communication system for a hearing-impaired user according to an embodiment of the disclosure. At operation 1110, the modified received far-end voice signal including the far-end voice signal and modified echo may be received. At operation 1120, the volume control command may be received. At operation 1130, the active talker situation may be determined from the volume control command received. As discussed above, the situations may include the far-end only situation 1132, the near-end only situation 1134, and the double talk situation 1136. If the situation is the far-end only situation, the volume level for the produced audio for the call assistant may be set at a first level (e.g., higher) at operation 1140. If the situation is the near-end only situation, the volume level for the produced audio for the call assistant may be set at a second level (e.g., lower) at operation 1150. In some embodiments, if the situation is a double talk situation, the volume level for the produced audio for the call assistant may be set at the first level (i.e., the same as the far-end only situation). In some embodiments, if the situation is a double talk situation, the volume level for the produced audio for the call assistant may be set at a third level (i.e., different than the far-end only situation). Different operations of FIG. 11 may be performed by the near-end communication device, the third party communication service, or a combination thereof.
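The flow of FIG. 11 may be sketched as a lookup from the received volume control command to a playback level. The command codes 0, 1, and 2, the numeric gain values, and the names below are hypothetical; the disclosure leaves both the encoding and the actual levels open.

```python
from typing import Optional

# Hypothetical command codes; the text leaves the encoding open.
SITUATION_BY_COMMAND = {0: "far_end_only", 1: "near_end_only", 2: "double_talk"}

FIRST_LEVEL = 1.0   # assumed gain for the far-end only situation (higher)
SECOND_LEVEL = 0.1  # assumed gain for the near-end only situation (lower)


def playback_level(command: int, third_level: Optional[float] = None) -> float:
    """Return the gain for the call assistant's speaker for one command."""
    situation = SITUATION_BY_COMMAND[command]  # operation 1130
    if situation == "far_end_only":
        return FIRST_LEVEL                     # operation 1140
    if situation == "near_end_only":
        return SECOND_LEVEL                    # operation 1150
    # Double talk: the first level by default, or a distinct third level.
    return FIRST_LEVEL if third_level is None else third_level
```

Leaving `third_level` unset corresponds to the embodiments in which double talk is handled the same way as the far-end only situation; passing a value corresponds to the embodiments that use a separate third level.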
  • Embodiments of the disclosure, therefore, may be used to reduce negative effects of the presence of echo when traditional methods (e.g., echo cancellation) cannot be used or may not be preferred. In addition, the performance of standard echo suppression may be improved in the presence of doubletalk. As a result, remote third party devices (e.g., call assistant devices for a captioning communication service) receiving the audio stream may determine how audio is to be processed before reproducing the audio to the third party end user (e.g., call assistant). In addition, call assistants and other third party listeners may be provided with the ability to discern between local voice and remote voice signals as a result of the modified received far-end voice signal g′[n] being used, which includes a distorted version of the echo that may assist the call assistant to audibly distinguish between the far-end voice signal and the echo that results from the near-end voice signal. This may make it easier for the call assistant to transcribe the correct talker's words in comparison to conventional systems that do not perform echo cancellation on the audio stream sent to the call assistant, or for which echo cancellation does not adequately eliminate all echo.
  • While certain illustrative embodiments have been described in connection with the figures, those of ordinary skill in the art will recognize and appreciate that embodiments encompassed by the disclosure are not limited to those embodiments explicitly shown and described herein. Rather, many additions, deletions, and modifications to the embodiments described herein may be made without departing from the scope of embodiments encompassed by the disclosure, such as those hereinafter claimed, including legal equivalents. In addition, features from one disclosed embodiment may be combined with features of another disclosed embodiment while still being encompassed within the scope of embodiments encompassed by the disclosure as contemplated by the inventors.

Claims (20)

What is claimed is:
1. A communication device specifically configured for use by a hearing-impaired user, the communication device comprising:
a microphone configured to generate a near-end voice signal;
communication elements configured to receive a received far-end voice signal through a network from a far-end communication device; and
a processor operably coupled with the microphone and the communication elements, the processor configured to automatically control a volume level of an audio stream signal reproduced by a third party captioning communication service responsive to determining which of the near-end voice signal and the received far-end voice signal is active.
2. The communication device of claim 1, wherein the processor comprises an automatic control system including an active signal detector configured to generate and send a volume control command to audio processing logic in response to determining which of the near-end voice signal and the received signal is active.
3. The communication device of claim 2, wherein the volume control command is one of a binary flag or a numerical value.
4. The communication device of claim 3, wherein the processor is further configured to encode the audio stream signal with the volume control command in packets to the audio processing logic that is part of the third party captioning communication service.
5. The communication device of claim 2, wherein the processor further comprises an echo modifier including:
an echo estimator configured to generate an estimated echo signal for the received far-end voice signal; and
echo distortion logic configured to add distortion to the estimated echo signal to generate a modified estimated echo signal, wherein a summation block receives the modified estimated echo signal and the received far-end voice signal to generate the audio stream signal sent to the audio processing logic.
6. The communication device of claim 5, wherein the echo distortion logic is configured to add distortion to the estimated echo signal by performing at least one of frequency shifting, signal modulation, attenuation, adding white noise or adding colored noise to the estimated echo signal.
7. The communication device of claim 1, wherein the processor is configured to automatically control the volume level of the audio stream signal to have a first volume level responsive to determining that only the near-end voice signal is active, and a second volume level responsive to determining that only the received far-end voice signal is active.
8. The communication device of claim 7, wherein the processor is configured to automatically control the volume level of the audio stream signal to have the first volume level responsive to determining that both the near-end voice signal and the received far-end voice signal are simultaneously active.
9. A method of operating a captioning communication service for hearing-impaired users, the method comprising:
determining an active talker situation responsive to comparing a near-end voice signal from a near-end communication device and a received far-end voice signal from a far-end communication device; and
automatically adjusting a volume level of an audio stream reproduced by a third party captioning communication service based on the determined active talker situation.
10. The method of claim 9, wherein comparing the near-end voice signal and the received far-end voice signal includes cross-correlating the received far-end voice signal and the near-end voice signal.
11. The method of claim 9, wherein the active talker situation is selected from the group consisting of a near-end only situation, a far-end only situation, and a double talk situation.
12. The method of claim 9, wherein determining the active talker situation includes generating a volume control command indicating the active talker situation.
13. The method of claim 12, further comprising processing the audio stream according to the volume control command prior to being reproduced to have a first volume level for a first active talker situation and a second volume level for a second active talker situation.
14. The method of claim 13, wherein generating the volume control command is performed by the near-end communication device, and processing the audio stream is performed by the third party captioning communication service.
15. The method of claim 13, wherein generating the volume control command and processing the audio stream are both performed by the third party captioning communication service.
16. The method of claim 13, wherein generating the volume control command and processing the audio stream are both performed by the near-end communication device.
17. The method of claim 9, further comprising:
estimating an echo portion of the received far-end voice signal; and
adding distortion to the estimated echo portion to generate the audio stream such that the audio stream is a modified received far-end voice signal without cancelling the echo portion.
18. The method of claim 17, further comprising packetizing the audio stream with a volume control command and sending the packets to the third party captioning communication service.
19. A captioning communication system, comprising:
a near-end communication device including:
a microphone configured to capture a near-end voice signal during a communication session with a far-end communication device;
communication elements configured to receive a far-end voice signal from the far-end communication device during the communication session;
a speaker configured to reproduce the far-end voice signal;
an electronic display configured to display text captions during the communication session; and
a processor operably coupled with the microphone, the communication elements, the speaker, and the electronic display; and
a captioning communication service configured to generate a text transcription of the far-end voice signal during the communication session and transmit the text transcription in real time to the near-end communication device for the text captions to be displayed,
wherein at least one of the near-end communication device and the captioning communication system is configured to operate:
a volume control system configured to automatically adjust a volume of an audio stream reproduced by a speaker of the near-end communication device responsive to a volume control command identifying which of the far-end voice signal and the near-end voice signal is active at a given time; and
an echo modifier configured to add distortion to an echo portion of the far-end voice signal when generating the audio stream.
20. The captioning communication system of claim 19, wherein:
the volume control system includes an active signal detector configured to generate the volume control command responsive to a cross correlation of the near-end voice signal and the far-end voice signal; and
the echo modifier includes an echo estimator configured to provide an estimated echo signal to echo distortion logic without cancelling the echo.
US15/194,332 2015-09-16 2016-06-27 Automatic volume control of a voice signal provided to a captioning communication service Active 2036-01-02 US10574804B2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US15/194,332 US10574804B2 (en) 2015-09-16 2016-06-27 Automatic volume control of a voice signal provided to a captioning communication service

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US201562219654P 2015-09-16 2015-09-16
US14/933,893 US9380150B1 (en) 2015-09-16 2015-11-05 Methods and devices for automatic volume control of a far-end voice signal provided to a captioning communication service
US15/194,332 US10574804B2 (en) 2015-09-16 2016-06-27 Automatic volume control of a voice signal provided to a captioning communication service

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
US14/933,893 Continuation US9380150B1 (en) 2015-09-16 2015-11-05 Methods and devices for automatic volume control of a far-end voice signal provided to a captioning communication service

Publications (2)

Publication Number Publication Date
US20170078463A1 (en) 2017-03-16
US10574804B2 (en) 2020-02-25

Family

ID=56136542

Family Applications (2)

Application Number Title Priority Date Filing Date
US14/933,893 Active US9380150B1 (en) 2015-09-16 2015-11-05 Methods and devices for automatic volume control of a far-end voice signal provided to a captioning communication service
US15/194,332 Active 2036-01-02 US10574804B2 (en) 2015-09-16 2016-06-27 Automatic volume control of a voice signal provided to a captioning communication service

Family Applications Before (1)

Application Number Title Priority Date Filing Date
US14/933,893 Active US9380150B1 (en) 2015-09-16 2015-11-05 Methods and devices for automatic volume control of a far-end voice signal provided to a captioning communication service

Country Status (1)

Country Link
US (2) US9380150B1 (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170180558A1 (en) * 2015-12-22 2017-06-22 Hong Li Technologies for dynamic audio communication adjustment
CN109005281A (en) * 2018-06-27 2018-12-14 努比亚技术有限公司 A kind of In Call adjusting method, mobile terminal and computer readable storage medium
CN110012258A (en) * 2019-03-29 2019-07-12 努比亚技术有限公司 Best audio-video perception point acquisition methods, system, wearable device and storage medium
US10553235B2 (en) 2017-08-28 2020-02-04 Apple Inc. Transparent near-end user control over far-end speech enhancement processing

Families Citing this family (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10748523B2 (en) 2014-02-28 2020-08-18 Ultratec, Inc. Semiautomated relay method and apparatus
US20180034961A1 (en) 2014-02-28 2018-02-01 Ultratec, Inc. Semiautomated Relay Method and Apparatus
US20180270350A1 (en) 2014-02-28 2018-09-20 Ultratec, Inc. Semiautomated relay method and apparatus
US10878721B2 (en) * 2014-02-28 2020-12-29 Ultratec, Inc. Semiautomated relay method and apparatus
US10389876B2 (en) 2014-02-28 2019-08-20 Ultratec, Inc. Semiautomated relay method and apparatus
US9380150B1 (en) * 2015-09-16 2016-06-28 Captioncall, Llc Methods and devices for automatic volume control of a far-end voice signal provided to a captioning communication service
US9497315B1 (en) * 2016-07-27 2016-11-15 Captioncall, Llc Transcribing audio communication sessions
US10468028B2 (en) 2016-10-12 2019-11-05 Sorenson Ip Holdings, Llc Transcription presentation of communication sessions
US9992318B1 (en) * 2017-03-31 2018-06-05 Sorenson Ip Holdings, Llc Storing messages
US11223716B2 (en) * 2018-04-03 2022-01-11 Polycom, Inc. Adaptive volume control using speech loudness gesture
US11017778B1 (en) 2018-12-04 2021-05-25 Sorenson Ip Holdings, Llc Switching between speech recognition systems
US11170761B2 (en) 2018-12-04 2021-11-09 Sorenson Ip Holdings, Llc Training of speech recognition systems
US10388272B1 (en) 2018-12-04 2019-08-20 Sorenson Ip Holdings, Llc Training speech recognition systems using word sequences
US10573312B1 (en) 2018-12-04 2020-02-25 Sorenson Ip Holdings, Llc Transcription generation from multiple speech recognition systems
JP2020202448A (en) * 2019-06-07 2020-12-17 ヤマハ株式会社 Acoustic device and acoustic processing method
US11539900B2 (en) 2020-02-21 2022-12-27 Ultratec, Inc. Caption modification and augmentation systems and methods for use by hearing assisted user
US11321047B2 (en) * 2020-06-11 2022-05-03 Sorenson Ip Holdings, Llc Volume adjustments
US11488604B2 (en) 2020-08-19 2022-11-01 Sorenson Ip Holdings, Llc Transcription of audio

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5315585A (en) * 1992-05-15 1994-05-24 Kokusai Denshin Denwa Co., Ltd Echo canceller using two residual echoes
US20030007633A1 (en) * 2001-01-26 2003-01-09 Tucker Luke A. Double-talk detector suitable for a telephone-enabled PC
US6516050B1 (en) * 1999-02-25 2003-02-04 Mitsubishi Denki Kabushiki Kaisha Double-talk detecting apparatus, echo canceller using the double-talk detecting apparatus and echo suppressor using the double-talk detecting apparatus
US20130201272A1 (en) * 2012-02-07 2013-08-08 Niklas Enbom Two mode agc for single and multiple speakers
US20130317818A1 (en) * 2012-05-24 2013-11-28 University Of Rochester Systems and Methods for Captioning by Non-Experts
US20140229171A1 (en) * 2013-02-08 2014-08-14 Qualcomm Incorporated Systems and Methods of Performing Filtering for Gain Determination
US8917821B2 (en) * 2005-06-29 2014-12-23 Ultratec, Inc. Device independent text captioned telephone service
US9380150B1 (en) * 2015-09-16 2016-06-28 Captioncall, Llc Methods and devices for automatic volume control of a far-end voice signal provided to a captioning communication service

Family Cites Families (35)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4777469A (en) 1987-07-17 1988-10-11 Ultratec, Inc. Public terminal receptacle
US6075842A (en) 1988-10-11 2000-06-13 Ultratec, Inc. Text enhanced telephony
US5724405A (en) 1988-10-11 1998-03-03 Ultratec, Inc. Text enhanced telephony
US6549611B2 (en) 1988-10-11 2003-04-15 Ultratec, Inc. Text enhanced telephony
US5432837A (en) 1992-05-20 1995-07-11 Ultratec, Inc. Telecommunication device for the deaf with automatic transmission capability
US5081673A (en) 1988-10-11 1992-01-14 Engelke Robert M Voice bridge for relay center
US4959847A (en) 1989-04-05 1990-09-25 Ultratec, Inc. Telecommunications device with automatic code detection and switching
US6075841A (en) 1992-01-09 2000-06-13 Ultratec, Inc. In-line text display for telephone terminal employing data filtering
US5327479A (en) 1992-05-20 1994-07-05 Ultratec, Inc. Telecommunication device for the deaf with interrupt and pseudo-duplex capability
US5325417A (en) 1992-05-20 1994-06-28 Ultratec, Inc. Telecommunication device for the deaf with automatic self-identification
US5581593A (en) 1994-06-10 1996-12-03 Ultratec, Inc. Combination telephone and alphanumeric entry device
USD364865S (en) 1994-06-10 1995-12-05 Ultratec, Inc. Text telephone
US5604786A (en) 1994-06-10 1997-02-18 Ultratec, Inc. Telephone with unified features for hearing and deaf users
US5687222A (en) 1994-07-05 1997-11-11 Nxi Communications, Inc. ITU/TDD modem
US5809425A (en) 1995-01-03 1998-09-15 Ultratec, Inc. Gateway for low cost alphanumeric paging entry system
US5978654A (en) 1995-01-03 1999-11-02 Ultratec, Inc. Alphanumeric paging entry system
US5815496A (en) * 1995-09-29 1998-09-29 Lucent Technologies Inc. Cascade echo canceler arrangement
US5909482A (en) 1997-09-08 1999-06-01 Ultratec, Inc. Relay for personal interpreter
US6594346B2 (en) 1997-09-08 2003-07-15 Ultratec, Inc. Relay for personal interpreter
US6567503B2 (en) 1997-09-08 2003-05-20 Ultratec, Inc. Real-time transcription correction system
US6493426B2 (en) 1997-09-08 2002-12-10 Ultratec, Inc. Relay for personal interpreter
US6603835B2 (en) 1997-09-08 2003-08-05 Ultratec, Inc. System for text assisted telephony
US5974116A (en) 1998-07-02 1999-10-26 Ultratec, Inc. Personal interpreter
US7164753B2 (en) 1999-04-08 2007-01-16 Ultratec, Inc. Real-time transcription correction system
US6496798B1 (en) * 1999-09-30 2002-12-17 Motorola, Inc. Method and apparatus for encoding and decoding frames of voice model parameters into a low bit rate digital voice message
GB2382744B (en) 2000-09-19 2004-06-02 Ultratec Inc Relay for personal interpreter
US6882707B2 (en) 2001-02-21 2005-04-19 Ultratec, Inc. Method and apparatus for training a call assistant for relay re-voicing
US6504910B1 (en) 2001-06-07 2003-01-07 Robert Engelke Voice and text transmission system
US7881441B2 (en) 2005-06-29 2011-02-01 Ultratec, Inc. Device independent text captioned telephone service
US6885731B2 (en) 2002-07-29 2005-04-26 Robert M. Engelke Captioned telephone with emergency access feature
GB2435373B (en) 2004-02-18 2009-04-01 Ultratec Inc Captioned telephone service
US8515024B2 (en) 2010-01-13 2013-08-20 Ultratec, Inc. Captioned telephone service
US20090089042A1 (en) * 2007-01-03 2009-04-02 Samuel Joseph Wald System and method for interpreter selection and connection to communication devices
US8379801B2 (en) 2009-11-24 2013-02-19 Sorenson Communications, Inc. Methods and systems related to text caption error correction
US9350857B1 (en) 2014-12-16 2016-05-24 Ultratec, Inc. 911 call assistance for assisted device user

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170180558A1 (en) * 2015-12-22 2017-06-22 Hong Li Technologies for dynamic audio communication adjustment
US10142483B2 (en) * 2015-12-22 2018-11-27 Intel Corporation Technologies for dynamic audio communication adjustment
US10553235B2 (en) 2017-08-28 2020-02-04 Apple Inc. Transparent near-end user control over far-end speech enhancement processing
CN109005281A (en) * 2018-06-27 2018-12-14 努比亚技术有限公司 A kind of In Call adjusting method, mobile terminal and computer readable storage medium
CN110012258A (en) * 2019-03-29 2019-07-12 努比亚技术有限公司 Best audio-video perception point acquisition methods, system, wearable device and storage medium

Also Published As

Publication number Publication date
US9380150B1 (en) 2016-06-28
US10574804B2 (en) 2020-02-25

Similar Documents

Publication Publication Date Title
US10574804B2 (en) Automatic volume control of a voice signal provided to a captioning communication service
CN101370323B (en) Apparatus capable of performing acoustic echo cancellation and a method thereof
US9294851B2 (en) Hearing assistance devices with echo cancellation
US9426300B2 (en) Matching reverberation in teleconferencing environments
US20070033030A1 (en) Techniques for measurement, adaptation, and setup of an audio communication system
EP2815566B1 (en) Audio signal processing in a communication system
EP2059014A1 (en) Echo canceller and echo cancelling program
US10313509B2 (en) Updating filter coefficients during echo cancellation
JP2018046452A (en) Signal processing apparatus, program, method, and communications device
US20120140918A1 (en) System and method for echo reduction in audio and video telecommunications over a network
US20100074455A1 (en) Use of non-audible band to relay information for echo cancellation in a distributed media system
JP5923705B2 (en) Call signal processing device
US8737601B2 (en) Echo canceller
JP2006505218A (en) Technology to improve phone audio quality
Raghavendran Implementation of an acoustic echo canceller using matlab
US9191494B1 (en) Device, system, and method for performing echo cancellation in different modes of a communication device
JP6635211B1 (en) Echo canceller and IP telephone
US10771887B2 (en) Anisotropic background audio signal control
JP2001189795A (en) Communication equipment
Chrin et al. Performance of soft phones and advances in associated technology
JPH01256821A (en) Adaptive type echo canceller
JPH05227110A (en) Side tone suppressing device for sng
JP2988119B2 (en) Echo suppression device
Papp et al. Hands-free voice communication platform integrated with TV
Sakauchi et al. An acoustic echo canceler with noise and echo reduction

Legal Events

Date Code Title Description
AS Assignment

Owner name: CAPTIONCALL LLC, UTAH

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:BULLOUGH, JEFFREY CHARLES;ROYLANCE, SHANE A.;CHEVRIER, BRIAN;REEL/FRAME:039022/0061

Effective date: 20151104

AS Assignment

Owner name: SORENSON IP HOLDINGS, LLC, UTAH

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:CAPTIONCALL, LLC;REEL/FRAME:041020/0788

Effective date: 20170103

AS Assignment

Owner name: JPMORGAN CHASE BANK, N.A., ILLINOIS

Free format text: SECURITY INTEREST;ASSIGNORS:SORENSON IP HOLDINGS, LLC;CAPTIONCALL, LLC;REEL/FRAME:042229/0120

Effective date: 20170105

AS Assignment

Owner name: U.S. BANK NATIONAL ASSOCIATION, MINNESOTA

Free format text: SECURITY INTEREST;ASSIGNORS:SORENSON IP HOLDINGS, LLC;CAPTIONCALL, LLC;REEL/FRAME:042242/0001

Effective date: 20170105

STCT Information on status: administrative procedure adjustment

Free format text: PROSECUTION SUSPENDED

AS Assignment

Owner name: CREDIT SUISSE AG, CAYMAN ISLANDS BRANCH, AS COLLATERAL AGENT, NEW YORK

Free format text: PATENT SECURITY AGREEMENT;ASSIGNORS:SORENSEN COMMUNICATIONS, LLC;CAPTIONCALL, LLC;REEL/FRAME:050084/0793

Effective date: 20190429

AS Assignment

Owner name: INTERACTIVECARE, LLC, UTAH

Free format text: RELEASE BY SECURED PARTY;ASSIGNOR:JPMORGAN CHASE BANK, N.A.;REEL/FRAME:049109/0752

Effective date: 20190429

Owner name: CAPTIONCALL, LLC, UTAH

Free format text: RELEASE BY SECURED PARTY;ASSIGNOR:JPMORGAN CHASE BANK, N.A.;REEL/FRAME:049109/0752

Effective date: 20190429

Owner name: SORENSON IP HOLDINGS, LLC, UTAH

Free format text: RELEASE BY SECURED PARTY;ASSIGNOR:JPMORGAN CHASE BANK, N.A.;REEL/FRAME:049109/0752

Effective date: 20190429

Owner name: SORENSON COMMUNICATIONS, LLC, UTAH

Free format text: RELEASE BY SECURED PARTY;ASSIGNOR:JPMORGAN CHASE BANK, N.A.;REEL/FRAME:049109/0752

Effective date: 20190429

AS Assignment

Owner name: SORENSON COMMUNICATIONS, LLC, UTAH

Free format text: RELEASE BY SECURED PARTY;ASSIGNOR:U.S. BANK NATIONAL ASSOCIATION;REEL/FRAME:049115/0468

Effective date: 20190429

Owner name: CAPTIONCALL, LLC, UTAH

Free format text: RELEASE BY SECURED PARTY;ASSIGNOR:U.S. BANK NATIONAL ASSOCIATION;REEL/FRAME:049115/0468

Effective date: 20190429

Owner name: INTERACTIVECARE, LLC, UTAH

Free format text: RELEASE BY SECURED PARTY;ASSIGNOR:U.S. BANK NATIONAL ASSOCIATION;REEL/FRAME:049115/0468

Effective date: 20190429

Owner name: SORENSON IP HOLDINGS, LLC, UTAH

Free format text: RELEASE BY SECURED PARTY;ASSIGNOR:U.S. BANK NATIONAL ASSOCIATION;REEL/FRAME:049115/0468

Effective date: 20190429

STPP Information on status: patent application and granting procedure in general

Free format text: NON FINAL ACTION MAILED

STPP Information on status: patent application and granting procedure in general

Free format text: FINAL REJECTION MAILED

STPP Information on status: patent application and granting procedure in general

Free format text: NOTICE OF ALLOWANCE MAILED -- APPLICATION RECEIVED IN OFFICE OF PUBLICATIONS

STPP Information on status: patent application and granting procedure in general

Free format text: PUBLICATIONS -- ISSUE FEE PAYMENT RECEIVED

STPP Information on status: patent application and granting procedure in general

Free format text: PUBLICATIONS -- ISSUE FEE PAYMENT VERIFIED

STCF Information on status: patent grant

Free format text: PATENTED CASE

AS Assignment

Owner name: CORTLAND CAPITAL MARKET SERVICES LLC, ILLINOIS

Free format text: LIEN;ASSIGNORS:SORENSON COMMUNICATIONS, LLC;CAPTIONCALL, LLC;REEL/FRAME:051894/0665

Effective date: 20190429

AS Assignment

Owner name: CREDIT SUISSE AG, CAYMAN ISLANDS BRANCH, NEW YORK

Free format text: JOINDER NO. 1 TO THE FIRST LIEN PATENT SECURITY AGREEMENT;ASSIGNOR:SORENSON IP HOLDINGS, LLC;REEL/FRAME:056019/0204

Effective date: 20210331

AS Assignment

Owner name: CAPTIONCALL, LLC, UTAH

Free format text: RELEASE BY SECURED PARTY;ASSIGNOR:CORTLAND CAPITAL MARKET SERVICES LLC;REEL/FRAME:058533/0467

Effective date: 20211112

Owner name: SORENSON COMMUNICATIONS, LLC, UTAH

Free format text: RELEASE BY SECURED PARTY;ASSIGNOR:CORTLAND CAPITAL MARKET SERVICES LLC;REEL/FRAME:058533/0467

Effective date: 20211112

MAFP Maintenance fee payment

Free format text: PAYMENT OF MAINTENANCE FEE, 4TH YEAR, LARGE ENTITY (ORIGINAL EVENT CODE: M1551); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

Year of fee payment: 4