GB2579085A - Handling multiple audio input signals using a display device and speech-to-text conversion - Google Patents



Publication number
GB2579085A
Authority
GB
United Kingdom
Prior art keywords
hearing
audio signal
word
user
words
Prior art date
Legal status
Withdrawn
Application number
GB1818872.2A
Other versions
GB201818872D0 (en)
Inventor
Boretzki Michael
Feilner Manuela
Current Assignee
Sonova Holding AG
Original Assignee
Sonova AG
Priority date
Filing date
Publication date
Application filed by Sonova AG filed Critical Sonova AG
Priority to GB1818872.2A priority Critical patent/GB2579085A/en
Publication of GB201818872D0 publication Critical patent/GB201818872D0/en
Publication of GB2579085A publication Critical patent/GB2579085A/en
Withdrawn legal-status Critical Current

Classifications

    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L 15/00: Speech recognition
    • G10L 15/26: Speech to text systems
    • G: PHYSICS
    • G09: EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
    • G09B: EDUCATIONAL OR DEMONSTRATION APPLIANCES; APPLIANCES FOR TEACHING, OR COMMUNICATING WITH, THE BLIND, DEAF OR MUTE; MODELS; PLANETARIA; GLOBES; MAPS; DIAGRAMS
    • G09B 17/00: Teaching reading
    • G09B 17/04: Teaching reading for increasing the rate of reading; Reading rate control
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04R: LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R 2225/00: Details of deaf aids covered by H04R 25/00, not provided for in any of its subgroups
    • H04R 2225/43: Signal processing in hearing aids to enhance the speech intelligibility

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Multimedia (AREA)
  • Business, Economics & Management (AREA)
  • Educational Administration (AREA)
  • Educational Technology (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

A hearing system comprises a user-worn hearing device (eg hearing aid) that outputs a first audio signal, and a display device 14 that displays a word sequence translated from a second audio signal. Words 34 of the word sequence are displayed one above the other and aligned (36) vertically using an alignment position determined for each word of the sequence. One audio signal may be generated from a microphone of the hearing device, and the other may be received from an external source such as a telecommunication network or an external microphone. Some words of the translated word sequence may be omitted from the display if they are determined to have a translation accuracy lower than a threshold. The alignment position may be a letter of the respective word, or may be retrieved from a database. Eye-tracking may be used to display words only when the user is looking at the display device.

Description

DESCRIPTION
Handling multiple audio input signals using a display device and speech-to-text conversion
FIELD OF THE INVENTION
The invention relates to a method, a computer program and a computer-readable medium for operating a hearing system. Furthermore, the invention relates to a hearing system with a hearing device and a display device.
BACKGROUND OF THE INVENTION
Hearing devices are generally small and complex devices. A hearing device may include a processor, a microphone, a speaker, memory, a housing, and other electronic and mechanical components. Examples of hearing devices are Behind-The-Ear (BTE), Receiver-In-Canal (RIC), In-The-Ear (ITE), Completely-In-Canal (CIC), and Invisible-In-The-Canal (IIC) devices. A user may prefer one of these hearing devices over another based on hearing loss, aesthetic preferences, lifestyle needs, and budget.
A challenge for the hearing impaired is the presence of multiple audio sources, often with low speech quality, especially in situations where lip reading is not possible. A typical case is a telephone conversation where the other person is not visible. In these circumstances sound quality is often low due to the limited bandwidth of the telecommunications connection. Known solutions comprise, for instance, speech-to-text conversion techniques by which users may visually capture spoken words. The generated text information may be presented in different ways, mostly on electronic displays. For example, US 8903174 B2 relates to a serial text display for optimal recognition.
Reading effectiveness may be improved using methods like RSVP (Rapid Serial Visual Presentation). For example, US 2014189515 A1 describes a method and a system for displaying text using RSVP. In these scenarios, text is displayed sequentially word by word instead of presenting a whole sentence or a plurality of sentences at once.
US 8265931 B2 describes a method and a device for providing speech-to-text encoding and a related telephony service.
However, for users of hearing devices or hearing aid systems, handling multiple sources of audio signals often remains difficult in many day-to-day situations.
DESCRIPTION OF THE INVENTION
It is an objective of the invention to improve the speech reception and recognition capabilities of human beings in the presence of different audio signals.
This objective is achieved by the subject-matter of the independent claims. Further exemplary embodiments are evident from the dependent claims and the following description.
In a first aspect of the invention, a method for operating a hearing system is proposed.
A hearing system may be described as a set of components with functions and features aimed at supporting a human being's hearing capabilities. The hearing system comprises at least one hearing device to be worn in an ear or at/behind an ear of a user. The hearing system furthermore comprises a display device. This display device may be any type of device which may visually present text information, suitable to be read by a user. Examples are portable devices such as smartphones, tablet or laptop computers, wearable devices such as VR glasses, or heads-up displays as used in cars and similar devices. A display device may contain dedicated processing resources such as microcontrollers, microprocessors, RAM/ROM, cache memory, hard disks and communications equipment.
The method comprises a step of outputting a first audio signal with an output unit of the hearing device. Such an output unit may be a loudspeaker or a cochlear implant. In other words, the output unit provides the first audio signal to the user in a way that he may physiologically recognize the first audio signal as such. In a next step a second audio signal is translated into a word sequence. The term "translate" here denotes an encoding or transcoding process, where spoken words contained in the audio signal are recognized and/or detected and converted into alphanumeric or text information.
In a following step, the word sequence is displayed on the display device, wherein words of the word sequence are displayed one above the other on the display device. This may mean that, in contrast to the usual horizontal alignment of a sequence of words, the words are arranged vertically relative to each other. For example, only a limited number of words, for instance 4 to 6 words, is displayed on the display device. Due to the limited size of a usual screen of a wearable or portable display device, the number of words to be displayed may be limited. In an example, only one word is visible to the user and the words of the word sequence are displayed sequentially one after another.
When displaying the word sequence on the display device, an alignment position for each word of the word sequence may be determined. This alignment position may indicate which positions of the words have to be vertically aligned with respect to each other on the display device. In order to improve a recognition rate of a sequence of words, the display position and arrangement of words relative to each other may be optimized for faster recognition.
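The patent leaves open how the alignment position of a word is obtained (a specific letter, or a database lookup). As a minimal illustrative sketch, the heuristic below picks a letter roughly a third of the way into the word, similar in spirit to the "optimal recognition point" used by RSVP readers; the exact formula is an assumption, not taken from the patent:

```python
def alignment_position(word: str) -> int:
    """Return the index of the alignment letter of a word.

    Heuristic sketch (assumption, not specified by the patent): use a
    point roughly a third of the way into the word, as RSVP-style
    readers commonly do for their fixation letter.
    """
    if not word:
        raise ValueError("word must be non-empty")
    # 1-letter words -> index 0; 10-letter words -> index 3 (4th letter)
    return max(0, (len(word) + 2) // 3 - 1)
```

For a 10-letter word this heuristic yields index 3, i.e. the fourth letter, which matches the example given later in the description.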
For example, a microphone may capture an ambient audio signal, and an output signal, optionally after amplifying, filtering or processing, may be provided to the output unit of the hearing device. Typical audio signal sources may be ambient noise or human speech captured via microphones, but also telephones or external systems connected via T-Coil, WLAN, Bluetooth or similar technologies. Known solutions may also mix different audio signals, generating a combined audio output signal, and forward it to the user. However, especially when mixed signals are present, speech intelligibility often suffers and the recognition rate may decrease. This may be overcome by presenting one of the audio signals in text form as described above and below.
In a further example, an arrangement of a word of the word sequence on the display device is based on a fixed position of the alignment position of the displayed word relative to the display device. An advantage may be that a user's eye may rest and may remain focused on that alignment position of this particular word without moving focus or changing eye position during presentation of a plurality of words of the word sequence.
The described method may provide the advantage of presenting at least two different audio signal sources to a user through the application of different presentation methods, for instance audio signal plus text information. In other words, for better recognition, different human senses such as watching/reading and hearing are used. Hence, a clear differentiation between the different audio sources may be achieved and the hearing recognition capabilities of a user may be improved.
According to an embodiment of the invention, one of the first audio signal and the second audio signal is generated from a microphone of the hearing device and the other one of the first audio signal and the second audio signal is received in the hearing system from an external source. In other words, two separate independent audio sources may be processed and independently handled. In an example, a switching unit may be configured to switch between the at least two different audio sources or audio signals. An advantage may be seen in the ability to apply different presentation techniques to select a preferred or most suitable mode for the best recognition and hearing experience. If a user faces problems with understanding and recognizing speech as an audio signal, the user may switch to speech-to-text processing for better recognition.
According to an embodiment of the invention, the external source is a telecommunications network and the other one of the first audio signal and the second audio signal is a telephone call. Alternatively, according to this embodiment of the invention, the external source is an external microphone for acquiring the other audio signal and transmitting it to the hearing system. Furthermore, and alternatively, according to this embodiment of the invention, the external source is an external audio device adapted for outputting the other audio signal with a loudspeaker and for transmitting the other audio signal to the hearing system. For instance, this external audio device may be a portable entertainment device, capable of transmitting music via a wireless connection to the hearing device.
According to an embodiment of the invention, the first audio signal and the second audio signal are mixed before output by the output unit of the hearing device. In other words, the user may receive a combined audio signal containing both the first and second audio signal. In one example, this may be a mix of speech and music.
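The mixing step can be sketched as a per-sample weighted sum. The weighting parameter and the plain-list representation of the sample streams are illustrative assumptions; a real hearing device would perform this in its DSP:

```python
def mix_signals(first, second, weight=0.5):
    """Mix two equally long audio sample sequences into one output signal.

    A minimal sketch of combining e.g. speech and music before output:
    a weighted sum per sample. `weight` applies to the first signal,
    (1 - weight) to the second; the default is an assumption.
    """
    if len(first) != len(second):
        raise ValueError("signals must have equal length")
    return [weight * a + (1.0 - weight) * b for a, b in zip(first, second)]
```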
According to an embodiment of the invention, only some of the words of the word sequence are displayed and other words of the word sequence are omitted. A translation accuracy is determined for each word, wherein the translation accuracy indicates the probability of a correct translation of that word. Only words with a translation accuracy higher than an accuracy threshold may be displayed. In other words, if the translation accuracy is below or equal to that accuracy threshold, the corresponding word is omitted and is not displayed. An advantage may be that misinterpretations and irritations of users caused by translation errors are minimized.
According to an embodiment of the invention, the accuracy threshold is changeable by the user with a user interface of the hearing system. For example, if a user finds that too many words are omitted and it becomes difficult to capture the meaning of a sentence or a sequence of words, the threshold may be lowered in order to display more words.
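A sketch of this omission logic, with per-word confidences as a speech-to-text engine might report them; the default threshold value is an assumption and would be adjustable through the user interface:

```python
def words_to_display(words, threshold=0.8):
    """Filter a translated word sequence by per-word translation accuracy.

    `words` is a list of (word, accuracy) pairs, where accuracy is the
    speech-to-text engine's confidence in [0, 1]. Only words strictly
    above the (user-adjustable) threshold are kept for display; the
    rest are omitted. The default threshold is an assumption.
    """
    return [w for w, acc in words if acc > threshold]
```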
According to an embodiment of the invention, the alignment position of a word is a letter of the word. For example, an alignment position of a word may be the fourth letter of a total of 10 letters of that word. A display controller of the display device may use that specific letter as a basis for arranging the entire word on the display of the display device.
According to an embodiment of the invention, the alignment position of a word is determined or retrieved from a database storing words and associated alignment positions. The determination of suitable alignment positions within a word may depend on experience, empirical data, mathematical methods or physiological factors.
According to an embodiment of the invention, for each word a display duration is determined and the word is displayed on the display for its display duration. The underlying consideration is that, particularly for less frequently used or more complex words, the time for a user to safely recognize a word is longer than for a simple, short and/or very frequently used word. Therefore, it may be desirable to extend the duration of a word being displayed on a display in order to allow more time for a user to capture the word visually and its meaning.
According to an example, the display duration is determined from a database. Such a database may contain one or more display duration values, for example in seconds or milliseconds, associated with specific words of a stored dictionary. In an example, a display duration value may be determined from statistics on how often the word occurs in a particular language. According to another example, a display duration may be increased when a word is displayed for the first time. In turn, the word may be displayed correspondingly shorter if it has already been shown several times within a particular timeframe.
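A hedged sketch of such a duration rule, combining word length, a hypothetical frequency rank, and a repetition discount; all constants are illustrative and not taken from the patent:

```python
def display_duration_ms(word, frequency_rank=None, base_ms=250, seen_count=0):
    """Estimate how long to show a word, in milliseconds (sketch).

    Assumptions: longer words get more time; a high `frequency_rank`
    (i.e. a rarer word) adds extra time; words already shown recently
    (`seen_count`) get progressively less, down to a floor.
    """
    duration = base_ms + 30 * max(0, len(word) - 4)   # longer word: more time
    if frequency_rank is not None and frequency_rank > 5000:
        duration += 100                                # rare word: extra time
    duration -= 25 * min(seen_count, 4)                # repeated word: less time
    return max(100, duration)                          # never below the floor
```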
According to an embodiment of the invention, a video stream of at least one eye of the user is evaluated to determine whether the user's eye is looking towards or onto the display device, and words of the word sequence are displayed only when the user is looking at the display device. The thought behind this feature is the consideration of a user's attention when displaying text information on the display device. Concretely, in situations where a user is distracted, his eyes move away from the display device and away from the displayed words. In such cases it may be desirable to interrupt or pause a display cycle, to avoid losing information before the user focuses his eyes and attention back on the display device. In an example, an eye tracking device may incorporate a camera or other suitable monitoring technology which observes and tracks the pupils of the user and is configured to detect eye and/or pupil constellations which represent a focused state of a user's eye on the display device. In one example, VR glasses may combine such an eye-tracking unit and a display device.
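The pause-on-distraction behaviour can be sketched as pacing the word stream against per-tick gaze observations. The boolean gaze interface is a hypothetical stand-in for an eye-tracking unit; the patent leaves the tracker API open:

```python
def present_words(word_sequence, gaze_samples):
    """Pace a word sequence against a stream of gaze observations.

    `gaze_samples` is an iterable of booleans, one per display tick,
    True when eye tracking sees the user's eyes on the display
    (hypothetical interface). A word is consumed only on ticks where
    the user is looking, so the sequence pauses rather than dropping
    words while the user is distracted.
    """
    words = iter(word_sequence)
    shown = []
    current = next(words, None)
    for looking in gaze_samples:
        if current is None:
            break                      # nothing left to present
        if looking:
            shown.append(current)      # word displayed for this tick
            current = next(words, None)
        # not looking: hold the current word back for the next tick
    return shown
```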
According to an embodiment of the invention, the at least one hearing device is a hearing aid for compensating a hearing loss of the user and the first audio signal is processed by the hearing device to compensate a hearing loss of the user.
Further aspects of the invention relate to a computer program for operating a hearing system, which, when being executed by a processor, is adapted to carry out the steps of the method as described above and in the following, as well as to a computer-readable medium in which such a computer program is stored.
For example, the computer program may be executed in a processor of a hearing device, which hearing device, for example, may be carried by the person behind the ear.
The computer-readable medium may be a memory of this hearing device. The computer program also may be executed by a processor of the display device and the computer-readable medium may be a memory of the display device. It also may be that some steps of the method are performed by the hearing device and other steps of the method are performed by the display device or other components of the described hearing system.
According to an example, the execution of the steps of the method may also be executed by a cloud-based processing system which interacts with the hearing system.
In general, a computer-readable medium may be a floppy disk, a hard disk, a USB (Universal Serial Bus) storage device, a RAM (Random Access Memory), a ROM (Read Only Memory), an EPROM (Erasable Programmable Read Only Memory) or a FLASH memory. A computer-readable medium may also be a data communication network, e.g. the Internet, which allows downloading a program code. The computer-readable medium may be a non-transitory or transitory medium.
It has to be understood that features of the method as described above and in the following may be features of the computer program, the computer-readable medium and the hearing system as described above and in the following, and vice versa.
These and other aspects of the invention will be apparent from and elucidated with reference to the embodiments described hereinafter.
BRIEF DESCRIPTION OF THE DRAWINGS
Below, embodiments of the present invention are described in more detail with reference to the attached drawings.
Fig. 1 schematically shows a hearing system with a hearing device, a display device and further external components, according to an embodiment of the invention.
Fig. 2 schematically shows a hearing device with a processor, a microphone, and an output unit, according to an embodiment of the invention.
Fig. 3 shows a simplified example of a display device, according to an embodiment of the invention, comprising a display for presenting a word sequence.
Fig. 4 shows a flow diagram of steps of the method for operating a hearing system, according to an embodiment of the invention.
The reference symbols used in the drawings, and their meanings, are listed in summary form in the list of reference signs. In principle, identical parts are provided with the same reference symbols in the figures.
DETAILED DESCRIPTION OF EXEMPLARY EMBODIMENTS
Fig. 1 shows a schematically simplified example of a hearing system 10 in an application scenario. A hearing device, which is worn by a hearing-impaired user in and/or behind the ear, may present an amplified or electronically processed audio signal to the user. For further details, reference is made to Fig. 2, which describes a typical structure of such a hearing device 12. The hearing system 10 further comprises a display device 14, which may be implemented as a tablet computer or smartphone, but also as wearable glasses or a heads-up display. Also possible are bigger display devices 14 which may be installed stationary. In a typical case, such a display device 14 is a portable device of limited weight and size to be carried by a user.
The hearing system 10 may further provide a link or connection into a data network 16. This data network 16 may be any type of telecommunications, voice, data, mobile (e.g. 3G/4G/5G) or similar type of network. In an example, the data network 16 is a telecommunications network providing a voice or audio connection and an associated audio signal. Such audio signal may be processed as one of a plurality of audio input signals and, for instance, coupled to the hearing device 12.
The hearing system 10, as shown in Fig. 1, shall demonstrate examples of different sources of audio signals and different possible options to link different input audio signals to different output formats. In a typical day-to-day situation, a hearing-impaired user is exposed to a variety of different audio sources, and it may be desirable not to limit the possible audio sources to only one source which is captured and received by the user. For example, another typical external audio source may be a TV set 18, which also provides an audio signal. As a further example, an external microphone 20 shall symbolize alternative ambient audio sources. A typical example may be counters in an office, where users may connect their hearing device 12 to a local, often stationarily installed, microphone 20 to better understand an officer's speech, even when that person is located at a greater distance.
The connections shown as dotted lines in Fig. 1 refer to different kinds of audio signal forwarding and data connections. Examples of connections may be wired connections, in particular within the hearing device 12 or within the display device 14, but also wireless connections using Bluetooth, Wireless LAN, LTE/4G/5G or similar technologies. In one example, the hearing device may furthermore be connected to one or more databases, which databases may provide information about the word sequence, translation and encoding functions, display duration and further retrievable data.
In Fig. 2, a simplified example of a hearing device 12 is shown. The hearing device 12 is configured to provide a hearing aid for compensating a hearing loss of the user. Typically, the hearing device 12 is adapted to capture an audio signal and process it in a specific way so that a hearing-impaired user may better recognize speech and ambient noise. In many cases the frequency spectrum and the related amplifying and filtering are adapted depending on the individual hearing capabilities of the user. In most cases a hearing device 12 comprises a microphone 22, which may be integrated within a housing of the hearing device 12, capturing ambient noise and speech and providing an audio signal to subsequent units of the hearing device 12. The microphone 22 may also be remotely connected or externally attached.
Typically, the hearing device 12 comprises a processor 24, which may be implemented as a microprocessor, a microcontroller or a similar device. Optionally the processor 24 further comprises, if necessary, memory such as RAM, cache memory, SSD, or similar, as well as interfaces and other necessary data processing components. The processor 24 may be configured to process audio signals digitally using DSP functionalities. This may include frequency filtering, amplifying, noise cancellation, mixing and other typical tasks. An interface (not shown) may provide users with possibilities to change and adapt settings of the hearing device 12 or other components of the hearing system 10. In one example, the processor and the associated peripheral components, such as memory, storage, power supply and others, may be located in a remote data center or may be part of a cloud-based service. The person skilled in the art is aware of means to provide the necessary data communications connections, e.g. via a 4G/5G mobile data network using real-time or near real-time data communications.
Furthermore, with regard to Fig. 2, a sender/receiver 26 provides functions to connect the hearing device 12 to other components of the hearing system 10. For instance, the sender/receiver 26 may be a Bluetooth interface which provides possibilities to receive audio signals from external devices such as TV sets 18 or external microphones 20.
According to an example, the sender/receiver 26 may provide one of a first audio signal and a second audio signal, which is processed within the hearing device 12. The other one of the first audio signal and the second audio signal is provided by the microphone 22 of the hearing device 12. In other words, the hearing device may be configured to handle multiple audio sources and may process them by using the processor 24. In one example, the audio signal from the microphone 20 and the audio signal received from an external device through the sender/receiver 26 are mixed into a common output audio signal using DSP functions of the processor 24.
In one example, the sender/receiver 26 is configured to provide an output signal to an external component. According to an example, the sender/receiver 26 is connected to a display device 14, wherein the display device 14 performs a speech-to-text encoding and presents the text to the user. In another example, the sender/receiver 26 is connected to a telecommunication network 16 and exchanges an audio signal as part of a telephone call. The hearing device 12 also comprises an output unit 28, which may be a loudspeaker, an earphone, any type of audio output unit, or even a cochlear implant. In an example, the output unit 28 may also be configured to present a signal or information visually, haptically or as an electric/electromagnetic output signal.
In Fig. 3 an example of a display device 14 is shown. The display device 14 comprises a display 30, which may be implemented as an electronic display using LCD, LED, OLED or similar technologies. Such a display 30 may also be part of a VR system or any projection system. The display device 14 comprises a processor 32 which may be adapted to perform speech-to-text conversion. This means that an incoming audio signal contains spoken words in a particular language and the processor 32 provides the functionalities and the necessary hardware and software components to perform a transcription or encoding of the words contained in the audio signal into a text format.
The display 30 is configured to present a sequence of discrete words 34 one after another. In this example, one single word only is shown per line on the display 30. According to the current example, a limited number of words 34 (here 4 lines) of a word sequence are visible at the same time, wherein a new word 34 appears in a first line of the display 30. Once a new word appears in the first line of the display 30, the words displayed before are moved down one line accordingly, wherein the previously shown last line and the corresponding word 34 are not displayed anymore.
Seen from a time perspective, firstly, a new word 34 appears at point t1. After a few milliseconds, depending on the selected speed, at point t2, a next word appears in the first line of the display 30 and the word 34 moves down one line. Another few milliseconds later, at t3, said word 34 moves down another line, and so forth. An advantage may be that even small displays, as often used in lightweight portable devices such as smartphones, watches, wearables, glasses, tablet computers and the like, may be used for this purpose. At the same time the user may get a reference in the flowing word sequence to previous words and next words, which may improve readability and intelligibility.
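The scrolling behaviour of Fig. 3 (a new word enters the first line, older words shift down, the oldest drops off) can be sketched with a bounded buffer. The four-line size follows the example; the class name is illustrative:

```python
from collections import deque


class WordDisplay:
    """Sketch of the four-line display of Fig. 3.

    A new word appears in the first (top) line; older words shift
    down one line per tick, and the word leaving the last line is
    not displayed anymore.
    """

    def __init__(self, lines=4):
        # index 0 = first (top) line; maxlen discards the oldest word
        self.buffer = deque(maxlen=lines)

    def push(self, word):
        self.buffer.appendleft(word)   # new word enters line 1
        return list(self.buffer)       # current top-to-bottom contents
```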
In one example, the display 30 of the display device 14 is configured to present only one single line of text or one single word 34 at a time. In this case, even very small displays may be used to present text information to a user.
In order to improve readability and speed of reading, the word 34 is spatially arranged on the display 30 such that the words are vertically aligned with respect to each other. For this purpose, a horizontal reference line 36 is provided. This artificial arrangement grid allows the processor 32 of the display device 14 to determine a specific horizontal position of the word 34 relative to the display 30. For this purpose, for every word an alignment position is determined, which may be a specific letter of that word. The processor 32 of the display device 14 then arranges the word 34 on the display 30 such that the alignment position of that word (for instance a specific letter) is located at the horizontal position of the horizontal reference line 36. The horizontal reference line 36 has a fixed position relative to the display 30 and/or to the display device 14. An advantage may be that a user's eye focus may rest on the display at the same horizontal position and does not need to change or move every time a new word 34 appears.
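Assuming a monospaced font, placing a word so that its alignment letter sits on the fixed reference line 36 reduces to a simple offset computation; the pixel parameters below are illustrative assumptions:

```python
def word_x_offset(word, align_index, char_width_px, reference_x_px):
    """Horizontal pixel position of a word's first letter (sketch).

    Places the word so that the letter at `align_index` is centred
    on the fixed reference line at `reference_x_px`. Assumes a
    monospaced font of `char_width_px` per character; a proportional
    font would need per-glyph widths instead.
    """
    # shift left by the letters before the alignment letter, plus
    # half a character so the letter itself is centred on the line
    return reference_x_px - align_index * char_width_px - char_width_px // 2
```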
The alignment position of a word 34 is determined based on an optimal physiological capturing of that word by a user. The alignment position can, according to an example, be retrieved from an external database. According to an example, the processor 32 is configured to determine a translation accuracy for each word in the speech-to-text conversion. Only those words with a translation accuracy higher than an accuracy threshold are displayed on the display 30. In other words, some words of the word sequence are omitted and not displayed. This may improve the quality of the speech-to-text conversion as well as the efficiency of the text presentation. In one example, the threshold of the translation accuracy is changeable and adjustable by a user, for instance through a user interface.
According to one example, the time between two steps, for instance between point t1 and point t2, may be seen as display duration. In other words, a single word 34 may be presented and shown differently long or short in terms of time. If rare or complex words are presented, the user may need more time to capture and recognize that word.
Accordingly, it may be desirable to extend the presentation time for such words.
The display device 14 further comprises a camera device 38. This camera device 38 may be used to track a user's eye and/or eye movement to determine whether the user focuses on the display or is distracted and looking somewhere else. In case the user does not focus his eyes on the display, the processor 32 may stop the presentation cycle until the user has returned his attention to the display 30. An advantage may be that the user does not miss any text information when he is temporarily distracted. Additionally, this information may be provided as feedback to the user. The display device 14 comprises a sender/receiver 40 to communicate with other components of the hearing system 10.
Referring to Fig. 4, an example of a method for operating a hearing system is described.
The hearing system comprises at least one hearing device 12 (see Fig. 1 and Fig. 2) to be worn in and/or behind an ear of a user as well as a display device. The method comprises the first step of outputting 110 a first audio signal with an output unit of the hearing device. This may mean a controlling of a specific output unit such as a loudspeaker, a headphone speaker or a similar audio output unit which may output an audio signal for reception by a user. In a next step 120 a second audio signal is translated into a word sequence. In other words, at least one further audio source is present and is presented differently, in this case as text information. For this purpose, this second audio signal is translated or encoded into text information.
In step 130, an alignment position for each word of the generated word sequence is determined. The purpose is to achieve an optimal arrangement of that word on a display in a subsequent step. For this purpose, the alignment position for each word is needed, wherein that alignment position may be retrieved from a database, or may alternatively be calculated or determined.
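A determination of the alignment position with a database lookup and a calculated fallback can be sketched as follows. The database contents and the length-based heuristic (a letter roughly a third of the way into the word) are assumptions for illustration, not values from the disclosure.

```python
# Hypothetical database of stored alignment positions (letter index per word).
ALIGNMENT_DB = {"hearing": 2, "device": 1}

def alignment_position(word, db=ALIGNMENT_DB):
    """Return the index of the letter to align on: from the database if the word
    is known, otherwise from a simple heuristic about a third into the word."""
    return db.get(word, max(0, (len(word) - 1) // 3))
```

Words present in the database use their stored position; all other words fall back to the calculated one, so every word of the sequence receives an alignment position.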
In step 140, the words of the word sequence are displayed one above the other on the display device. In other words, every single word may be, for example, presented in one single line of the display, and several lines are vertically aligned on that display device.
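The vertical alignment can be sketched by padding each line so that each word's alignment letter lands in the same column. The column index and the example words and positions are illustrative.

```python
def render_column(words_with_pos, column=8):
    """Left-pad each word so its alignment letter sits in the same column."""
    return [" " * (column - pos) + word for word, pos in words_with_pos]

# One word per line; the letter at each word's alignment position ends up
# in column 8, so the user's eye need not move horizontally between words.
lines = render_column([("hearing", 2), ("system", 1), ("displays", 3)])
```

Printed one per line, the padded strings place "a" of "hearing", "y" of "system" and "p" of "displays" in a single vertical column, matching the one-above-the-other presentation of step 140.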
While the invention has been illustrated and described in detail in the drawings and foregoing description, such illustration and description are to be considered illustrative or exemplary and not restrictive; the invention is not limited to the disclosed embodiments. Other variations to the disclosed embodiments may be understood and effected by those skilled in the art in practicing the claimed invention, from a study of the drawings, the disclosure, and the appended claims. In the claims, the word "comprising" does not exclude other elements or steps, and the indefinite article "a" or "an" does not exclude a plurality. A single processor or controller or other unit may fulfill the functions of several items recited in the claims. The mere fact that certain measures are recited in mutually different dependent claims does not indicate that a combination of these measures cannot be used to advantage. Any reference signs in the claims should not be construed as limiting the scope.
LIST OF REFERENCE SYMBOLS
10 hearing system
12 hearing device
14 display device
16 data network
18 TV set
20 external microphone
22 microphone (hearing device)
24 processor (hearing device)
26 sender/receiver
28 output unit
30 display
32 processor (display device)
34 word (single)
36 horizontal reference line
38 camera device
40 sender/receiver (display device)
100 method
110 outputting
120 translating
130 determining
140 displaying

Claims (14)

  1. A method (100) for operating a hearing system (10), the hearing system (10) comprising at least one hearing device (12) to be worn in and/or behind an ear of a user and a display device (14), the method (100) comprising: outputting (110) a first audio signal with an output unit (28) of the hearing device (12); translating (120) a second audio signal into a word sequence; displaying (140) the word sequence on the display device; wherein words (34) of the word sequence are displayed one above the other on the display device (14); wherein for each word (34) of the word sequence an alignment position is determined (130), the alignment position indicating which positions of the words (34) are vertically aligned with respect to each other on the display device (14).
  2. The method (100) of claim 1, wherein one of the first audio signal and the second audio signal is generated from a microphone (22) of the hearing device (12); wherein the other one of the first audio signal and the second audio signal is received in the hearing system (10) from an external source.
  3. The method (100) of claim 2, wherein the external source is a telecommunication network (16) and the other one audio signal is a telephone call; or wherein the external source is an external microphone (20) for acquiring the other one audio signal and transmitting the other one audio signal to the hearing system (10); or wherein the external source is an external audio device (18) adapted for outputting the other one audio signal with a loudspeaker and for transmitting the other one audio signal to the hearing system (10).
  4. The method (100) of one of the previous claims, wherein the first audio signal and the second audio signal are mixed before output by the output unit (28) of the hearing device (12).
  5. The method (100) of one of the previous claims, wherein only some of the words (34) of the word sequence are displayed and other words (34) of the word sequence are omitted; wherein a translation accuracy is determined for each word (34), the translation accuracy indicating a probability of a correct translation of the word (34); wherein only words with a translation accuracy higher than an accuracy threshold are displayed.
  6. The method (100) of claim 5, wherein the accuracy threshold is changeable by the user with a user interface of the hearing system (10).
  7. The method (100) of one of the previous claims, wherein the alignment position of a word (34) is a letter of the word.
  8. The method (100) of one of the previous claims, wherein the alignment positions are determined from a database storing words (34) and associated alignment positions.
  9. The method (100) of one of the previous claims, wherein for each word (34) a display duration is determined and the word is displayed on the display (30) for its display duration.
  10. The method (100) of one of the previous claims, wherein a video stream of at least one eye of the user is evaluated as to whether the user is looking at the display device (14), and the display of words (34) of the word sequence is performed only when the user is looking at the display device (14).
  11. The method (100) of one of the previous claims, wherein the at least one hearing device (12) is a hearing aid for compensating a hearing loss of the user; wherein the first audio signal is processed by the hearing device (12) to compensate the hearing loss of the user.
  12. A computer program for operating a hearing system (10), which, when being executed by at least one processor, is adapted to carry out the steps of the method (100) of one of the previous claims.
  13. A computer-readable medium, in which a computer program according to claim 12 is stored.
  14. A hearing system (10) comprising at least one hearing device (12) and a display device (14), wherein the hearing system (10) is adapted for performing the method (100) of one of claims 1 to 11.
GB1818872.2A 2018-11-20 2018-11-20 Handling multiple audio input signals using a display device and speech-to-text conversion Withdrawn GB2579085A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
GB1818872.2A GB2579085A (en) 2018-11-20 2018-11-20 Handling multiple audio input signals using a display device and speech-to-text conversion


Publications (2)

Publication Number Publication Date
GB201818872D0 GB201818872D0 (en) 2019-01-02
GB2579085A true GB2579085A (en) 2020-06-10

Family

ID=64740130


Country Status (1)

Country Link
GB (1) GB2579085A (en)


Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH0728790A (en) * 1993-07-09 1995-01-31 Canon Inc Method and device for document processing
US20070202474A1 (en) * 2006-02-24 2007-08-30 Justin William Miller Reader interfaced text orchestration
JP2008040773A (en) * 2006-08-07 2008-02-21 Fuji Xerox Co Ltd Image output device and image output program
US20090076825A1 (en) * 2007-09-13 2009-03-19 Bionica Corporation Method of enhancing sound for hearing impaired individuals
US20150036856A1 (en) * 2013-07-31 2015-02-05 Starkey Laboratories, Inc. Integration of hearing aids with smart glasses to improve intelligibility in noise
US20150149169A1 (en) * 2013-11-27 2015-05-28 At&T Intellectual Property I, L.P. Method and apparatus for providing mobile multimodal speech hearing aid
WO2016204995A1 (en) * 2015-06-17 2016-12-22 Microsoft Technology Licensing, Llc Serial text presentation
US20170188173A1 (en) * 2015-12-23 2017-06-29 Ecole Polytechnique Federale De Lausanne (Epfl) Method and apparatus for presenting to a user of a wearable apparatus additional information related to an audio scene
US20180270350A1 (en) * 2014-02-28 2018-09-20 Ultratec, Inc. Semiautomated relay method and apparatus


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
QuickEye, October 2010. "Speed Reading Software: Quickeye Software Features". Available at https://www.youtube.com/watch?v=nu0egaZvkM4 [Accessed 16 May 2019] *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2022165317A1 (en) * 2021-01-29 2022-08-04 Quid Pro Consulting, LLC Systems and methods for improving functional hearing
US11581008B2 (en) 2021-01-29 2023-02-14 Quid Pro Consulting, LLC Systems and methods for improving functional hearing



Legal Events

Date Code Title Description
WAP Application withdrawn, taken to be withdrawn or refused ** after publication under section 16(1)