EP3624465A1 - Hörgerätesteuerung mit semantischem Inhalt (Hearing device control with semantic content) - Google Patents
- Publication number
- EP3624465A1 (application EP18193769.9A)
- Authority
- EP
- European Patent Office
- Prior art keywords
- sound signal
- sound
- directional
- hearing device
- user
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04R—LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
- H04R25/00—Deaf-aid sets, i.e. electro-acoustic or electro-mechanical hearing aids; Electric tinnitus maskers providing an auditory perception
- H04R25/43—Electronic input selection or mixing based on input signal analysis, e.g. mixing or selection between microphone and telecoil or between microphones with different directivity characteristics
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0208—Noise filtering
- G10L2021/02087—Noise filtering the noise being separate speech, e.g. cocktail party
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0316—Speech enhancement, e.g. noise reduction or echo cancellation by changing the amplitude
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04R—LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
- H04R25/00—Deaf-aid sets, i.e. electro-acoustic or electro-mechanical hearing aids; Electric tinnitus maskers providing an auditory perception
- H04R25/40—Arrangements for obtaining a desired directivity characteristic
- H04R25/407—Circuits for combining signals of a plurality of transducers
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04R—LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
- H04R25/00—Deaf-aid sets, i.e. electro-acoustic or electro-mechanical hearing aids; Electric tinnitus maskers providing an auditory perception
- H04R25/55—Deaf-aid sets, i.e. electro-acoustic or electro-mechanical hearing aids; Electric tinnitus maskers providing an auditory perception using an external connection, either wireless or wired
- H04R25/558—Remote control, e.g. of amplification, frequency
Definitions
- the invention relates to a method, a computer program and a computer-readable medium for directionally amplifying a sound signal of a hearing device. Furthermore, the invention relates to a hearing system.
- Hearing devices are generally small and complex devices. Hearing devices can include a processor, microphone, speaker, memory, housing, and other electronic and mechanical components. Some example hearing devices are Behind-The-Ear (BTE), Receiver-In-Canal (RIC), In-The-Ear (ITE), Completely-In-Canal (CIC), and Invisible-In-The-Canal (IIC) devices. Some hearing devices may compensate a hearing loss of a user.
- In such complex listening situations, hearing devices typically steer the directivity either to the front (narrow or broad) or to a sector where the sound sources are dominant.
- US 6 157 727 A shows a hearing aid interconnected with a translation system.
- a first aspect of the present disclosure relates to a method for directionally amplifying a sound signal of a hearing device.
- a hearing device may be a device worn by a user, for example in the ear or behind the ear.
- a hearing device may be a hearing aid adapted for compensating a hearing loss of the user.
- the method comprises: receiving the sound signal from a microphone of the hearing device.
- the sound signal may be a digital signal.
- the sound signal may be composed of data packets encoding volume and/or frequencies of the sound signal over time.
- the sound signal may comprise sound data from more than one microphone of the hearing aid.
- the method comprises: extracting directional sound signals and optionally a user voice signal from the sound signal.
- Such an extraction may be performed with spatial sound filters of the hearing device, which can extract sound from a specific direction from a sound signal.
- Such sound filters may comprise beam formers, which are adapted for amplifying sound from a specific direction based on sound data from several microphones.
- a directional sound signal may be associated with a direction and/or a position, for example a position of a sound source and/or speaker contributing to the directional sound signal.
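As a rough illustration of such a spatial filter, the following Python sketch implements a minimal delay-and-sum beamformer for a small microphone array. The microphone spacing, sampling rate, function names and the synthetic test signal are illustrative assumptions and do not describe the actual beam formers of the hearing device.

```python
import numpy as np

def steering_delays(mic_positions, direction_deg, c=343.0):
    """Propagation delay (seconds) of a plane wave from direction_deg at each
    microphone, relative to the microphone the wavefront reaches first."""
    theta = np.deg2rad(direction_deg)
    unit = np.array([np.cos(theta), np.sin(theta)])    # unit vector towards the source
    projection = mic_positions @ unit                  # metres along the look direction
    return (projection.max() - projection) / c         # 0 for the closest microphone

def delay_and_sum(mic_signals, mic_positions, direction_deg, fs=16000):
    """Minimal delay-and-sum beamformer: advance every channel by its own
    propagation delay so that sound from the look direction adds coherently."""
    delays = steering_delays(mic_positions, direction_deg)
    n_mics, n_samples = mic_signals.shape
    out = np.zeros(n_samples)
    for m in range(n_mics):
        shift = int(round(delays[m] * fs))             # integer-sample approximation
        out[:n_samples - shift] += mic_signals[m, shift:]
    return out / n_mics

# Synthetic example: two microphones 12 mm apart, a 1 kHz tone arriving from 40 degrees.
fs = 16000
t = np.arange(0, 0.05, 1 / fs)
positions = np.array([[0.0, 0.0], [0.012, 0.0]])
delays = steering_delays(positions, 40.0)
signals = np.stack([np.sin(2 * np.pi * 1000.0 * (t - d)) for d in delays])
focused = delay_and_sum(signals, positions, direction_deg=40.0, fs=fs)
```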
- the method comprises: determining a word sequence from each directional sound signal and optionally the user voice signal. This may be performed with automatic speech recognition.
- the hearing aid and/or an evaluation system may comprise an automatic speech recognition module, which translates the respective sound signal into a word sequence.
- a word sequence may be encoded as character string.
- the method comprises: determining a semantic representation from each word sequence.
- a semantic representation may contain information about the semantic content of the word sequence.
- a semantic content may refer to a specific situation, which is talked about in a conversation, and/or a topic of a conversation, such as weather, holidays, politics, job, etc.
- a semantic representation may encode the semantic content with one or more values and/or with one or more words representing the situation/topic.
- a semantic representation may contain semantic weights for the semantic content of the word sequence. These weights may be determined with automated natural language understanding.
- the hearing aid and/or the evaluation system may comprise a natural language understanding module, which translates the word sequence into a semantic representation.
- a semantic weight may be a value that indicates the probability of a specific semantic content.
- a semantic representation may be a vector of weights, for example output by the natural language understanding module.
- the natural language understanding module may be or may comprise a machine learning module, which identifies words belonging to the same conversation situation, such as weather, holiday, work, etc.
- the semantic representation may contain a count of specific words in the word sequence.
- the relative count of words also may be seen as a semantic weight for the words.
- a semantic representation of a word sequence also may contain the substantives (nouns) or specific substantives extracted from the associated word sequence.
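A minimal sketch of how such a semantic representation could be derived from keyword counts is shown below. The topic names, keyword lists and function name are illustrative assumptions; the natural language understanding module of the disclosure would typically be a trained model rather than hand-written keyword lists.

```python
from collections import Counter

# Illustrative topic keyword lists (assumptions for this sketch only).
TOPICS = {
    "weather":  {"rain", "sunny", "cloudy", "forecast", "snow"},
    "holidays": {"beach", "flight", "hotel", "vacation", "trip"},
    "work":     {"meeting", "project", "deadline", "office", "boss"},
}

def semantic_representation(word_sequence: str) -> dict:
    """Return a normalized weight per topic, i.e. a rough probability that the
    word sequence belongs to that conversation topic."""
    words = Counter(w.strip(".,!?").lower() for w in word_sequence.split())
    counts = {topic: sum(words[w] for w in keywords)
              for topic, keywords in TOPICS.items()}
    total = sum(counts.values()) or 1
    return {topic: c / total for topic, c in counts.items()}

print(semantic_representation("The forecast says rain, so we moved the meeting."))
# -> weather gets the largest weight, work a smaller one, holidays zero.
```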
- the method comprises: identifying conversations from the semantic representations, wherein each conversation is associated with one or more directional sound signals and wherein each conversation is identified by clustering semantic representations.
- the hearing aid and/or the evaluation system may comprise a clustering module, which identifies clusters in the semantic representations.
- the semantic representations may be clustered by their semantic weights. Distances of semantic representations in a vector space of weights may be compared and semantic representations with a distance smaller than a threshold may be clustered.
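The following sketch illustrates this kind of distance-based clustering of semantic weight vectors; the threshold value and the single-linkage style merging are illustrative assumptions.

```python
import numpy as np

def cluster_by_distance(representations, threshold=0.5):
    """Group semantic weight vectors whose Euclidean distance is below the
    threshold into the same conversation (single-linkage style sketch)."""
    n = len(representations)
    labels = list(range(n))                       # each source starts in its own cluster
    for i in range(n):
        for j in range(i + 1, n):
            if np.linalg.norm(representations[i] - representations[j]) < threshold:
                old, new = labels[j], labels[i]
                labels = [new if lab == old else lab for lab in labels]   # merge clusters
    return labels

# Three sound sources: the first two talk about similar content.
reps = [np.array([0.8, 0.1, 0.1]),
        np.array([0.7, 0.2, 0.1]),
        np.array([0.1, 0.1, 0.8])]
print(cluster_by_distance(reps))   # [0, 0, 2] -> sources 0 and 1 share a conversation
```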
- a set of substantive-pairs for each pair of sound sources and/or speakers may be compiled by pairing each substantive of the first sound source/speaker with each substantive of the second sound source/speaker.
- a set of probabilities may be determined for each substantive-pair by looking up each pair in a dictionary. After that, a conversation probability that the first sound source/speaker and the second sound source/speaker are in the same conversation may be determined from the set of probabilities. The conversation probabilities may be compared with each other and the sound sources/speakers may be clustered based on this.
- the dictionary of substantive-pairs may be determined from a large number of conversation transcriptions, for example in a big data approach. From the conversation transcriptions, substantives may be extracted and for each pair of substantives a probability that they occur in the same conversation may be determined. From these pairs, the dictionary of substantive-pairs and associated probabilities may be compiled.
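A small sketch of this dictionary-based approach is shown below, using a tiny stand-in corpus of transcriptions. The corpus, the averaging of pair probabilities and the function names are illustrative assumptions.

```python
from itertools import combinations, product
from collections import Counter

# Tiny stand-in corpus; the disclosure envisages a large number of transcriptions.
transcriptions = [
    ["restaurant", "menu", "waiter", "wine"],
    ["restaurant", "menu", "bill"],
    ["project", "deadline", "meeting"],
    ["meeting", "project", "budget"],
]

# Count, for every unordered noun pair, how often it occurs in the same transcription.
pair_counts = Counter()
for nouns in transcriptions:
    for a, b in combinations(sorted(set(nouns)), 2):
        pair_counts[(a, b)] += 1
n_conversations = len(transcriptions)
pair_probability = {pair: c / n_conversations for pair, c in pair_counts.items()}

def conversation_probability(nouns_a, nouns_b):
    """Average same-conversation probability over all noun pairs of two speakers."""
    probs = [pair_probability.get(tuple(sorted((a, b))), 0.0)
             for a, b in product(nouns_a, nouns_b)]
    return sum(probs) / len(probs) if probs else 0.0

print(conversation_probability(["restaurant", "wine"], ["menu", "bill"]))   # relatively high
print(conversation_probability(["restaurant", "wine"], ["deadline"]))       # zero
```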
- conversations are identified based on question-response patterns, semantic relations of content between sound sources, etc.
- a conversation may be a data structure having references to the directional sound signals and/or their semantic representations and optionally to the user voice signal and its semantic representation.
- a conversation also may have a semantic representation of its own, which is determined from the semantic representations of its directional sound signals and optionally the user voice signal.
- a conversation may be associated with a direction and/or position.
- the method comprises: processing the sound signal, such that directional sound signals associated with one of the conversations are amplified.
- the directional sound signals associated with the selected conversation may be amplified more strongly than other sound signals from other conversations.
- the conversation to be amplified may be selected automatically, for example as the conversation being associated with the user voice signal, and/or by the user, for example with a mobile device that is in communication with the hearing device.
- directionality of the hearing device may be steered and/or controlled, such that all sound sources belonging to the same conversation are within the focus of directivity and/or sound sources which do not belong to the same conversation are suppressed.
- Semantic content may be used to decide whether multiple speech sources belong to the same conversation or not.
- a conversation topology may be created. It may be decided in which conversation the user of the hearing device participates, and the hearing aid may be controlled such that sound sources belonging to this conversation are amplified, while other sound sources are suppressed.
- a current sound situation and/or environment, in which the user is situated, is determined from the semantic representations and/or the semantic content of the conversation to optimize the hearing aid processing.
- a program of the hearing device for processing a specific sound situation may be selected and/or started, when the specific sound situation is identified based on the semantic representations. For example, when one of the speakers around the user says "this is a really nice restaurant", then a restaurant sound situation may be identified and/or a program for a restaurant sound situation may be started in the hearing aid.
- the method comprises: outputting the processed sound signal by the hearing device.
- the hearing device may comprise a loudspeaker, which may output the processed sound signal into the ear of the user. It also may be possible that the hearing device comprises a cochlea implant as outputting device.
- an environment of the user is divided into directional sectors and each directional sound signal is associated with one of the directional sectors.
- the environment may be divided into quadrants around the user and/or in equally angled sectors around the user.
- Each directional sound signal may be extracted from the sound signal by amplifying sound with a direction from the corresponding directional sector.
- a beam former with a direction and/or opening angle as the sector may be used for generating the corresponding directional sound signal.
- the conversation topology may be analyzed sector based, i.e. the conversations may be analyzed per sector.
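A minimal sketch of assigning a direction of arrival to one of the equally angled sectors is shown below; the number of sectors and the azimuth convention are illustrative assumptions.

```python
def sector_index(azimuth_deg, n_sectors=4):
    """Map a direction of arrival to one of n equally angled sectors around the
    user; sector 0 is centred at the front (0 degrees)."""
    sector_width = 360.0 / n_sectors
    return int(((azimuth_deg + sector_width / 2.0) % 360.0) // sector_width)

print(sector_index(10.0))     # 0: front sector
print(sector_index(100.0))    # 1: sector to one side
print(sector_index(-170.0))   # 2: sector behind the user
```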
- sound sources are identified in the sound signal. This may be performed with time shifts between sound data from different microphones. A position of each sound source may be determined by analysis of spatial cues, directional classification and/or other signal processing techniques to enhance spatial cues in diffuse environments, such as onset or coherence driven analysis or direct-to-reverberation ratio analysis. Each directional sound signal may be associated with one of the sound sources. Each directional sound signal may be extracted from the sound signal by amplifying sound with a direction towards the corresponding sound source. Again, this may be performed with a beam former with a direction towards the sound source and/or an opening angle solely covering the sound source.
- the conversation topology may be analyzed source based, i.e. the conversations may be analyzed per source. Each sound source may be detected, located and analyzed individually.
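As an illustration of locating a sound source from time shifts between microphones, the following sketch estimates a direction of arrival from the cross-correlation of two microphone signals; the microphone distance, sampling rate and far-field approximation are illustrative assumptions.

```python
import numpy as np

def estimate_direction(sig_left, sig_right, fs=16000, mic_distance=0.16, c=343.0):
    """Estimate the azimuth of a sound source from the time shift between two
    microphone signals; positive angles point towards the right microphone."""
    corr = np.correlate(sig_left, sig_right, mode="full")
    lag = np.argmax(corr) - (len(sig_right) - 1)   # positive if the right microphone hears it first
    tdoa = lag / fs
    # Far-field approximation: tdoa = mic_distance * sin(angle) / c
    sin_angle = np.clip(tdoa * c / mic_distance, -1.0, 1.0)
    return np.degrees(np.arcsin(sin_angle))

# Example: a noise burst reaching the right microphone 3 samples earlier.
rng = np.random.default_rng(0)
burst = rng.standard_normal(400)
right = np.concatenate([burst, np.zeros(3)])
left = np.concatenate([np.zeros(3), burst])
print(round(estimate_direction(left, right), 1))   # about 23.7 degrees, i.e. towards the right
```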
- speakers are identified in the sound signal.
- a speaker may be identified with characteristics of his/her voice. Such characteristics may be extracted from the sound signal.
- a number and/or positions of speakers may be determined based on speaker analysis, such as fundamental frequency and harmonic structure analysis, male/female speech detection, distribution of spatial cues taking into account head movements, etc.
- Each directional sound signal may be associated with one of the speakers. Each directional sound signal is extracted from the sound signal by amplifying sound with characteristics of the speaker.
- a speaker may be treated as a sound source and/or a beam former with a direction towards the speaker and/or an opening angle solely covering the speaker may generate the directional sound signal.
- the conversation topology may be analyzed speaker based, i.e. the conversations may be analyzed per speaker. Each speaker may be detected, located and analyzed individually.
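One of the speaker cues mentioned above, the fundamental frequency, can for example be estimated by autocorrelation; the following sketch shows this for a synthetic voiced frame. Frame length, search range and function name are illustrative assumptions.

```python
import numpy as np

def estimate_f0(frame, fs=16000, f_min=75.0, f_max=400.0):
    """Rough fundamental-frequency estimate of a speech frame via
    autocorrelation; usable e.g. as one cue for male/female classification."""
    frame = frame - frame.mean()
    corr = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    lag_min = int(fs / f_max)                  # shortest period considered
    lag_max = int(fs / f_min)                  # longest period considered
    best_lag = lag_min + np.argmax(corr[lag_min:lag_max])
    return fs / best_lag

# Example: a synthetic "voiced" frame with a 120 Hz fundamental plus harmonics.
fs = 16000
t = np.arange(0, 0.04, 1 / fs)
frame = sum(np.sin(2 * np.pi * 120 * k * t) / k for k in (1, 2, 3))
print(round(estimate_f0(frame, fs)))   # approximately 120, a typical male voice range
```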
- the sound signal is processed by adjusting at least one of a direction and a width of a beam former of the hearing device.
- for example, a beam former may be used that is directed towards the conversation.
- a user voice signal is extracted from the sound signal and a word sequence is determined from the user voice signal.
- the user voice signal is determined in another way.
- the user voice signal may be determined via bone conduction sensors, ear canal sensors and/or additional acoustic microphones.
- the method comprises: automatically selecting a conversation to be amplified by selecting a conversation, which has the highest concordance with the semantic representation of the user voice signal. For example, there may be only one conversation, which is associated with the user. In this case, this conversation may be selected.
- Every conversation has been identified by clustering semantic representations.
- a conversation may have the highest concordance with a specific semantic representation, when the semantic representations, from which the conversation has been determined, have the lowest distances from the specific semantic representation in the space where the clustering has been performed.
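A minimal sketch of such a concordance-based selection is shown below: the conversation whose member representations have the smallest average distance to the user's own semantic representation is chosen. The data structures and example values are illustrative assumptions.

```python
import numpy as np

def select_conversation(user_representation, conversations):
    """Pick the conversation whose member representations are, on average,
    closest to the user's own semantic representation."""
    def mean_distance(members):
        return np.mean([np.linalg.norm(user_representation - m) for m in members])
    return min(conversations, key=lambda name: mean_distance(conversations[name]))

user = np.array([0.7, 0.2, 0.1])
conversations = {
    "conversation_A": [np.array([0.8, 0.1, 0.1]), np.array([0.6, 0.3, 0.1])],
    "conversation_B": [np.array([0.1, 0.1, 0.8])],
}
print(select_conversation(user, conversations))   # -> "conversation_A"
```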
- the method comprises: storing semantic representations of a user of the hearing aid over time, and automatically selecting a conversation to be amplified by selecting a conversation, which has the highest concordance with the stored semantic representations.
- the hearing system may learn preferences and/or interests of the user from semantic content detection for optimizing the selection of the conversation.
- the method comprises: presenting a visualization of the conversations to the user.
- the conversation topology may be visualized and/or displayed on a mobile device of the user, such as a smartphone.
- the sectors, sound sources and/or speakers may be shown by icons on a map of the environment of the user.
- the conversations may be visualized in the map.
- the user may then select one conversation or sound source. This selection of the user for a conversation to be amplified is then received in the hearing system.
- the visualization of the conversation topology may also allow the user to assign names to the detected speakers (e.g. spouse, grandchild). Additionally, an information flow between speakers may be visualized in the conversation topology map, such as active, inactive, muted speakers, etc.
- a virtual reality and/or augmented reality may be presented to the user with glasses and/or lenses with displays.
- a conversation selection then may be done with voice and/or gesture control and/or gaze analysis.
- the method comprises: detecting head movements of a user of the hearing device. This may be performed with a sensor of the hearing device, such as an acceleration sensor and/or magnetic field sensor, i.e. a compass. Then, directions and/or positions associated with the sectors, sound sources, speakers and/or conversations may be updated based on the head movements.
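The following sketch illustrates how stored source directions could be shifted by a detected head rotation; the sign convention (azimuths counter-clockwise, positive yaw change for a turn to the left) is an illustrative assumption.

```python
def update_directions(source_azimuths_deg, head_yaw_change_deg):
    """Shift all stored source directions when the user turns the head.
    Azimuths are counter-clockwise in degrees; a positive yaw change means the
    user turned to the left, so sources appear further to the right."""
    return [((az - head_yaw_change_deg + 180.0) % 360.0) - 180.0
            for az in source_azimuths_deg]

# A speaker at 30 degrees to the right (-30) appears at 75 degrees to the
# right (-75) after the user turns 45 degrees to the left.
print(update_directions([-30.0, 90.0], head_yaw_change_deg=45.0))   # [-75.0, 45.0]
```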
- the method comprises: receiving a remote sound signal from an additional microphone in the environment of the user of the hearing device.
- a microphone may be communicatively connected with the hearing device and/or the evaluation system, for example with a mobile device of the user.
- at least the directional sound signals may be determined from the sound signal of the microphone of the hearing device and the remote sound signal.
- the steps of the method associated with automatic speech recognition and/or natural language understanding are performed by the hearing device. However, it also may be possible that these steps are at least partially performed by an external evaluation system, for example by a mobile device of the user.
- the data necessary for the steps performed in the evaluation system may be sent to the evaluation system and the results of the processing of the evaluation system may be sent back to the hearing device.
- This may be performed via a wireless connection, such as BluetoothTM or WiFi.
- the evaluation system may be a server, which is connected via the Internet with the hearing device and/or the mobile device.
- the analysis may be performed in the cloud based on raw microphone signals, i.e. the sound signal of the microphone(s) of the hearing device and optionally the remote sound signal from an additional, external microphone. It also is possible that the analysis may be performed in the cloud based on pre-processed sound signals.
- At least one of the sound signal, the user voice signal, the directional sound signals, the word sequences and the semantic representations are sent to the evaluation system and/or at least one of the user voice signal, the directional sound signals, the word sequences, the semantic representations and the conversations are determined by the evaluation system.
- the hearing device is a hearing aid, wherein the sound signal is processed for compensating a hearing loss of the user.
- the sound signal, which has been processed for amplifying the selected conversation may be additionally processed for compensating a hearing loss of the user.
- the computer program may be executed in a processor of the hearing device, which hearing device, for example, may be carried by the person behind the ear.
- the computer-readable medium may be a memory of this hearing device.
- the computer program also may be executed by processors of the hearing device and/or the evaluation system.
- the computer-readable medium may be a memory of the hearing device and/or the evaluation system.
- a computer-readable medium may be a floppy disk, a hard disk, a USB (Universal Serial Bus) storage device, a RAM (Random Access Memory), a ROM (Read Only Memory), an EPROM (Erasable Programmable Read Only Memory) or a FLASH memory.
- a computer-readable medium may also be a data communication network, e.g. the Internet, which allows downloading a program code.
- the computer-readable medium may be a non-transitory or transitory medium.
- a further aspect of the present disclosure relates to a hearing system for directionally amplifying a sound signal of a hearing device, the hearing system comprising the hearing device.
- the hearing system may be adapted for performing the method as described in the above and in the following.
- the hearing system may comprise an evaluation system, which performs at least some of the steps of the method.
- the evaluation system may comprise a mobile device carried by the user and/or a server connected via Internet to the hearing device and/or the mobile device.
- the hearing device may send the sound signal to the mobile device, which may perform automatic speech recognition and/or natural language understanding.
- the mobile device also may send the sound signal to the server, which may perform automatic speech recognition and/or natural language understanding.
- the hearing system may be adapted for detecting, which speakers participate in the same conversation and/or which ones do not, and based on this may optimize the hearing performance of the user of the hearing device.
- the hearing system may analyze the conversation topology and the semantic content of the conversation(s). It may analyze the semantic content of the voice of the user. It may cluster and/or compare semantic content and may detect in which conversation the user is participating. In the end, the hearing system may control and/or steer a directivity of the hearing device, such that speakers not participating in the same conversation as the user are suppressed.
- Fig. 1 schematically shows a hearing system 10, which comprises a hearing device 12, a mobile device 14 and optionally a server 16.
- the hearing device 12 may be a binaural hearing device 12, which has two components 18, one for each ear of a user. Each of the components 18 may be seen as a hearing device of its own.
- Each of the components 18, which may be carried behind the ear or in the ear, may comprise one or more microphones 20 and one or more loudspeakers 22. Furthermore, each or one of the components 18 may comprise a sensor 23, which is adapted for measuring head movements of the user, such as an acceleration sensor.
- the mobile device 14, which may be a smartphone, may be in data communication with the hearing device 12, for example via a wireless communication channel such as BluetoothTM.
- the mobile device 14 may have a display 24, on which a visualization of a conversation topology may be shown (see Fig. 2 ).
- the mobile device 14 may be in data communication with the server 16, which may be a cloud server provided in a cloud computing facility remote from the hearing device 12 and/or the mobile device 14.
- an additional microphone 26, which is situated in the environment around the user of the hearing device 12, may be in data communication with the server 16.
- Communication between the mobile device 14 and/or the additional microphone 26 and the server 16 may be established via Internet, for example via BluetoothTM and/or a mobile phone communication network.
- the mobile device 14 and the server 16 may be seen as an evaluation system 28 that may perform some of the steps of the method as described with respect to Fig. 3 externally to the hearing device 12.
- Fig. 2 shows a diagram that may be displayed by the mobile device 14. It shows the user 30, further persons and/or speakers 32 in the environment of the user 30 and conversations 34 in which these speakers 32 participate. All this information may have been determined by the hearing system 10.
- Fig. 3 shows a modular configuration of the hearing system 10.
- the modules described in the following also may be seen as method steps of a method that may be performed by the hearing system 10. It has to be noted that the modules described below may be implemented in software and/or may be part of the hearing device 12, the mobile device 14 and/or the server 16.
- the microphones 20 of the hearing device produce a sound signal 36, which may comprise sound data of all the microphones 20.
- the sound signal 36 may be seen as a multi-component sound signal.
- a sector beam former module 38 receives the sound signal 36 and extracts sector sound signals 40 from the sound signal 36.
- An environment of the user 30 may be divided into directional sectors 39 (see Fig. 2 ), such as quadrants.
- the sector beam former module 38 may generate a sector sound signal 40.
- the sector beam former module 38 may comprise a beam former for each sector 39, whose direction and angular width are adjusted to the respective sector 39.
- Each sector sound signal 40 may be extracted from the sound signal 36 by amplifying sound with a direction from the corresponding directional sector 39, for example with a beam former.
- It has to be noted that a plurality of sector sound signals 40 may be generated, but that only one of the signals 40 is shown in Fig. 3. Also, for many of the signals and/or data mentioned in the following, which are associated with speakers 32, sound sources, conversations 34, etc., only one line may be shown in Fig. 3.
- the sector sound signals 40 may be received in a speaker detection module 42, which identifies speakers 32 and/or sound sources in the respective sector 39. For example, each speaker may be detected as a separate sound source. In general, a number of sound sources and positions/directions of these sound sources may be determined by the speaker detection module 42.
- the directions and/or positions of the sound sources may be input into a speaker beam former module 44.
- the speaker beam former module 44 may comprise a plurality of beam formers, each of which may be adjusted to a direction and/or position of one of the sound sources. Each beam former of the speaker beam former module 44 then extracts a speaker voice signal 46 from the sound signal 36 by amplifying sound with a direction from the corresponding sound source 32.
- the module 44 receives a remote sound signal from the additional microphone 26.
- the speaker voice signals 46 then may be determined from the sound signal 36 of the one or more microphones 20 of the hearing device 12 and the remote sound signal.
- the speaker voice signals 46 and optionally the information from the speaker detection module 42 may be received by a speaker identification module 48, which extracts speaker characteristics 50 from the respective speaker sound signal 46.
- the speaker identification module 48 may identify a speaker 32 with the aid of speaker characteristics stored in the database 49. Also, the characteristics 50 extracted by the module 48 may be enriched with characteristics stored in the database 49 associated with an identified speaker 32.
- speakers 32 are identified in the sound signal 36 directly with the speaker identification module 48 and that the speaker voice signals 46 are extracted from the sound signal 36 by amplifying sound with characteristics 50 of the speaker 32. Also, these characteristics 50 may be identified by the speaker identification module 48 as described above.
- the sector sound signals 40 and the speaker voice signals 46 may be seen as directional sound signals, which are then further processed by automatic speech recognition and automatic natural language understanding.
- the sector sound signals 40 and/or the speaker voice signals 46 are received by an automatic speech recognition module 52. As indicated in Fig. 3 , either the sector sound signals 40 or the speaker voice signals 46 may be input into the automatic speech recognition module 52. For example, the user 30 may select one of these options.
- the automatic speech recognition module 52 determines a word sequence 54 for the respective directional sound signal 40, 46, which is then input into a natural language understanding module 56, which determines a semantic representation 58.
- a semantic representation 58 may contain semantic weights for a semantic content of the word sequence 54.
- the semantic representation may be a vector of weights and each weight indicates the probability of a specific conversation subject, such as holidays, work, family, etc.
- Both the automatic speech recognition module 52 and the natural language understanding module 56 may be based on machine learning algorithms, which have to be trained for identifying words in a data stream containing spoken language and/or identifying semantic content in a word sequence.
- the automatic speech recognition module 52 and/or the natural language understanding module 56 may use speaker characteristics 50 of the speaker 32 associated with the speaker voice signal during processing their input.
- a further automatic speech recognition module 52 and a further natural language understanding module 56 process input from a user voice extractor 60.
- the user voice extractor 60 extracts a user voice signal 62 from the sound signal 36.
- the user voice signal 62 is then translated by the further automatic speech recognition module 52 into a word sequence 54.
- the further natural language understanding module 56 determines a semantic representation 58 from this word sequence.
- At this point, semantic representations 58 for the user 30 and for several sectors 39 or speakers 32 have been generated. This information and/or data is then used to identify conversations 34 and to process the sound signal 36 in such a way that the conversation 34, in which the user 30 participates, is amplified.
- the semantic representations 58 are then input into a comparator 64, which generates distance information 66 (which may be a single value) for each pair of semantic representations 58.
- the distance information 66 may be or may comprise a distance in the vector space of weights of the semantic representations 58.
- the clustering module 68 identifies conversations 34, which, for example, are sets of sound sources (such as the user, the speakers, the sectors) that have a low distance according to the distance information 66. It also may be that the clustering module 68 directly clusters semantic representations 58 by their semantic weights into conversations 34. Each conversation 34 may be associated with a sector 39, a sound source and/or a speaker 32 and optionally the user 30. Each conversation also may be associated with a semantic representation 58, which may be an average of the semantic representations 58 that define the cluster on which the conversation 34 is based.
- the identified conversations 34 are input into a map module 74, which generates a conversation topology map 76.
- in the conversation topology map 76, the conversations 34 may be associated with the positions and/or directions of the sectors 39, sound sources and/or speakers 32 and optionally the user 30 with which the respective conversation is associated.
- the map module 74 may update the conversation topology map 76, such as directions and/or positions associated with the conversations 34, based on the head movements.
- the conversation topology map 76 may be updated, when the user is moving and/or turning within his/her environment.
- the conversation topology map 76 is updated over time, for example, when conversation subjects change and/or speakers 32 enter or leave conversations.
- the conversations 34 generated by the clustering module 68 may be identified with already existing conversations 34 in the conversation topology map 76, which are then updated accordingly.
- the conversation topology map 76 may be visualized by a visualization module 78, which may generate an image that is presented on the display 24 of the mobile device 14 of the user 30.
- a visualization may look like the diagram of Fig. 2 .
- It may be that the user 30 can select one of the conversations 34 that are displayed and that this conversation 34 is then amplified by the hearing device 12. It also may be that the map module 74 automatically selects a conversation 34 to be amplified by selecting a conversation 34, which has the highest concordance with the semantic representation 58 of the user voice signal 62 and/or which is associated with the user 30.
- selection information 80 such as a direction, a position, an angle, etc. is input into a control module 82 of the hearing device 12, which controls and/or steers the sound processing of the hearing device 12.
- the control module 82 determines control parameters 84, which are provided to a signal processor 86, which processes the sound signal 36 and generates an output signal 88, which may be output by the loudspeaker 22 of the hearing device 12.
- the signal processor 86 may be adjusted with the control parameters 84, such that the directional sound signals 40, 46 associated with one of the conversations 34 are amplified. For example, this may be performed by adjusting a beam former such that its direction and opening angle (width) are directed towards all sound sources, sectors and/or speakers associated with the conversation 34. Additionally, it may be that the sound signal 36 is processed by the signal processor 86 for compensating a hearing loss of a user 30 of the hearing device 12.
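As an illustration of this last step, the following sketch derives a beam direction and opening angle that cover all sound sources of the selected conversation; the margin value and the assumption that all sources lie within a half-plane are illustrative simplifications.

```python
def beam_for_conversation(source_azimuths_deg, margin_deg=10.0):
    """Return a beam direction and opening angle that cover all sound sources
    belonging to the selected conversation (illustrative sketch; assumes the
    sources lie within a half-plane so a simple min/max is sufficient)."""
    lo, hi = min(source_azimuths_deg), max(source_azimuths_deg)
    direction = (lo + hi) / 2.0
    width = (hi - lo) + 2.0 * margin_deg
    return direction, width

# Conversation with speakers at -20, 10 and 35 degrees relative to the user.
direction, width = beam_for_conversation([-20.0, 10.0, 35.0])
print(direction, width)   # beam at 7.5 degrees with a 75 degree opening angle
```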
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP18193769.9A EP3624465B1 (de) | 2018-09-11 | 2018-09-11 | Hörgerätesteuerung mit semantischem inhalt |
Publications (2)
Publication Number | Publication Date |
---|---|
EP3624465A1 true EP3624465A1 (de) | 2020-03-18 |
EP3624465B1 EP3624465B1 (de) | 2021-03-17 |
Family
ID=63557332
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
EP18193769.9A Active EP3624465B1 (de) | 2018-09-11 | 2018-09-11 | Hörgerätesteuerung mit semantischem inhalt |
Country Status (1)
Country | Link |
---|---|
EP (1) | EP3624465B1 (de) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6157727A (en) | 1997-05-26 | 2000-12-05 | Siemens Audiologische Technik Gmbh | Communication system including a hearing aid and a language translation system |
EP3249944A1 (de) * | 2016-05-27 | 2017-11-29 | EM-Tech Co., Ltd. | Headset-vorrichtung mit aktiver rauschunterdrückung mit hörgerätemerkmalen |
US20180033428A1 (en) * | 2016-07-29 | 2018-02-01 | Qualcomm Incorporated | Far-field audio processing |
US9980042B1 (en) * | 2016-11-18 | 2018-05-22 | Stages Llc | Beamformer direction of arrival and orientation analysis system |
Non-Patent Citations (1)
Title |
---|
MOATTAR M H ET AL: "A review on speaker diarization systems and approaches", SPEECH COMMUNICATION, ELSEVIER SCIENCE PUBLISHERS, AMSTERDAM, NL, vol. 54, no. 10, 29 May 2012 (2012-05-29), pages 1065 - 1103, XP028449260, ISSN: 0167-6393, [retrieved on 20120606], DOI: 10.1016/J.SPECOM.2012.05.002 * |
Also Published As
Publication number | Publication date |
---|---|
EP3624465B1 (de) | 2021-03-17 |
Legal Events

Code | Title | Description
---|---|---
PUAI | Public reference made under article 153(3) EPC to a published international application that has entered the European phase | Original code: 0009012
STAA | Information on the status of an EP patent application or granted EP patent | Status: the application has been published
AK | Designated contracting states | Kind code of ref document: A1; designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR
AX | Request for extension of the European patent | Extension state: BA ME
STAA | Information on the status of an EP patent application or granted EP patent | Status: request for examination was made
GRAP | Despatch of communication of intention to grant a patent | Original code: EPIDOSNIGR1
STAA | Information on the status of an EP patent application or granted EP patent | Status: grant of patent is intended
17P | Request for examination filed | Effective date: 2020-09-09
RBV | Designated contracting states (corrected) | Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR
RIC1 | Information provided on IPC code assigned before grant | IPC: H04R 25/00 (20060101, AFI20200922BHEP); G10L 21/0208 (20130101, ALN20200922BHEP); G10L 21/0316 (20130101, ALN20200922BHEP)
INTG | Intention to grant announced | Effective date: 2020-10-12
GRAS | Grant fee paid | Original code: EPIDOSNIGR3
GRAA | (expected) grant | Original code: 0009210
STAA | Information on the status of an EP patent application or granted EP patent | Status: the patent has been granted
AK | Designated contracting states | Kind code of ref document: B1; designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR
REG | Reference to a national code | GB: FG4D
REG | Reference to a national code | CH: EP
REG | Reference to a national code | DE: R096, ref document number 602018013979
REG | Reference to a national code | IE: FG4D
REG | Reference to a national code | AT: REF, ref document number 1373381, kind code T, effective date 2021-04-15
REG | Reference to a national code | LT: MG9D
PG25 | Lapsed in a contracting state (lapse because of failure to submit a translation of the description or to pay the fee within the prescribed time limit) | NO: 2021-06-17; BG: 2021-06-17; GR: 2021-06-18; FI: 2021-03-17; HR: 2021-03-17
REG | Reference to a national code | AT: MK05, ref document number 1373381, kind code T, effective date 2021-03-17
REG | Reference to a national code | NL: MP, effective date 2021-03-17
PG25 | Lapsed in a contracting state (failure to submit a translation or to pay the fee) | SE, RS, LV: 2021-03-17
PG25 | Lapsed in a contracting state (failure to submit a translation or to pay the fee) | NL: 2021-03-17
PG25 | Lapsed in a contracting state (failure to submit a translation or to pay the fee) | LT, EE, CZ, SM, AT: 2021-03-17
PG25 | Lapsed in a contracting state (failure to submit a translation or to pay the fee) | IS: 2021-07-17; PT: 2021-07-19; PL, RO, SK: 2021-03-17
REG | Reference to a national code | DE: R097, ref document number 602018013979
PLBE | No opposition filed within time limit | Original code: 0009261
STAA | Information on the status of an EP patent application or granted EP patent | Status: no opposition filed within time limit
PG25 | Lapsed in a contracting state (failure to submit a translation or to pay the fee) | DK, AL, ES: 2021-03-17
26N | No opposition filed | Effective date: 2021-12-20
PG25 | Lapsed in a contracting state (failure to submit a translation or to pay the fee) | SI: 2021-03-17
PG25 | Lapsed in a contracting state (failure to submit a translation or to pay the fee) | IT: 2021-03-17
REG | Reference to a national code | CH: PL
REG | Reference to a national code | BE: MM, effective date 2021-09-30
PG25 | Lapsed in a contracting state (failure to submit a translation or to pay the fee) | IS: 2021-07-17; MC: 2021-03-17
PG25 | Lapsed in a contracting state (non-payment of due fees) | LU: 2021-09-11; IE: 2021-09-11; BE: 2021-09-30
PG25 | Lapsed in a contracting state (non-payment of due fees) | LI: 2021-09-30; CH: 2021-09-30
PG25 | Lapsed in a contracting state (failure to submit a translation or to pay the fee) | CY: 2021-03-17
PG25 | Lapsed in a contracting state (failure to submit a translation or to pay the fee; invalid ab initio) | HU: 2018-09-11
PG25 | Lapsed in a contracting state (failure to submit a translation or to pay the fee) | MK: 2021-03-17
PG25 | Lapsed in a contracting state (failure to submit a translation or to pay the fee) | MT: 2021-03-17
PGFP | Annual fee paid to national office | DE: payment date 2024-09-27, year of fee payment 7
PGFP | Annual fee paid to national office | GB: payment date 2024-09-27, year of fee payment 7
PGFP | Annual fee paid to national office | FR: payment date 2024-09-25, year of fee payment 7